Robust Header Compression                                     R. Finking
Internet-Draft                                        Siemens/Roke Manor
Expires: August 25, 2005                                      C. Bormann
                                                 Universitaet Bremen TZI
                                                            G. Pelletier
                                                             Ericsson AB
                                                       February 21, 2005


        Formal Notation for Robust Header Compression (ROHC-FN)
                 draft-ietf-rohc-formal-notation-05.txt

Status of this Memo

   This document is an Internet-Draft and is subject to all provisions
   of Section 3 of RFC 3667. By submitting this Internet-Draft, each
   author represents that any applicable patent or other IPR claims of
   which he or she is aware have been or will be disclosed, and any of
   which he or she become aware will be disclosed, in accordance with
   RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 25, 2005.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   This document defines ROHC-FN: a formal notation to unambiguously
   specify header compression field encodings, when defining new
   profiles within the ROHC (RFC3095 [4]) framework. ROHC-FN offers a
   library of encoding methods that are often used in ROHC profiles,
   and can thereby help simplify future profile development work.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Overview of ROHC-FN
     3.1   Scope of ROHC-FN
     3.2   Fundamentals of ROHC-FN
       3.2.1   Fields and Encodings
       3.2.2   Structures
     3.3   Example using IPv4
   4.  Normative Definition of ROHC-FN
     4.1   Overall Structure of a Specification
     4.2   Constant Definitions
     4.3   Attributes
       4.3.1   Attribute References
     4.4   Expressions
       4.4.1   Integer Literals
       4.4.2   Boolean Literals
       4.4.3   Boolean Operators
       4.4.4   Integer Operators
       4.4.5   Comparison Operators
     4.5   let Statements
     4.6   Comments
       4.6.1   End of line comments
       4.6.2   Block comments
     4.7   Library Encoding Methods
       4.7.1   uncompressed_value
       4.7.2   compressed_value
       4.7.3   irregular
       4.7.4   static
       4.7.5   lsb
       4.7.6   crc
     4.8   Profile-specific Encoding Methods
     4.9   Structures
       4.9.1   "this"
       4.9.2   Simple Structures
       4.9.3   Arguments and Structures
       4.9.4   Multiple Formats
       4.9.5   Control Fields
   5.  Security considerations
   6.  Contributors
   7.  Acknowledgements
   8.  References
       Authors' Addresses
   A.  Syntax
     A.1   Reserved Keywords
     A.2   Characters
     A.3   Literals
     A.4   Identifiers
     A.5   Operators
     A.6   Expressions
     A.7   Constants
     A.8   Field Names
     A.9   Attributes
     A.10  Encoding Methods
     A.11  Structures
   B.  Bit-level Worked Example
     B.1   Example Packet Format
     B.2   Initial Encoding
     B.3   Basic Compression
     B.4   Inter-packet compression
     B.5   Variable Length Discriminators
     B.6   Default encoding
     B.7   Control Fields
       Intellectual Property and Copyright Statements

1.  Introduction

   ROHC-FN is a formal notation designed to help with the definition of
   ROHC (RFC3095 [4]) header compression profiles. ROHC-FN offers a
   library of encoding methods that are often used in ROHC profiles, so
   new profiles can be specified without the need to redefine this
   library from scratch.

   Informally, an encoding method is a function that maps between
   uncompressed data and compressed data. The simplest encoding methods
   only have one input and one output: the input is an uncompressed
   field and the output is the compressed version of the field. More
   complex encoding methods can compress multiple fields at the same
   time, e.g. "list" encoding from RFC3095 [4], which is designed to
   compress an ordered list of fields.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC2119 [2].

   o  Profile

      A ROHC (RFC 3095 [4]) profile is a description of how to compress
      a certain protocol stack over a certain type of link. Each
      profile includes packet formats which describe how to compress
      the headers and a state machine to control the actions of each
      endpoint.

   o  Field

      ROHC-FN divides the protocol header to be compressed into a set
      of contiguous bit patterns known as fields.

   o  Control field

      Control fields are transmitted from a ROHC compressor to a ROHC
      decompressor, but are not part of the uncompressed header itself.

   o  Encoding method

      Encoding methods are functions that can be applied to compress
      fields in a protocol header.

   o  Library of encoding methods

      The library of encoding methods contains a number of commonly
      used encoding methods for compressing header fields.

3.  Overview of ROHC-FN

   This section gives an overview of ROHC-FN. It also explains how
   ROHC-FN can be used to specify the compression of header fields as
   part of a ROHC profile.

3.1  Scope of ROHC-FN

   This section describes the scope of the ROHC-FN. It explains how the
   formal notation relates to the ROHC framework and to specific ROHC
   profiles.

   The ROHC framework provides the general principles for performing
   ROHC compression. It defines the concept of a profile, which makes
   ROHC a general platform for different compression schemes. It sets
   link layer requirements, and in particular negotiation requirements
   for all ROHC profiles. It defines a set of common functions such as
   Context Identifiers (CIDs), padding and segmentation. It also
   defines common packet formats (IR, IR-DYN, Feedback, Short-CID
   expander, etc.), and finally it defines a generic, profile
   independent, feedback mechanism.

   A ROHC profile is a description of how to compress a certain
   protocol stack over a certain type of link. For example, ROHC
   profiles are available for RTP/UDP/IP and many other protocol
   stacks.

   Each ROHC profile contains the following two components:

   1.  Packet formats, for compressing and decompressing headers; and

   2.  State machine, for maintaining synchronisation of the context.

   The purpose of the packet formats is to define how to compress and
   decompress headers. The packet formats define one or more compressed
   versions of each uncompressed header; inversely, the packet formats
   define how to relate a compressed header back to the original
   uncompressed header.

   The packet formats will typically compress headers relative to a
   context of field values from previous headers in a flow. This
   improves the overall compression ratio, because it takes into
   account redundancies between headers of successive packets.

   The purpose of the state machine is to ensure that the profile is
   robust against bit errors and dropped packets.
   The state machine manages the context, providing feedback and other
   mechanisms to ensure that the compressor and decompressor contexts
   are kept synchronised.

   The ROHC-FN is designed to help in the specification of the packet
   formats used in ROHC profiles. It offers a library of encoding
   methods for compressing fields, and a mechanism for combining these
   encoding methods to create packet formats tailored to a specific
   protocol stack. The state machine for the profiles is beyond the
   scope of ROHC-FN, and it must be provided separately as part of a
   complete profile specification.

3.2  Fundamentals of ROHC-FN

   There are two fundamental elements to the formal notation:

   1.  Fields and their encodings, which define the mapping between a
       field's uncompressed and compressed values.

   2.  Structures, which define lists of uncompressed fields and the
       lists of compressed fields they map onto.

   These two fundamental elements are at the core of the notation and
   are outlined below.

3.2.1  Fields and Encodings

   The creation of bindings between fields and encoding methods is
   indicated as follows:

      field  ::=  encoding_method

   In the above statement, the symbol "::=" means "is encoded as". The
   statement does not represent an assignment from the right hand side
   to the left hand side. Instead, it is a two-way mapping: a single
   statement represents both the compression and the decompression
   operation, and variables take on values through a process of two-way
   matching.

   Two-way matching is a binary operation that attempts to make the
   operands the same (similar to the unification process in logic). The
   operands represent one unspecified data object, and values can be
   matched from either operand.

   Fields have attributes. Attributes describe various things about the
   field, including the length of the field and whereabouts the field
   appears in the header.
   For example:

      field:has_context

   indicates whether or not a context entry exists for this field.

   See Section 4.3 for more details on field attributes.

   An encoding method (including the parameters specified with the
   method) creates a reversible binding between the attributes of a
   field. At the compressor, a packet format can be used if a set of
   bindings that is successful for all fields can be found. At the
   decompressor, the operation is reversed using the same bindings and
   the fields are filled according to the specified bindings.

   For example, the 'static' encoding method creates a binding between
   the attribute corresponding to the uncompressed value of the field
   and the attribute corresponding to the value of the field in the
   context.

   o  For the compressor, the 'static' binding is successful when both
      the context value and the uncompressed value are the same. If the
      two values differ then the binding fails.

   o  For the decompressor, the 'static' binding succeeds for a packet
      type only if a valid context entry containing the value of the
      uncompressed field exists. Otherwise, the binding will fail and
      an alternative encoding method must be used.

3.2.2  Structures

   Structures provide a mechanism for combining fields and their
   encoding methods into larger units. Structures are defined using the
   "===" symbol. These can then be used as encoding methods in other
   structures:

      structure  ===  {
        uc_format       =  field_1,
                           field_2,
                           :     :
                           field_n;

        control_fields  =  ctrl_field_1,
                           ctrl_field_2,
                           :     :
                           ctrl_field_n;

        default_methods =  {
                             field_a       ::=  encoding_method_9;
                             field_e       ::=  encoding_method_8;
                             :     :
                             :     :
                             ctrl_field_3  ::=  encoding_method_2;
                           };

        co_format_0     =  field_a,
                           :     :
                           field_b
                           {
                             field_a       ::=  encoding_method_1;
                             :     :
                             :     :
                             field_b       ::=  encoding_method_2;
                             ctrl_field_1  ::=  encoding_method_3;
                           };

        co_format_1     =  field_c,
                           :     :
                           field_d
                           {
                             field_c       ::=  encoding_method_4;
                             :     :
                             :     :
                             field_d       ::=  encoding_method_5;
                           };
        :     :
        co_format_n     =  field_y,
                           :     :
                           field_z
                           {
                             field_y       ::=  encoding_method_foo;
                             :     :
                             :     :
                             field_z       ::=  encoding_method_bar;
                           };
      };

   In the example above, the comma separated list "uc_format" indicates
   the order of fields in the uncompressed header. After this is
   another comma separated list, "control_fields", which defines one or
   more control fields. Finally, a number of packet formats for the
   compressed data follow, each beginning with the reserved prefix
   "co_format". These also have a field order list, which consists of:

   o  fields that occur in the uncompressed header; or

   o  "control fields", that are additional information added to the
      compressed packet during compression.

   In the example, the packet formats defined by "co_format" also
   include a list of field encodings, which is typical usage. A
   "uc_format" may also include a field encodings list, though the one
   in the example doesn't.

   The field encodings list contains the encoding methods for each
   field. These are defined inside braces for the fields in the
   preceding field order list. Fields that have no encoding method
   defined in this field encodings list are encoded using the default
   encodings specified in "default_methods" (see Section 4.9.4.3).

   Fields from the uncompressed header have the same name as they do in
   the compressed header. If there are any fields which are present
   exclusively in the compressed header but which do have an
   uncompressed value, they must be declared in the "control_fields"
   section of the structure (see Section 4.9.5 for more details on
   defining control fields).
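   As a non-normative sketch of the fallback rule described above, the
   following Python fragment resolves the encoding method for each
   field of one compressed format: an entry in the format's own field
   encodings list takes precedence, and "default_methods" is consulted
   otherwise. The helper name resolve_encodings, the dictionary
   representation and the shortened field order list are assumptions
   made purely for illustration; they are not part of ROHC-FN.

```python
def resolve_encodings(field_order, format_encodings, default_methods):
    """Return {field name: encoding method name} for one compressed format.

    Hypothetical helper: an encoding given in the format's own field
    encodings list overrides the structure-wide default. A field with
    neither is left as None (its encoding is not known here).
    """
    resolved = {}
    for field in field_order:
        if field in format_encodings:
            resolved[field] = format_encodings[field]
        else:
            resolved[field] = default_methods.get(field)
    return resolved


# Encoding names reuse those of the structure example; the field order
# list is shortened to two fields for illustration.
default_methods = {"field_a": "encoding_method_9",
                   "field_e": "encoding_method_8"}
co_format_0 = resolve_encodings(
    ["field_a", "field_e"],
    {"field_a": "encoding_method_1"},
    default_methods)
```

   In this sketch "field_a" uses the encoding given by the format
   itself, while "field_e" falls back to its default.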
   In the example above, all fields appearing in the compressed header
   are also found in the uncompressed field order list, or the control
   field list. However it is possible to have fields which appear in
   neither an uncompressed field order list nor the control field list.
   Fields which have no "uncompressed" value, such as a checksum on the
   compressed header, fall into this category.

   Following the compressed field order list,

3.3  Example using IPv4

   This section gives an overview of how the notation is used by means
   of an example. The example will develop the formal notation for an
   encoding method capable of compressing a single, well-known header:
   the IPv4 header.

   The first step is to specify the overall structure of the IPv4
   header. To do this, we use a structure which we will call
   "ipv4_header". Structures are defined in Section 4.9. This is
   notated as follows:

      ipv4_header  ===  {

   The statement above defines the encoding method "ipv4_header" as a
   structure, the definition of which follows the opening brace.
   Definitions within the pair of braces are local to "ipv4_header".
   This scoping mechanism helps to clarify which fields belong to which
   headers: it is also useful when compressing complex protocol stacks
   with several headers and fields, often sharing the same names.

   The next step is to specify the fields contained in the uncompressed
   IPv4 header. This is accomplished using ROHC-FN as follows:

        uc_format  =  version,           %  [  4 ]
                      header_length,     %  [  4 ]
                      tos,               %  [  6 ]
                      ecn,               %  [  2 ]
                      length,            %  [ 16 ]
                      id,                %  [ 16 ]
                      reserved,          %  [  1 ]
                      dont_frag,         %  [  1 ]
                      more_fragments,    %  [  1 ]
                      offset,            %  [ 13 ]
                      ttl,               %  [  8 ]
                      protocol,          %  [  8 ]
                      checksum,          %  [ 16 ]
                      src_addr,          %  [ 32 ]
                      dest_addr;         %  [ 32 ]

   The numbers in square brackets give the field width in bits. Note
   that these are mere comments that do not have any formal meaning.

   The fields contained in the compressed header can then be specified.
   Exactly what appears in this list of fields depends on the encoding
   methods used to encode the uncompressed fields; it may be possible
   to compress certain fields down to 0 bits, in which case they do not
   need to be sent in the compressed header at all.

        co_format  =  src_addr,          %  [ 32 ]
                      dest_addr,         %  [ 32 ]
                      length,            %  [ 16 ]
                      id,                %  [ 16 ]
                      ttl,               %  [  8 ]
                      protocol,          %  [  8 ]
                      tos,               %  [  6 ]
                      ecn,               %  [  2 ]
                      dont_frag          %  [  1 ]
        {

   Note that the order of the fields in the compressed header is
   independent of the order of the fields in the uncompressed header.

   The next step is to specify the encoding methods for each field in
   the IPv4 header. These are taken from encoding methods in the
   ROHC-FN library, as well as from additional encoding methods defined
   in the profile specification itself. Since the intention here is to
   illustrate the use of the notation, rather than to describe the
   optimum method of compressing IPv4 headers, this example uses only
   three predefined encoding methods.

   The "uncompressed_value" encoding method (defined in Section 4.7.1)
   can compress any field whose uncompressed length and value are
   fixed. No compressed bits need to be sent because the uncompressed
   field can be reconstructed using its known size and value. The
   "uncompressed_value" encoding method is used to compress five fields
   in the IPv4 header, as described below:

          version         ::=  uncompressed_value (4, 4);
          header_length   ::=  uncompressed_value (4, 5);
          reserved        ::=  uncompressed_value (1, 0);
          more_fragments  ::=  uncompressed_value (1, 0);
          offset          ::=  uncompressed_value (13, 0);

   The first parameter indicates the length of the uncompressed field
   in bits, and the second parameter gives its integer value.

   The "irregular" encoding method (defined in Section 4.7.3) can be
   used to encode any field whose length is fixed, or can be calculated
   using an expression.
   It is a general encoding method that can be used for fields to which
   no other encoding method applies. All of the bits in the
   uncompressed field are present in the compressed format as well;
   hence this encoding does not achieve any compression.

          tos             ::=  irregular (6);
          ecn             ::=  irregular (2);
          length          ::=  irregular (16);
          id              ::=  irregular (16);
          dont_frag       ::=  irregular (1);
          ttl             ::=  irregular (8);
          protocol        ::=  irregular (8);
          src_addr        ::=  irregular (32);
          dest_addr       ::=  irregular (32);

   Finally, the third encoding method is specific only to IPv4 headers,
   "inferred_ip_v4_header_checksum":

          checksum        ::=  inferred_ip_v4_header_checksum;
        };

   This is a specific encoding method for calculating the IP checksum
   from the rest of the header values. Like the "uncompressed_value"
   encoding method, no compressed bits need to be sent, since the field
   value can be reconstructed at the decompressor.

   However, unlike "uncompressed_value", the meaning of
   "inferred_ip_v4_header_checksum" is not defined in the ROHC-FN
   library of encoding methods, nor is it defined by another structure
   elsewhere in the formal notation given in the example above. Its
   definition can be given either using plain English text or using the
   formal notation as part of the profile definition itself.

   Finally the definition of the structure is closed with a closing
   brace.

   At this point, the above example has defined the format of the
   compressed IPv4 header, and provided enough information to allow an
   implementation to construct the compressed header from an
   uncompressed header and vice versa.

4.  Normative Definition of ROHC-FN

   This section gives the normative definition of ROHC-FN. ROHC-FN is a
   referentially transparent, declarative language with no side
   effects.
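   Because the notation is declarative, a single description serves
   both compression and decompression. As a non-normative illustration,
   the following Python sketch applies the IPv4 example of Section 3.3
   in both directions. The function names, the use of bit strings, and
   the reading of "inferred_ip_v4_header_checksum" as the standard IPv4
   header checksum (with compression failing when the received checksum
   does not match) are assumptions made for this sketch only.

```python
# Field widths in bits, in uncompressed header order (Section 3.3).
UC_FORMAT = [("version", 4), ("header_length", 4), ("tos", 6), ("ecn", 2),
             ("length", 16), ("id", 16), ("reserved", 1), ("dont_frag", 1),
             ("more_fragments", 1), ("offset", 13), ("ttl", 8),
             ("protocol", 8), ("checksum", 16), ("src_addr", 32),
             ("dest_addr", 32)]
# Order of the fields in the compressed header.
CO_FORMAT = ["src_addr", "dest_addr", "length", "id", "ttl", "protocol",
             "tos", "ecn", "dont_frag"]
# Fields encoded with uncompressed_value: fixed value, no compressed bits.
FIXED = {"version": 4, "header_length": 5, "reserved": 0,
         "more_fragments": 0, "offset": 0}
WIDTH = dict(UC_FORMAT)


def ip_checksum(fields):
    """One's complement checksum over the header with checksum taken as 0."""
    bits = "".join(format(0 if n == "checksum" else fields[n], "0%db" % w)
                   for n, w in UC_FORMAT)
    total = sum(int(bits[i:i + 16], 2) for i in range(0, len(bits), 16))
    while total >> 16:                   # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def compress(fields):
    """Return the compressed header as a bit string, or None on failure."""
    if any(fields[n] != v for n, v in FIXED.items()):
        return None                      # an uncompressed_value binding fails
    if fields["checksum"] != ip_checksum(fields):
        return None                      # assumed: checksum must be inferable
    return "".join(format(fields[n], "0%db" % WIDTH[n]) for n in CO_FORMAT)


def decompress(bits):
    """Rebuild all uncompressed field values from the compressed bits."""
    fields, pos = dict(FIXED), 0
    for n in CO_FORMAT:                  # irregular: bits are sent as they are
        fields[n] = int(bits[pos:pos + WIDTH[n]], 2)
        pos += WIDTH[n]
    fields["checksum"] = ip_checksum(fields)
    return fields
```

   In this sketch the compressed header carries only the nine
   "irregular" fields; the five fixed fields and the checksum are
   rebuilt by the decompressor.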
4.1  Overall Structure of a Specification

   A ROHC-FN specification consists of a sequence of zero or more
   constant definitions (Section 4.2), an optional global control field
   list (Section 4.9.5) and one or more encoding method definitions,
   given in the form of structures (Section 4.9).

   Structures define an encoding method by giving one or more formats
   for uncompressed packets and one or more formats for compressed
   packets. These formats are linked by so-called fields, each of which
   describes a certain part of an uncompressed and/or a compressed
   format. The properties of a field are defined by defining an
   encoding method for it and/or by use of "let" statements.

   Encoding methods can be defined in FN using a structure or can be
   predefined encoding methods. Predefined encoding methods can be
   defined in the text accompanying a formal specification or they can
   be those defined in the present document.

   Each encoding method and each constant has an identifier. All of
   these identifiers have global scope. It is illegal to have multiple
   instances of the same identifier. It is also illegal to use any of
   the following as identifiers for encoding methods:

   o  "let", "this"

   o  "control_fields", "default_methods"

   o  "uncomp_hdr_start", "uncomp_length", "uncomp_value"

   o  "comp_hdr_start", "comp_length", "comp_value"

   o  identifiers starting either with "uc_format" or "co_format"

4.2  Constant Definitions

   Constant values can be defined using the "=" operator. Identifiers
   for constants must be all upper case. For example:

      SOME_CONSTANT = 3;

   Constants are defined by an expression on the right hand side of the
   "=" operator. The expression must yield a constant value. That is,
   the expression must be one whose terms are all either constants or
   literals and not structure parameters or field attributes (see
   Section 4.4).

   Constants have global scope.
   Constants must be defined at the top level, outside of any structure
   definition (noting that "=" has a different meaning inside a
   structure; see Section 4.9).

   Because the FN is referentially transparent, constants are entirely
   equivalent to the value they refer to and are completely
   interchangeable with that value. Similarly, since the language has
   no side effects, a constant may never change its value.

4.3  Attributes

   In ROHC-FN, the properties of a field are defined by an encoding
   method. The encoding method's formal semantics are specified using a
   set of attributes. This set of attributes entirely characterises the
   relationship between the uncompressed and compressed representation
   of a field. Both of these representations are bit strings.

   The notation defines six attributes, three for the uncompressed
   field and a corresponding three for the compressed field. The
   attributes available for each field are as follows:

   uncompressed attributes of a field:

   o  "uncomp_value", "uncomp_length" and "uncomp_hdr_start",

   compressed attributes of a field:

   o  "comp_value", "comp_length" and "comp_hdr_start".

   The two value attributes contain the respective numerical values of
   the field, i.e. "uncomp_value" gives the numerical value of the
   uncompressed aspect of the field, and the attribute "comp_value"
   gives the numerical value of the compressed aspect of the field. The
   numerical values are derived by interpreting the bit string in the
   field as an unsigned binary number, most-significant bit first.

   The two length attributes indicate the length in bits of the
   associated bit string; "uncomp_length" for the uncompressed
   representation, and "comp_length" for the compressed representation.
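   As a non-normative illustration of the value and length attributes,
   the following Python sketch derives both from a bit string. The
   class name Field, the helper value_and_length and the use of None
   for an undefined attribute are assumptions made for this sketch; the
   six attribute names are those listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Field:
    """The six attributes of a field; None stands for 'undefined'."""
    uncomp_value: Optional[int] = None
    uncomp_length: Optional[int] = None
    uncomp_hdr_start: Optional[int] = None
    comp_value: Optional[int] = None
    comp_length: Optional[int] = None
    comp_hdr_start: Optional[int] = None


def value_and_length(bit_string):
    """Read a non-empty string of '0'/'1' as an unsigned number, MSB first."""
    return int(bit_string, 2), len(bit_string)


# An 8-bit field holding the bits 01000000 (for instance a TTL of 64).
value, length = value_and_length("01000000")
ttl = Field(uncomp_value=value, uncomp_length=length)
```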
   Finally, the two "hdr_start" attributes indicate the offset in bits
   of the start of the field from the start of the header;
   "uncomp_hdr_start" for the position in the uncompressed header, and
   "comp_hdr_start" for the position of the field in the compressed
   header.

   Attributes are undefined unless they are bound to a value, in which
   case they become defined. The defined value of an attribute cannot
   be changed; bindings are permanent in the FN. Defined values are
   required for all compressed attributes of fields which appear in the
   compressed header and for all uncompressed attributes of fields
   which appear in the uncompressed header.

   If two conflicting bindings are given for a field attribute then the
   binding fails along with the packet format in which the binding was
   defined.

   Note that uncompressed attributes do not always reflect an aspect of
   the uncompressed header. Some fields do not originate from the
   uncompressed header, but are control fields. In particular note that
   the "uncomp_hdr_start" attribute has no useful meaning if the field
   is a control field (see Section 4.9.5).

4.3.1  Attribute References

   Attributes of a particular field are referred to formally by using
   the field's name followed by a ":" and the attribute's identifier.
   For example:

      ip_id_behavior:uncomp_value

   gives the uncompressed value of the ip_id_behavior field.

4.4  Expressions

   ROHC-FN includes the usual infix style of expressions, with
   parentheses "(" and ")" used for grouping. Expressions can be made
   up of any of the components described in the following subsections.

   In summary, the semantics of expressions are generally as in the C
   programming language, with the following additions and exceptions:

   o  There is no limit on the range of integers.

   o  For modulo, the expression "mod(k,v)" is used instead of C
      language "k % v". Note that the '%' is a comment character in
      ROHC-FN.
   o  "x ^ y" evaluates to x raised to the power of y.

   o  "log2(w)" evaluates to the smallest integer k where w <= 2^k,
      i.e. it returns the smallest number of bits in which value w can
      be stored.

   Expressions may refer to any of the attributes of each field (as
   described in Section 4.3), and also to any defined constant (see
   Section 4.2).

   If any of the attributes or constants used in the expression are
   undefined, the value of the expression is undefined. Undefined
   expressions cause the environment (e.g. the packet format) in which
   they are used to fail if a defined value is required. Defined values
   are required for all compressed attributes of fields which appear in
   the compressed header and for all uncompressed attributes of fields
   which appear in the uncompressed header.

   Note that expressions cannot be used as encoding methods directly
   because they do not completely characterise a field. Expressions
   only specify a single value whereas a field is made up of several
   values: its attributes. If, for example, an expression were used to
   define the uncompressed value of a field, the length of the
   uncompressed field would be undefined at the decompressor. For
   example, the following is illegal:

      tcp_list_length ::= (data_offset + 20) / 4;

4.4.1  Integer Literals

   Integers can be expressed as decimal values, binary values (prefixed
   by 0b), or hexadecimal values (prefixed by 0x). Negative integers
   are prefixed by a "-" sign (note that there is no unary minus
   operator).

4.4.2  Boolean Literals

   The boolean literals are "false", which has a value of 0, and
   "true", which has a value of 1.

4.4.3  Boolean Operators

   The following "boolean" operators are available, which take boolean
   arguments and return a boolean result:

   o  &&, for logical "and". Returns true if both boolean1 and boolean2
      are true. Returns false otherwise.

   o  ||, for logical "or".
      Returns true if at least one of boolean1 or boolean2 is true.
      Returns false otherwise.

   o  !, for logical "not". Returns true if boolean is false. Returns
      false otherwise.

4.4.4  Integer Operators

   The following "integer" operators are available, which take integer
   arguments and return an integer result:

   o  ^, for exponentiation. "x ^ y" returns the value of "x" to the
      power of "y".

   o  *, / for multiplication and division. "x * y" returns the product
      of "x" and "y". "x / y" returns the quotient, rounded down to the
      next lowest integer.

   o  +, - for addition and subtraction. "x + y" returns the sum of "x"
      and "y". "x - y" returns the difference.

   o  mod(k, v) for modulo. "mod(x,y)" returns "x" modulo "y", i.e.
      x - y * (x / y).

   o  log2(w) for logarithm to base 2. "log2(x)" returns the smallest
      integer k where x <= 2^k, i.e. it returns the smallest number of
      bits in which value x can be stored.

4.4.5  Comparison Operators

   The following "comparison" operators are available, which take
   integer arguments and return a boolean result:

   o  ==, !=, for equality and its negative. "x == y" returns true if x
      is equal to y. Returns false otherwise. "x != y" returns true if
      x is not equal to y. Returns false otherwise.

   o  <, >, for less than and greater than. "x < y" returns true if x
      is less than y. Returns false otherwise. "x > y" returns true if
      x is greater than y. Returns false otherwise.

   o  >=, <=, for greater than or equal and less than or equal, the
      inverse functions of <, >. "x >= y" returns false if x is less
      than y. Returns true otherwise. "x <= y" returns false if x is
      greater than y. Returns true otherwise.

4.5  let Statements

   A "let" statement takes a boolean expression as a parameter. It can
   be used to assert that the expression has a specific value, in order
   to choose a particular packet format from a list of possible formats
   specified in a structure (see Section 4.9):

      let (<boolean_expression>)
   A "let" statement must only be used inside a field encodings list
   (see Section 4.9).

   There are three possible results when an expression is asserted in a
   let statement:

   o  The boolean expression evaluates to false, in which case the
      assertion fails,

   o  All terms in the boolean expression are defined and it evaluates
      to true, in which case the assertion succeeds,

   o  Some or all of the terms in the boolean expression are undefined.
      If the undefined terms had the correct values the expression
      would evaluate to true. In this case the undefined terms become
      bound by the expression and the assertion succeeds.

   If asserting the boolean expression fails, the packet format
   containing the expression fails, i.e. the packet format it belongs
   to cannot be selected by the compressor.

   "let" is a reserved word.

4.6  Comments

   Comments do not affect the formal meaning of what is notated, but
   can be used to improve readability. Their use is optional.

   Free English text can be inserted into a profile definition to
   explain why something has been done a particular way, to clarify the
   intended meaning of the notation, or to elaborate on some point. To
   this end, the two commenting styles described in the subsections
   below can be used.

   Comments may help provide clarifications to the reader, and serve
   different purposes to implementers. Comments should thus not be
   considered of lesser importance when inserting them into the formal
   definition of a profile; they should be consistent with the
   normative part of the profile.

4.6.1  End of line comments

   The end of line comment style makes use of the "%" comment
   character. Any text between the "%" character and the end of the
   line has no formal meaning. For example:
   %-----------------------------------------------------------------
   % IR-REPLICATE packet formats
   %-----------------------------------------------------------------

   % The following fields are included in all of the IR-REPLICATE
   % packet formats:
   %
   uc_format = discriminator,     % [ 8 ] bits
               tcp.seq_number,    % [ 32 ] bits
               tcp.flags.ecn,     % [ 2 ] bits

4.6.2 Block comments

The block comment style makes use of the "/*" and "*/" delimiters. Any text between the "/*" and the "*/" has no formal meaning. For example:

   /******************************************************************
    * IR-REPLICATE packet formats
    *****************************************************************/

   /* The following fields are included in all of the IR-REPLICATE
    * packet formats:
    */
   uc_format = discriminator,     /* 8 bits */
               tcp.seq_number,    /* 32 bits */
               tcp.flags.ecn,     /* 2 bits */

The block comment style allows comments to be nested, unlike some programming languages such as C, C++ or Java.

4.7 Library Encoding Methods

ROHC (RFC 3095 [4]) contains a number of different techniques for compressing header fields (LSB encoding, value encoding, etc.). Most of these techniques are part of the ROHC-FN library so that they can be reused when creating new ROHC profiles. The notation for these is described below.

Encoding methods can be defined using structures (see Section 4.9). It is also possible for a profile to define its own set of encoding methods using the formal notation or using a textual definition.

4.7.1 uncompressed_value

The "uncompressed_value" encoding method is used to encode header fields for which the uncompressed value can be defined using a mathematical expression (including constant values):
   field ::= uncompressed_value(<uncomp_length_expression>,
                                <uncomp_value_expression>);

where the value of the "uncomp_length_expression" binds with the field's "uncomp_length" attribute, and the value of the "uncomp_value_expression" binds with the field's "uncomp_value" attribute.

The "comp_length" attribute is bound to zero since the field does not appear in the compressed header. Note however that it is still legal to refer to the field in a compressed format field order list, where it has a length of zero. The "comp_value" attribute is not bound by this encoding method.

As an example of the usage of "uncompressed_value" encoding, the IPv6 header version number is a four-bit field that always has the value 6:

   version ::= uncompressed_value (4, 6);

Another example of value encoding, using an expression to calculate the length:

   padding ::= uncompressed_value(nbits - 8, 0);

4.7.2 compressed_value

The "compressed_value" encoding method is used to define fields in the compressed header for which there is no counterpart in the uncompressed header. It can be used to set compressed fields whose value can be defined using a mathematical expression (including constant values):

   field ::= compressed_value(<comp_length_expression>,
                              <comp_value_expression>);

where the value of the "comp_length_expression" binds with the field's "comp_length" attribute, and the value of the "comp_value_expression" binds with the field's "comp_value" attribute.

The "uncomp_length" attribute is bound to zero since the field does not appear in the uncompressed header. Note however that it is still legal to refer to the field in an uncompressed format field order list, where it has a length of zero. The "uncomp_value" attribute is not bound by this encoding method.

One possible use of this encoding method is to define padding in the compressed header:

   pad_to_octet_boundary ::= compressed_value (3, 0);
A more common use is to define a discriminator field to make it possible to differentiate between different packet formats within a structure. For convenience, the notation provides syntax for specifying value encoding in the form of a binary string. The binary string to be encoded is simply given in single quotes. For example:

   discriminator ::= '01101';

This has exactly the same meaning as:

   discriminator ::= compressed_value(5, 13);

4.7.3 irregular

The "irregular" encoding method is used to encode a field in the compressed packet with a bit pattern identical to the original field in the uncompressed packet, e.g.:

   field ::= irregular (<expression>);

where the value of "expression" binds with the "uncomp_length" attribute of the field.

For example, the checksum field of the TCP header is a sixteen-bit field that does not follow any pattern:

   tcp_checksum ::= irregular (16);

The expression can be used to derive the length of the field from the value of another field, and the length does not have to be constant.

4.7.4 static

The "static" encoding method compresses a field whose length and value are the same as for a previous header in the flow, i.e. where the field completely matches an existing entry in the context:

   field ::= static;

The field's "uncomp_value" and "uncomp_length" attributes bind with their respective values in the context.

Since the field value is the same as a previous field value, the entire field can be reconstructed from the context, so it is compressed to zero bits and does not appear in the compressed header.

For example, the source port of the TCP header is a field whose value
does not change from one packet to the next for a given flow:

   src_port ::= static;

4.7.5 lsb

The Least Significant Bit encoding method, "lsb", compresses a field whose value differs by a small amount from the value stored in the context.

   field ::= lsb (num_lsbs_param, offset_param);

Here, "num_lsbs_param" is the number of least significant bits to use, and "offset_param" is the interpretation interval offset. The parameter "num_lsbs_param" binds with the "comp_length" attribute, and the "uncomp_value" attribute binds with (context_value - offset_param + comp_value).

The "lsb" encoding method can compress a field whose value lies between (context_value - offset_param) and (context_value - offset_param + 2^num_lsbs_param - 1) inclusive. In particular, if offset_param = 0 then the field value can only stay the same or increase relative to the previous header in the flow. If offset_param = -1 then it can only increase, whereas if offset_param = 2^num_lsbs_param then it can only decrease.

The compressor may not be able to determine the exact context value that will be used by the decompressor, since some packets that would have updated the context may have been lost or damaged. However, from feedback received or by making assumptions, the compressor can limit the candidate set of values. The compressor then chooses an encoding such that no matter which context value in the candidate set the decompressor uses, the resulting decompression is correct. If that is not possible, the lsb encoding method fails (which typically results in a less efficient packet format being chosen by the compressor). As "reasonable" assumptions may not always be correct, lsb encoding is intended to be used in conjunction with methods that validate the output of the decompression process, such as the crc method described in Section 4.7.6.
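The interval and candidate-set rules above can be sketched in a few lines of Python. This is only an illustration, not part of the notation: the function names are hypothetical, values are treated as unbounded integers (no wrap-around at the field width), and "comp_value" is assumed to carry the low-order bits of the uncompressed value, as in the LSB encoding of RFC 3095 [4]. If "comp_value" is instead read as a plain offset from the lower bound of the interval, as the binding formula above suggests, decoding reduces to adding it to that lower bound.

```python
def lsb_interval(context_value, num_lsbs, offset):
    """Return the inclusive (low, high) interpretation interval."""
    low = context_value - offset
    return low, low + (1 << num_lsbs) - 1


def lsb_encode(value, candidate_contexts, num_lsbs, offset):
    """Return the num_lsbs low-order bits of value, or None on failure.

    The encoding only succeeds if value lies inside the interpretation
    interval of every context value the decompressor might be using.
    """
    for context_value in candidate_contexts:
        low, high = lsb_interval(context_value, num_lsbs, offset)
        if not low <= value <= high:
            return None  # lsb fails; another packet format is needed
    return value & ((1 << num_lsbs) - 1)


def lsb_decode(comp_value, context_value, num_lsbs, offset):
    """Return the one value in the interval whose low bits match."""
    low, _ = lsb_interval(context_value, num_lsbs, offset)
    return low + (comp_value - low) % (1 << num_lsbs)
```

With a context value of 1000 and lsb(4, 0), the interval is 1000 to 1015, so 1003 can be encoded while 999 cannot. Passing several candidate context values models the case where the compressor is unsure which context the decompressor holds.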
The compressed field takes up the specified number of bits in the compressed header (i.e. num_lsbs_param).

For example, the TCP sequence number:

   tcp_sequence_number ::= lsb (14, 8192);

See the ROHC specification (RFC 3095 [4]) for additional details on LSB encoding, where the parameter "k" corresponds to the parameter "num_lsbs_param" and where interpretation interval offset "p" corresponds to the parameter "offset_param".

4.7.6 crc

The "crc" encoding method provides a CRC calculated over a block of data. The block of data is represented using either the "uncomp_value" or "comp_value" attribute of a field. The "crc" method takes a number of parameters:

o  the number of bits for the CRC (num_bits),

o  the bit-pattern for the polynomial (bit_pattern),

o  the initial value for the CRC register (initial_value),

o  the value of the block of data (block_data_value); and

o  the size in octets of the block of data (block_data_length).

I.e.:

   field ::= crc (num_bits, bit_pattern, initial_value,
                  block_data_value, block_data_length);

The CRC is calculated in least significant bit (LSB) order.

The following CRC polynomials are defined in RFC 3095 [4], in Sections 5.9.1 and 5.9.2:

   8-bit   C(x) = x^0 + x^1 + x^2 + x^8               bit_pattern = 0xe0
   7-bit   C(x) = x^0 + x^1 + x^2 + x^3 + x^6 + x^7   bit_pattern = 0x79
   3-bit   C(x) = x^0 + x^1 + x^3                     bit_pattern = 0x06

For example:

   % 3 bit CRC, C(x) = x^0 + x^1 + x^3
   crc_field ::= crc(3, 0x6, 0xF, this:comp_value, this:comp_length);

4.8 Profile-specific Encoding Methods

The library of encoding methods provides a basic and generic set of field encoding methods. Some additional encodings specific to a particular protocol may however be needed, such as for methods that infer the value of a field from other values.
These methods are defined based on the properties of the protocol being compressed.

Profiles may define additional encoding methods; the scope of these methods is then local to the profile definition itself, and they can be used as part of the formal definition of the profile like any other method from the library (see Section 4.7).

Profile-specific encoding methods must be rigorously defined, either using the ROHC-FN syntax or in plain text, as long as the definition provides enough information to unambiguously implement the encoding method in the compressor and the decompressor. These methods should be no less complete than the methods provided herein.

4.9 Structures

Structures are used for defining new encoding methods in a formal specification. They compose groups of individual fields into contiguous blocks.

Structures can be thought of as compound encoding methods; they have names and may have parameters and can be used in the same way as any other encoding method. Since structures can contain references to other structures, complicated headers can be broken down into manageable pieces.

This section describes the various features of structures, starting out with the simplest.

4.9.1 "this"

Within a structure it is possible to refer to the field it is encoding, using the keyword "this". This is useful for gaining access to the attributes of the field being encoded. For example it is often useful to know the total uncompressed length of the header which is being encoded.

4.9.2 Simple Structures

A structure can be used to specify a single fixed encoding. This is its simplest form. For example:

   compound_encoding_method ===
   {
     uc_format = field_1,    % [ 4 ]
                 field_2;    % [ 12 ]

     co_format = field_2,    % [ 0 ]
                 field_1     % [ 4 ]
     {
       field_1 ::= irregular (4);
       field_2 ::= uncompressed_value (12, 9);
     };
   };

The above begins with the structure's identifier, "compound_encoding_method".
The identifier is followed by "===", which indicates that this is a structure definition. The definition of the structure then follows inside curly braces, "{" and "}". The first item in the definition is the "uc_format" field order list, which gives the order of the fields in the uncompressed header. This is followed by the compressed header field order list. This list is in turn followed by the field encodings list for the compressed header, which gives the encoding method for each field. The different components of this example are described in more detail below.

4.9.2.1 Uncompressed Format

The uncompressed field order list is defined by "uc_format", which specifies the fields of the uncompressed header in the order that they appear in the uncompressed header.

In the example, this is "field_1" followed by "field_2". This means that a field being encoded by this structure is divided into two subfields, "field_1" and "field_2". The total uncompressed length of these two fields therefore equals the length of the field being encoded. Formally:

   field_1:uncomp_length + field_2:uncomp_length == this:uncomp_length

In the example we have just two fields but any number of subfields may be used. This relationship applies to however many fields are actually used.

Note that the arrangement of fields specified in the uncompressed field order list is up to the notator. Any arrangement of fields that correctly describes the content of the uncompressed header may be chosen -- this need not be the same as the one described in the specifications for the protocol header being compressed. However, the bits of the uncompressed format must remain in the same order.

For example, there may be a protocol whose header contains a 16-bit sequence number, but whose sessions tend to be short lived. This would mean that the high bits of the sequence number are almost always constant.
The "uc_format" could reflect this by splitting the original uncompressed field into two fields, one field to represent the almost-always-zero part of the sequence number, and a second field to represent the significant part.

An uncompressed format may contain a field encodings list. Encoding methods specified therein are used whenever a packet with that uncompressed format is being encoded. The encoding of a packet with a given uncompressed format can only succeed if all of its encoding methods and let statements succeed (see Section 4.5).

The total length of an uncompressed header must be defined. The length of each of the fields in an uncompressed header must also be defined. This means that the bindings in the "uc_format", "co_format" and "default_methods" (see below) field encodings lists must between them define the "uncomp_length" attribute of every field in an uncompressed header so that there is an unambiguous mapping from the bits in the uncompressed header to the fields listed in each "uc_format" field order list.

4.9.2.2 Compressed Format

Similar to the uncompressed field order list, the compressed data will appear in the order specified by the compressed field order list given for a compressed format. Each individual field is encoded in the manner given for that field in the field encodings list, which is in braces and follows immediately after the compressed field order list. The total length of the compressed data will be the total of the compressed lengths of all the individual fields. The annotation for these fields indicates that they are zero and 4 bits long, making a total of 4 bits.

Note that the order of the fields specified in a compressed format field order list does not have to match the order in which they appear in the "uc_format" field order list. It may be desirable to reorder the fields in the compressed header to align the compressed header to an octet boundary, or for other reasons.
In the above example, the order is in fact the opposite of that in the uncompressed header.

The field encodings list specifies that the encoding for "field_1" is "irregular", which takes up four bits in both the compressed header and the uncompressed header. The encoding for "field_2" is "uncompressed_value", which means that the field has a fixed value, so it can be compressed to zero bits. The value it takes is 9, and it is 12 bits wide in the uncompressed header.

Fields like "field_2", which compress to zero bits in length, may be omitted from the compressed field order list. This is because their position in the list is not significant. So, without changing the meaning, the above example could be notated as follows:

   compound_encoding_method ===
   {
     uc_format = field_1,    % [ 4 ]
                 field_2;    % [ 12 ]

     co_format = field_1     % [ 4 ]
     {
       field_1 ::= irregular (4);
       field_2 ::= uncompressed_value (12, 9);
     };
   };

The total length of a compressed header must be defined. The length of each of the fields in a compressed header must also be defined. This means that the bindings in the "uc_format", "co_format" and "default_methods" (see below) field encodings lists must between them define the "comp_length" attribute of every field in a compressed header so that there is an unambiguous mapping from the bits in the compressed header to the fields listed in each "co_format" field order list.

4.9.3 Arguments and Structures

Structures may take arguments, which have some control over the mapping between compressed and uncompressed fields. These are specified immediately after the structure name, in parentheses, as a comma separated list. For example:

   poor_mans_lsb(variable_length) ===
   {
     uc_format = constant_bits,
                 variable_bits;

     co_format = variable_bits
     {
       constant_bits ::= static;
       variable_bits ::= irregular(variable_length);
     };
   };

As with any encoding method, all arguments are values, rather than fields.
Although entire fields cannot be passed as arguments, it is possible to pass their attributes instead.

4.9.4 Multiple Formats

Structures can also define multiple formats for a given header. This allows different compression methods to be used depending on what is the most efficient way of compressing a particular header.

For example, a field may have a fixed value most of the time, but the fixed value may occasionally change. Using a single format for the structure, this field would have to be encoded using "irregular" (see Section 4.7.3), even though the value only changes rarely. However, by using the structure to define multiple formats, we can provide two alternative encodings; one for when the value remains fixed and another for when the value changes. This is the topic of the following sub-sections.

4.9.4.1 Naming Convention

When compressed formats are defined, they must be defined using names beginning with the reserved prefix "co_format". Similarly, uncompressed formats must be defined using names beginning with "uc_format". Format names must be unique within the structure to which they belong.

4.9.4.2 Format Discrimination

Each of the compressed formats has its own field order list and field encodings list. A compressor may pick any of these alternative formats to compress a header, as long as the field encodings it employs can be used with the uncompressed header. For example, the compressor could not choose to use a compressed format that had a "static" encoding for a field whose value had just changed.

More formally, the compressor can choose any combination of an uncompressed format and a compressed format for which all fields "succeed", i.e. the encoding methods and let-statements succeed (see Section 4.5). If there are multiple successful combinations, the compressor can choose any one.
Otherwise, if there is no successful combination, the encoding method defined by the structure "fails".

Because the compressor has a choice, it must be possible for the decompressor to discriminate between the different packet formats. A simple approach to this problem is for each compressed format to include a "discriminator" that uniquely identifies that particular "co_format". A discriminator is a control field; it is not derived from any of the uncompressed field values (see Section 4.7.2).

4.9.4.3 Default Encoding Methods - default_methods

When using multiple packet formats, default bindings may be specified for each field or attribute. The default encoding methods specify the encoding method to use for a field if no encoding method is specified for that field elsewhere. This is helpful to keep the definition of the packet formats concise, as the same encoding method need not be repeated for every format.

The syntax for specifying default bindings is similar to that used to specify a compressed or uncompressed format. However, there is no field order list for the default encoding methods; only the field encodings list is given. The field order is specified individually for each "co_format" and "uc_format". For example:

   default_methods =
   {
     field_1 ::= uncompressed_value (4,1);
     field_2 ::= uncompressed_value (4,2);
     field_3 ::= lsb(3,-1);
     let(field_4:uncomp_length == 4);
   };

Fields for which there is a default encoding method do not need to be specified in the field encodings list of any format that uses the default encoding method for that field. The default encoding method for a field may be overridden by specifying explicitly an encoding method for that field.
If a default encoding method is not overridden, and that encoding method always compresses the field down to zero bits, then the field can also be omitted from the compressed format field order list, since, like any other zero bit field, its position in the field order list is not significant.

The field encodings list of default_methods may also contain default bindings for individual attributes by using "let" statements. If a default binding is given for an individual attribute, that binding may be overridden by another binding for that attribute or the field to which it belongs. The overriding binding may either be another let statement, or an encoding method.

It is allowed to override one default binding but still use another. Overriding one default binding does not imply that other default bindings are also being overridden. It is also allowed to supply default bindings for some but not all fields.

Note that any and all default methods can be overridden. Therefore to notate that a "let" statement or encoding method must be applied to every compressed format of a structure, the "uc_format" field encodings list(s) should be used. "uc_format" field encodings lists can not be overridden.
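The override rules above can be sketched as a small Python function. This is an illustration under stated assumptions, not part of the notation: encoding methods are modelled as opaque strings, attribute bindings from "let" statements as a mapping keyed by (field, attribute), and the names "effective_bindings", "DEFAULT_METHODS" and "DEFAULT_ATTRS" are hypothetical. It also assumes that any encoding method given for a field in a format replaces the default attribute bindings of that field.

```python
# Defaults taken from the default_methods example in this section.
DEFAULT_METHODS = {
    "field_1": "uncompressed_value(4,1)",
    "field_2": "uncompressed_value(4,2)",
    "field_3": "lsb(3,-1)",
}
DEFAULT_ATTRS = {("field_4", "uncomp_length"): 4}


def effective_bindings(default_methods, default_attrs,
                       format_methods, format_attrs):
    """Return the (methods, attrs) in effect for a single format."""
    # A method given in the format replaces the default for that field
    # only; defaults for all other fields remain in use.
    methods = {**default_methods, **format_methods}
    # A default attribute binding is dropped when the format binds the
    # field it belongs to (assumption stated in the lead-in).
    attrs = {
        key: value
        for key, value in default_attrs.items()
        if key[0] not in format_methods
    }
    # A binding for the same attribute in the format takes precedence.
    attrs.update(format_attrs)
    return methods, attrs
```

For instance, a format that only specifies a method for "field_3" still uses the defaults for "field_1" and "field_2" and keeps the default "uncomp_length" binding of "field_4".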
4.9.4.4 Example of Multiple Formats

Putting this all together, here is a complete example of a structure with multiple compressed formats:

   test_multiple_formats ===
   {
     uc_format = field_1,    % [ 4 ]
                 field_2,    % [ 4 ]
                 field_3;    % [ 24 ]

     default_methods =
     {
       field_1 ::= static;
       field_2 ::= uncompressed_value(4, 2);
       field_3 ::= lsb(4, 0);
     };

     co_format_0 = discriminator,    % [ 1 ]
                   field_3           % [ 4 ]
     {
       discriminator ::= '0';
     };

     co_format_1 = discriminator,    % [ 1 ]
                   field_1,          % [ 4 ]
                   field_3           % [ 24 ]
     {
       discriminator ::= '1';
       field_1 ::= irregular(4);
       field_3 ::= irregular(24);
     };
   };

Note the following:

o  "field_1" and "field_3" both have default encoding methods specified for them, which are used in "co_format_0" but are overridden in "co_format_1"; the default for "field_2", however, is not overridden.

o  "field_1" and "field_2" have default encoding methods which compress to zero bits. When these are used in "co_format_0", the field names do not appear in either the field order list or in the field encodings list.

o  "field_3" has an encoding method which does not compress to zero bits, so whilst "field_3" is absent from the field encodings list of "co_format_0", it still needs to appear in the field order list to specify whereabouts it goes in the compressed packet.

o  in the example, all the uncompressed header fields have default encoding methods specified for them, but this is not a requirement. It is perfectly allowable to only specify default encodings for some or even none of the uncompressed header fields.

o  in the example all the default encoding methods are on fields from the uncompressed header, but this is not a requirement. It is perfectly allowable to specify default encoding methods for control fields.
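As a cross-check of the bit counts annotated in the example, the following Python sketch computes the compressed length of each format. It is an illustration only: encoding methods are modelled as tuples, the one-bit discriminators '0' and '1' are written as their "compressed_value" equivalents, and the helper names ("comp_length", "format_length") are hypothetical.

```python
# Default encoding methods of test_multiple_formats.
DEFAULT_METHODS = {
    "field_1": ("static",),
    "field_2": ("uncompressed_value", 4, 2),
    "field_3": ("lsb", 4, 0),
}

# Each format: (field order list, field encodings list).
CO_FORMATS = {
    "co_format_0": (
        ["discriminator", "field_3"],
        {"discriminator": ("compressed_value", 1, 0)},
    ),
    "co_format_1": (
        ["discriminator", "field_1", "field_3"],
        {
            "discriminator": ("compressed_value", 1, 1),
            "field_1": ("irregular", 4),
            "field_3": ("irregular", 24),
        },
    ),
}


def comp_length(method):
    """Return the number of compressed bits produced by a method."""
    name, *args = method
    if name in ("static", "uncompressed_value"):
        return 0  # these methods compress the field to zero bits
    if name in ("lsb", "irregular", "compressed_value"):
        return args[0]  # the first parameter is the length in bits
    raise ValueError(f"unknown encoding method: {name}")


def format_length(format_name):
    """Sum the compressed lengths over a format's field order list."""
    order, overrides = CO_FORMATS[format_name]
    methods = {**DEFAULT_METHODS, **overrides}
    return sum(comp_length(methods[field]) for field in order)
```

Under these assumptions "co_format_0" occupies 1 + 4 = 5 bits and "co_format_1" occupies 1 + 4 + 24 = 29 bits, which is why a compressor would normally prefer "co_format_0" whenever its "static" and "lsb" encodings succeed.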
4.9.5 Control Fields

Control fields are defined using the "control_fields" list, which specifies any fields that do not appear in the uncompressed header but which have an uncompressed value (specifically those with a non-zero uncomp_length). Such fields may be used to help compress fields from the uncompressed header more efficiently. A control field could be used to improve efficiency by representing some commonality between a number of the uncompressed fields, or by representing some information about the flow that is not explicitly contained in the protocol headers.

For example in IP, the behaviour of the IP ID field in a flow varies depending on how the endpoints handle IP IDs. Sometimes the behaviour is effectively random, sometimes the IP ID follows a predictable sequence, and at other times it stays fixed at zero. This information is never communicated explicitly in the uncompressed header, but to compress the field efficiently, its behaviour must be communicated somehow. A control field is used:

   ipv4 ===
   {
     uc_format = version,         %[ 4 ]
                 hdr_length,      %[ 4 ]
                 protocol,        %[ 8 ]
                 tos_tc,          %[ 6 ]
                 ip_ecn_flags,    %[ 2 ]
                 ttl_hopl,        %[ 8 ]
                 df,              %[ 1 ]
                 mf,              %[ 1 ]
                 rf,              %[ 1 ]
                 frag_offset,     %[ 13 ]
                 ip_id,           %[ 16 ]
                 src_addr,        %[ 32 ]
                 dst_addr,        %[ 32 ]
                 checksum,        %[ 16 ]
                 length;          %[ 16 ]

     control_fields = ip_id_behavior; %[ 2 ]
     :
     :
   };

The control_fields list is equivalent to the "uc_format" field order list for fields that do not appear in the uncompressed header, that is it defines a field that has the same properties (the same attributes etc) as fields appearing in the uncompressed header.

Control fields are initialised by using the appropriate encoding methods and/or by using let statements. For example this control field is used to scale down a field in the uncompressed header by a factor of 8 before encoding it with LSB encoding. Scaling it down makes the LSB encoding more efficient.
   example_struct() ===
   {
     uc_format = field_1;

     control_fields = ctrl_field;

     co_format = ctrl_field
     {
       let(ctrl_field:uncomp_value == field_1:uncomp_value / 8);
       let(ctrl_field:uncomp_length == field_1:uncomp_length - 3);
       ctrl_field ::= lsb(4, 0);
     };
   };

Control fields may also be used with global scope. In this case their declaration must be outside of any structure. They are then visible within any structure.

5. Security considerations

This draft describes a formal notation similar to ABNF (RFC 2234 [3]), and hence is not believed to raise any security issues.

6. Contributors

Although no longer listed as an author, Richard Price did almost all of the foundational work on the formal notation and also produced the original formal notation internet draft on which this document is based. Many thanks to him for doing that groundwork on which this document stands.

7. Acknowledgements

A number of important concepts and ideas have been borrowed from ROHC (RFC 3095 [4]).

Thanks to Mark West, Eilert Brinkmann and particularly Kristofer Sandlund for their cooperation and feedback from notating the TCP profile, and also for their review comments. Thanks to Rob Hancock and Stephen McCann for early work on the formal notation. The authors would also like to thank Christian Schmidt, Qian Zhang, Hongbin Liao, Max Riegel and Lars-Erik Jonsson for their comments and valuable input. Finally, thanks to Caroline Daniels and Alan Finney for doing some excellent last minute review work.

8. References

[1] Bradner, S., "The Internet Standards Process -- Revision 3", BCP 9, RFC 2026, October 1996.

[2] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[3] Crocker, D. and P. Overall, "Augmented BNF for Syntax Specifications: ABNF", RFC 2234, November 1997.
[4] Bormann, C., Burmeister, C., Degermark, M., Fukushima, H., Hannu, H., Jonsson, L-E., Hakenberg, R., Koren, T., Le, K., Liu, Z., Martensson, A., Miyazaki, A., Svanbro, K., Wiebke, T., Yoshimura, T. and H. Zheng, "RObust Header Compression (ROHC): Framework and four profiles: RTP, UDP, ESP, and uncompressed", RFC 3095, July 2001.

Authors' Addresses

   Robert Finking
   Siemens/Roke Manor
   Roke Manor Research Ltd.
   Romsey, Hampshire SO51 0ZN
   UK

   Phone: +44 (0)1794 833189
   Email: robert.finking@roke.co.uk
   URI:   http://www.roke.co.uk

   Carsten Bormann
   Universitaet Bremen TZI
   Postfach 330440
   Bremen D-28334
   Germany

   Phone: +49 421 218 7024
   Fax:   +49 421 218 7000
   Email: cabo@tzi.org

   Ghyslain Pelletier
   Ericsson AB
   Box 920
   Luleå SE-971 28
   Sweden

   Phone: +46 (0) 8 404 29 43
   Email: ghyslain.pelletier@ericsson.com

Appendix A. Syntax

This section gives a formal definition of the ROHC-FN syntax in ABNF (see RFC 2234 [3]).

A.1 Reserved Keywords

Some keywords are defined and reserved in ROHC-FN. These keywords cannot be reused as identifiers by the notator.
o  co_format - struct syntax
o  comp_hdr_start - attribute
o  comp_length - attribute
o  comp_value - attribute
o  compressed_value - primitive encoding method
o  default_methods - struct syntax
o  irregular - primitive encoding method
o  let - primitive encoding method
o  lsb - primitive encoding method
o  static - primitive encoding method
o  uc_format - struct syntax
o  uncomp_hdr_start - attribute
o  uncomp_length - attribute
o  uncomp_value - attribute
o  uncompressed_value - primitive encoding method

   reserved_word ::= primitive_encoding_method_name |
                     attribute_identifier |
                     struct_reserved_words

A.2 Characters

Because ABNF [3] symbols are case insensitive, it is necessary to define explicit symbols for each of the lower case characters which we use in the reserved words of our grammar. Fortunately there are no fundamental components of the FN syntax which are in upper case, otherwise we would have to define each capital letter separately also.

   a = %x61
   b = %x62
   c = %x63
   d = %x64
   e = %x65
   f = %x66
   g = %x67
   h = %x68
   i = %x69
   j = %x6a
   k = %x6b
   l = %x6c
   m = %x6d
   n = %x6e
   o = %x6f
   p = %x70
   q = %x71
   r = %x72
   s = %x73
   t = %x74
   u = %x75
   v = %x76
   w = %x77
   x = %x78
   y = %x79
   z = %x7a

   lower-case-letter = %x61-7a ; a-z
   upper-case-letter = %x41-5a ; A-Z
   binary-digit      = "0" / "1"
   octal-digit       = binary-digit / "2" / "3" / "4" / "5" / "6" / "7"
   decimal-digit     = octal-digit / "8" / "9"
   hexadecimal-digit = decimal-digit / %x61-66
   open-bracket      = "("
   close-bracket     = ")"
   open-brace   = "{"
   close-brace  = "}"
   equals-sign  = "="
   underscore   = "_"
   comma        = ","
   semi-colon   = ";"
   single-quote = "'"

A.3 Literals

   decimal-literal     = 1*decimal-digit
   binary-literal      = "0".b 1*binary-digit
   octal-literal       = "0".o 1*octal-digit
   hexadecimal-literal = "0".x 1*hexadecimal-digit
   numeric-literal     = decimal-literal / binary-literal /
                         octal-literal / hexadecimal-literal
   boolean-literal     = t.r.u.e / f.a.l.s.e

A.4 Identifiers

   lower-case-identifier = (lower-case-letter
                            *(lower-case-letter / decimal-digit /
                              underscore))
      ; The original EBNF had "- reserved-word" here, meaning
      ; "except reserved words", but ABNF has no equivalent
      ; construct. Notwithstanding this fact, any automated tool
      ; should enforce the reservation of reserved words in this
      ; fashion.

   upper-case-identifier = upper-case-letter
                           *(upper-case-letter / decimal-digit /
                             underscore)

A.5 Operators

   exponential-operator    = "^"
   multiplicative-operator = "*" / "/"
   additive-operator       = "+" / "-"
   unary-minus             = "-"

A.6 Expressions

   parenthesised-expression  = open-bracket arithmetic-expression
                               close-bracket
   primitive-expression      = numeric-literal / constant-name /
                               field-attribute /
                               parenthesised-expression /
                               (unary-minus primitive-expression)
   exponential-expression    = primitive-expression
                               *(exponential-operator
                                 primitive-expression)
   multiplicative-expression = exponential-expression
                               *(multiplicative-operator
                                 exponential-expression)
   additive-expression       = multiplicative-expression
                               *(additive-operator
                                 multiplicative-expression)
   arithmetic-expression     = additive-expression

A.7 Constants

   constant-name       = upper-case-identifier
   constant-value      = constant-name / expression
   constant-definition = constant-name equals-sign constant-value

A.8 Field Names

   field-name           = lower-case-identifier
   annotated-field-name = field-name [ "[" constant "]" ]

A.9 Attributes

   attribute-category   = (c.o.m.p) / (u.n.c.o.m.p)
   attribute-name       = (l.e.n.g.t.h) / (v.a.l.u.e) /
                          (h.d.r.underscore.s.t.a.r.t)
   attribute-identifier = attribute-category underscore attribute-name
   field-attribute      = field-name ":" attribute-identifier
A.10  Encoding Methods

   primitive-encoding-method-name =
        (c.o.m.p.r.e.s.s.e.d.underscore.v.a.l.u.e) /
        (i.r.r.e.g.u.l.a.r) /
        (l.s.b) /
        (s.t.a.t.i.c) /
        (u.n.c.o.m.p.r.e.s.s.e.d.underscore.v.a.l.u.e)

   uncompressed-value-shorthand = single-quote *binary-digit
                                  single-quote

   external-encoding-method-name = underscore lower-case-identifier

   non-primitive-encoding-method-name = structure-name /
                                        external-encoding-method-name

   encoding-method-parameter-list = open-bracket
                                    arithmetic-expression
                                    *(comma arithmetic-expression)
                                    close-bracket

   encoding-method = uncompressed-value-shorthand /
                     (encoding-method-name
                      [encoding-method-parameter-list])

   field-encoding = field-name "::=" encoding-method

A.11  Structures

   structure-name = lower-case-identifier

   field-order-list = [ annotated-field-name
                        *(comma annotated-field-name) ]

   field-encodings-list = open-brace
                          *(field-encoding semi-colon)
                          close-brace

   uncompressed-format-prefix =
        (u.n.c.o.m.p.r.e.s.s.e.d.underscore.f.o.r.m.a.t)

   uncompressed-format = uncompressed-format-prefix
                         [underscore lower-case-identifier]
                         equals-sign field-order-list semi-colon

   compressed-format-prefix =
        (c.o.m.p.r.e.s.s.e.d.underscore.f.o.r.m.a.t)

   compressed-format = compressed-format-prefix
                       [underscore lower-case-identifier]
                       equals-sign field-order-list
                       field-encodings-list semi-colon

   default-methods-id = (d.e.f.a.u.l.t.underscore.m.e.t.h.o.d.s)

   default-methods = default-methods-id equals-sign
                     field-encodings-list semi-colon

   uncompressed-format-list = *uncompressed-format

   compressed-format-list = 1*compressed-format

   structure-body = open-brace
                    uncompressed-format-list
                    [default-methods]
                    compressed-format-list
                    close-brace

   structure-definition = structure-name "===" structure-body
                          semi-colon

   struct-reserved-words = uncompressed-format-prefix /
                           compressed-format-prefix /
                           default-methods-id

Appendix B.  Bit-level Worked Example

   This section gives a worked example at the bit level, showing how a
   simple profile describes the compression of real data from an
   imaginary protocol header.  The example has been kept fairly
   simple, whilst still aiming to illustrate some of the intricacies
   that arise in use of the notation.  In particular, fields have been
   kept short so that the binary representation of the headers can be
   read by eye without too much difficulty.

B.1  Example Packet Format

   Our imaginary header is just 16 bits long, and consists of the
   following fields:

   1.  version number - 2 bits

   2.  type - 2 bits

   3.  flow id - 4 bits

   4.  sequence number - 4 bits

   5.  flag bits - 4 bits

   So for example 0101000100010000 indicates a packet with a version
   number of one, a type of one, a flow id of one, a sequence number of
   one, and all flag bits set to zero.

B.2  Initial Encoding

   An initial definition based solely on the above information is:

   eg_header ===
   {
     uc_format = version_no,     % [ 2 ]
                 type,           % [ 2 ]
                 flow_id,        % [ 4 ]
                 sequence_no,    % [ 4 ]
                 flag_bits;      % [ 4 ]

     co_format_initial = version_no,     % [ 2 ]
                         type,           % [ 2 ]
                         flow_id,        % [ 4 ]
                         sequence_no,    % [ 4 ]
                         flag_bits       % [ 4 ]
     {
       version_no  ::= irregular(2);
       type        ::= irregular(2);
       flow_id     ::= irregular(4);
       sequence_no ::= irregular(4);
       flag_bits   ::= irregular(4);
     };
   };

   This defines the packet format nicely, but doesn't actually offer
   any compression.  If we use it to encode the above header, we get:

   Uncompressed header: 0101000100010000
   Compressed header:   0101000100010000

   This is because we have stated that all fields are irregular - i.e.
   we haven't specified anything about their behaviour.

B.3  Basic Compression

   In order to achieve any compression we need to notate more
   knowledge about the header and its behaviour in a flow.  For
   example, we may know the following facts about the header:

   1.  version number - indicates which version of the protocol this
       is; always one for this version of the protocol.

   2.  type - may take any value.

   3.  flow id - may take any value.

   4.  sequence number - may take any value.

   5.  flag bits - contains three flags, a, b and c, each of which may
       be set or clear, and a reserved flag bit, which is always clear
       (i.e. zero).

   We could notate this knowledge as follows:

   eg_header ===
   {
     uc_format = version_no,     % [ 2 ]
                 type,           % [ 2 ]
                 flow_id,        % [ 4 ]
                 sequence_no,    % [ 4 ]
                 abc_flag_bits,  % [ 3 ]
                 reserved_flag;  % [ 1 ]

     co_format_basic = version_no,     % [ 0 ]
                       type,           % [ 2 ]
                       flow_id,        % [ 4 ]
                       sequence_no,    % [ 4 ]
                       abc_flag_bits,  % [ 3 ]
                       reserved_flag   % [ 0 ]
     {
       version_no    ::= uncompressed_value(2,1);
       type          ::= irregular(2);
       flow_id       ::= irregular(4);
       sequence_no   ::= irregular(4);
       abc_flag_bits ::= irregular(3);
       reserved_flag ::= uncompressed_value(1,0);
     };
   };

   Using this simple scheme, we have successfully encoded the fact that
   the version number has a permanently fixed value of one, and
   therefore contains no useful information.  We have also encoded the
   fact that the final flag bit is always zero, which again contains no
   useful information.  Both of these facts have been notated using the
   uncompressed_value encoding method (see Section 4.7.1).

   Note that we could have omitted the "0 bits" fields from the field
   order list of the compressed format if we wished, since the only
   purpose of that list is to indicate the order of fields in the
   compressed header; zero-bit fields don't actually appear and so can
   be left out.

   Using this new encoding on the above header, we get:

   Uncompressed header: 0101000100010000
   Compressed header:   0100010001000

   This reduces the amount of data we need to transmit by roughly 20%
   (13 bits instead of 16).

   However, this encoding fails to take advantage of relationships
   between values of a field in one packet and its value in subsequent
   packets.
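   The effect of co_format_basic can be reproduced with a few lines of
   Python.  This is only a sketch for checking the example by hand,
   not an implementation of the notation: the table of field widths
   and the names split_fields and compress_basic are assumptions made
   for this illustration, and headers are handled as strings of "0"
   and "1" characters.

```python
# Field widths of the uncompressed header, in order (Section B.1, with
# the flag bits split into abc_flag_bits and reserved_flag as in B.3).
UNCOMPRESSED_FIELDS = [
    ("version_no", 2),
    ("type", 2),
    ("flow_id", 4),
    ("sequence_no", 4),
    ("abc_flag_bits", 3),
    ("reserved_flag", 1),
]

# Fields notated with uncompressed_value(...): their value is known in
# advance, so they occupy zero bits in the compressed header.
FIXED_FIELDS = {"version_no": "01", "reserved_flag": "0"}


def split_fields(header):
    """Cut a 16-character bit string into named fields."""
    fields, pos = {}, 0
    for name, width in UNCOMPRESSED_FIELDS:
        fields[name] = header[pos:pos + width]
        pos += width
    return fields


def compress_basic(header):
    """Drop the fixed fields and send every other field unchanged."""
    fields = split_fields(header)
    for name, expected in FIXED_FIELDS.items():
        if fields[name] != expected:
            raise ValueError("%s must be %s" % (name, expected))
    return "".join(bits for name, bits in fields.items()
                   if name not in FIXED_FIELDS)


if __name__ == "__main__":
    for header in ("0101000100010000", "0101000101000000"):
        print(header, "->", compress_basic(header))
```

   Because every remaining field is sent as irregular, the output is
   always 13 bits long, whatever headers came before.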
   For example, every header in the following sequence is compressed
   by the same amount despite the similarities between them:

   Uncompressed header: 0101000100010000
   Compressed header:   0100010001000

   Uncompressed header: 0101000101000000
   Compressed header:   0100010100000

   Uncompressed header: 0111000101110000
   Compressed header:   1100010111000

B.4  Inter-packet compression

   The profile we have defined so far has not compressed the sequence
   number or flow ID fields at all, since they can take any value.
   However, the value of these fields in one header has a very simple
   relationship to their value in previous headers:

   o  the sequence number is unusual: it increases by three each time;

   o  the flow_id stays the same: it always has the same value that it
      did in the previous header in the flow;

   o  the abc_flag_bits stay the same most of the time: they usually
      have the same value that they did in the previous header in the
      flow.

   An obvious way of notating this is as follows:

   % This obvious encoding will not work (correct encoding below)
   eg_header ===
   {
     uc_format = version_no,     % [ 2 ]
                 type,           % [ 2 ]
                 flow_id,        % [ 4 ]
                 sequence_no,    % [ 4 ]
                 abc_flag_bits,  % [ 3 ]
                 reserved_flag;  % [ 1 ]

     co_format_obvious = type,           % [ 2 ]
                         abc_flag_bits   % [ 3 ]
     {
       version_no    ::= uncompressed_value(2,1);
       type          ::= irregular(2);
       flow_id       ::= static;
       sequence_no   ::= lsb(0,-3);
       abc_flag_bits ::= irregular(3);
       reserved_flag ::= uncompressed_value(1,0);
     };
   };

   This dependency on previous packets is notated using the static and
   LSB encoding methods (see Section 4.7.4 and Section 4.7.5
   respectively).  However, there are a few problems with the above
   notation.

   Firstly, and most importantly, the flow_id field is notated as
   "static", which means that it doesn't change from packet to packet.
   However, the notation does not indicate how to communicate the value
   of the field initially.  It's all very well saying "it's the same
   value as last time", but there must have been a first time where we
   define what that value is, so that it can be referred back to.  The
   above notation provides no way of communicating that.  Similarly
   with the sequence number - there needs to be a way of communicating
   its initial value.

   Secondly, the sequence number field is communicated very efficiently
   in zero bits, but it is not at all robust against packet loss.  If a
   packet is lost then there is no way to handle the missing sequence
   number.

   Finally, although the flag bits are usually the same as in the
   previous header in the flow, the profile doesn't make any use of
   this fact; since they are sometimes not the same as those in the
   previous header, it is not safe to say that they are always the
   same, so static encoding can't be used on its own.

   We solve all three of these problems below, robustness first, since
   it is simplest.

   When communicating sequence numbers, or any other field encoded
   with LSB encoding, a very important consideration for the notator
   is how robust against packet loss the compressed protocol should be.
   This will vary a lot from protocol stack to protocol stack.  For
   example RTP has a high setup cost, so the compressed stream needs to
   be robust against fairly high packet loss.  Things are different
   with TCP, where robustness to loss of just a few packets is
   sufficient.  For the example protocol we'll assume short, low
   overhead flows and say we need to be robust to the loss of just one
   packet, which we can achieve with two bits of LSB encoding (one bit
   isn't enough since the sequence number increases by three each
   time - see Section 4.7.5).
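   The claim that two bits are needed can be checked with a small
   Python sketch.  It assumes the interpretation interval used by LSB
   encoding in RFC 3095, in which lsb(k, p) lets the decompressor
   recover any value from v_ref - p up to v_ref - p + 2^k - 1, where
   v_ref is the last value it decoded; Section 4.7.5 remains the
   normative definition.  The function names, the reference value and
   the 4-bit wrap-around handling are choices made for this example.

```python
def interpretation_interval(ref, num_lsbs, offset, width=4):
    """Values the decompressor can recover, given reference value ref.

    Assumed interval: ref - offset .. ref - offset + 2**num_lsbs - 1,
    taken modulo 2**width because the field is only width bits wide.
    """
    low = ref - offset
    return [(low + i) % (1 << width) for i in range(1 << num_lsbs)]


def survives_loss(num_lsbs, lost, step=3, offset=-3, ref=1, width=4):
    """True if the next received value is still decodable after
    `lost` consecutive packets went missing."""
    value = (ref + step * (lost + 1)) % (1 << width)
    return value in interpretation_interval(ref, num_lsbs, offset, width)


if __name__ == "__main__":
    for bits in (0, 1, 2):
        print(bits, "bit(s):",
              "no loss ok" if survives_loss(bits, 0) else "no loss FAILS",
              "/",
              "one loss ok" if survives_loss(bits, 1) else "one loss FAILS")
```

   Under this assumption, zero bits and one bit both cope with an
   uninterrupted flow but fail after a single loss, because the value
   six above the reference falls outside the interval; with two bits
   the interval spans four values and the lost packet is tolerated.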
   To communicate initial values for the sequence number and flow ID
   fields, and to take advantage of the fact that the flag bits are
   usually the same as in the previous header, we need to depart from
   the single packet format encoding we are currently using and
   instead use multiple packet formats:

   eg_header ===
   {
     uc_format = version_no,     % [ 2 ]
                 type,           % [ 2 ]
                 flow_id,        % [ 4 ]
                 sequence_no,    % [ 4 ]
                 abc_flag_bits,  % [ 3 ]
                 reserved_flag;  % [ 1 ]

     co_format_irregular = discriminator,  % [ 1 ]
                           type,           % [ 2 ]
                           flow_id,        % [ 4 ]
                           sequence_no,    % [ 4 ]
                           abc_flag_bits   % [ 3 ]
     {
       discriminator ::= '0';
       version_no    ::= uncompressed_value(2,1);
       type          ::= irregular(2);
       flow_id       ::= irregular(4);
       sequence_no   ::= irregular(4);
       abc_flag_bits ::= irregular(3);
       reserved_flag ::= uncompressed_value(1,0);
     };

     co_format_compressed = discriminator,  % [ 1 ]
                            type,           % [ 2 ]
                            sequence_no     % [ 2 ]
     {
       discriminator ::= '1';
       version_no    ::= uncompressed_value(2,1);
       type          ::= irregular(2);
       flow_id       ::= static;
       sequence_no   ::= lsb(2,-3);
       abc_flag_bits ::= static;
       reserved_flag ::= uncompressed_value(1,0);
     };
   };

   Note that we have had to add a discriminator field, so that the
   decompressor can tell which packet format has been used by the
   compressor.  The format with a static flow ID and LSB encoded
   sequence number is now 5 bits long, a saving of over 60% on the size
   of the single packet format, and almost a 70% saving on the size of
   the uncompressed header.  Note that despite having to add the
   discriminator field, this format is still the same size as the
   original incorrect naive notation, because this notation takes
   advantage of the fact that the abc flag bits rarely change.

   However, the original packet format (with an irregular flow ID and
   sequence number) has also grown by one bit due to the addition of
   the discriminator.
   An important consideration when creating multiple packet formats is
   whether the extra format occurs frequently enough that the average
   compressed header length is shorter as a result.  For example, if in
   fact the flag bits always changed between packet headers, the static
   encoding could never be used; all we would have achieved is to
   lengthen the irregular packet format by one bit.

   Using the above notation, we now get:

   Uncompressed header: 0101000100010000
   Compressed header:   00100010001000

   Uncompressed header: 0101000101000000
   Compressed header:   10100 ; 00100010100000

   Uncompressed header: 0111000101110000
   Compressed header:   11100 ; 01100010111000

   The first header in the stream is compressed the same way as before,
   except that it now has the extra 1 bit discriminator at the start
   (0).  When a second header arrives, with the same flow ID as the
   first and its sequence number three higher, it can now be compressed
   in two possible ways, either using co_format_compressed or in the
   same way as previously, using co_format_irregular.

   Note that we show all possible encodings of a packet as defined by a
   given profile, separated by semi-colons.  Either of the above
   encodings for the packet could be produced by a valid
   implementation, although a good implementation would always aim to
   make the compressed size as small as possible and an optimum
   implementation would pick the encoding which led to the best
   compression of the packet stream (which is not necessarily the
   smallest encoding for a particular packet).

B.5  Variable Length Discriminators

   Suppose we do some analysis on flows of our example protocol and
   discover that whilst it is usual for successive packets to have the
   same flags, on the occasions when they don't, the packet is almost
   always a "flags set" packet, in which all three of the abc flags are
   set.  To encode the flow more efficiently a packet format needs to
   be written to reflect this.
   This now gives a total of three packet formats, which means we need
   three discriminators to differentiate between them.  The obvious
   solution here is to increase the number of bits in the discriminator
   from one to two and for example use discriminators 00, 01, and 10.
   However, we can do slightly better than this.

   Any uniquely identifiable discriminator will suffice, so we can use
   00, 01 and 1.  If the discriminator starts with 1, that's the whole
   thing.  If it starts with 0, the decompressor knows it has to check
   one more bit to determine the packet kind.

   Note that care must be taken when using variable length
   discriminators.  For example, it would be erroneous to use 0, 01 and
   10 as discriminators, since after reading an initial 0 the
   decompressor would have no way of knowing if the next bit was a
   second bit of discriminator, or the first bit of the next field in
   the packet stream.  However, 0, 10 and 11 would be OK, as the first
   bit again indicates whether or not there are further discriminator
   bits to follow.

   This gives us the following:

   eg_header ===
   {
     uc_format = version_no,     % [ 2 ]
                 type,           % [ 2 ]
                 flow_id,        % [ 4 ]
                 sequence_no,    % [ 4 ]
                 abc_flag_bits,  % [ 3 ]
                 reserved_flag;  % [ 1 ]

     co_format_irregular = discriminator,  % [ 2 ]
                           type,           % [ 2 ]
                           flow_id,        % [ 4 ]
                           sequence_no,    % [ 4 ]
                           abc_flag_bits   % [ 3 ]
     {
       discriminator ::= '00';
       version_no    ::= uncompressed_value(2,1);
       type          ::= irregular(2);
       flow_id       ::= irregular(4);
       sequence_no   ::= irregular(4);
       abc_flag_bits ::= irregular(3);
       reserved_flag ::= uncompressed_value(1,0);
     };

     co_format_flags_set = discriminator,  % [ 2 ]
                           type,           % [ 2 ]
                           sequence_no     % [ 2 ]
     {
       discriminator ::= '01';
       version_no    ::= uncompressed_value(2,1);
       type          ::= irregular(2);
       flow_id       ::= static;
       sequence_no   ::= lsb(2,-3);
       abc_flag_bits ::= uncompressed_value(3,7);
       reserved_flag ::= uncompressed_value(1,0);
     };

     co_format_flags_static = discriminator,  % [ 1 ]
                              type,           % [ 2 ]
                              sequence_no     % [ 2 ]
     {
       discriminator ::= '1';
       version_no    ::= uncompressed_value(2,1);
       type          ::= irregular(2);
       flow_id       ::= static;
       sequence_no   ::= lsb(2,-3);
       abc_flag_bits ::= static;
       reserved_flag ::= uncompressed_value(1,0);
     };
   };

   Here is some example output:

   Uncompressed header: 0101000100010000
   Compressed header:   000100010001000

   Uncompressed header: 0101000101000000
   Compressed header:   10100 ; 000100010100000

   Uncompressed header: 0111000101110000
   Compressed header:   11100 ; 001100010111000

   Uncompressed header: 0111000110101110
   Compressed header:   011100 ; 001100011010111

   Here we have a very similar sequence to last time, except that there
   is now an extra message on the end which has the flag bits set.  The
   encoding for the first message in the stream is now one bit larger,
   while the encoding for the next two messages is the same as before,
   since that packet format has not grown, thanks to the use of
   variable length discriminators.  Finally, the packet that comes
   through with all the flag bits set can be encoded in just six bits,
   only one bit more than the most common packet format.  Without the
   extra packet format, this last packet would have to be encoded using
   the longest packet format and would have taken up 14 bits.  This
   represents a saving of almost 60% for this kind of packet.

B.6  Default encoding

   There is some redundancy in the notation used to define the profile,
   in that the same encoding method is used for the same fields several
   times in different formats, but the field is redefined explicitly
   each time.
   If the encoding for any of these fields changed in the future (e.g.
   if the reserved flag became permanently set to 1 instead of 0), then
   every packet format would have to be modified to reflect this
   change.

   This problem can be avoided by specifying a default encoding for
   these fields, which also leads to a more concisely notated profile:

   eg_header ===
   {
     uc_format = version_no,     % [ 2 ]
                 type,           % [ 2 ]
                 flow_id,        % [ 4 ]
                 sequence_no,    % [ 4 ]
                 abc_flag_bits,  % [ 3 ]
                 reserved_flag;  % [ 1 ]

     default_methods =
     {
       version_no    ::= uncompressed_value(2,1);
       type          ::= irregular(2);
       flow_id       ::= static;
       sequence_no   ::= lsb(2,-3);
       reserved_flag ::= uncompressed_value(1,0);
     };

     co_format_irregular = discriminator,  % [ 2 ]
                           type,           % [ 2 ]
                           flow_id,        % [ 4 ]
                           sequence_no,    % [ 4 ]
                           abc_flag_bits   % [ 3 ]
     {
       discriminator ::= '00';
       flow_id       ::= irregular(4);  % overrides default
       sequence_no   ::= irregular(4);  % overrides default
       abc_flag_bits ::= irregular(3);
     };

     co_format_flags_set = discriminator,  % [ 2 ]
                           type,           % [ 2 ]
                           sequence_no     % [ 2 ]
     {
       discriminator ::= '01';
       abc_flag_bits ::= uncompressed_value(3,7);
     };

     co_format_flags_static = discriminator,  % [ 1 ]
                              type,           % [ 2 ]
                              sequence_no     % [ 2 ]
     {
       discriminator ::= '1';
       abc_flag_bits ::= static;
     };
   };

   The above profile behaves in exactly the same way as the one notated
   previously, since it has the same meaning.  Note that the purposes
   behind the different formats become clearer with the default
   encoding methods factored out; all that remains are the encodings
   which are relevant to each specific format.  Note also that default
   encoding methods which compress down to zero bits have become
   completely implicit.  For example, none of the compressed formats
   mentions "version_no" explicitly, neither in the field order list
   (no need, it's zero bits long) nor in the field encodings list (no
   need, it's specified in the default encoding methods).
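   One way to picture how the defaults combine with a packet format is
   as a dictionary merge.  The Python sketch below is an informal
   model only: it assumes, as the "% overrides default" comments
   suggest, that an encoding given inside a format takes precedence
   over the entry in default_methods.  The dictionaries simply hold the
   encoding method text from the profile above, and the function name
   effective_encodings is an assumption made for this example.

```python
# Encodings copied as text from default_methods in the profile above.
DEFAULT_METHODS = {
    "version_no": "uncompressed_value(2,1)",
    "type": "irregular(2)",
    "flow_id": "static",
    "sequence_no": "lsb(2,-3)",
    "reserved_flag": "uncompressed_value(1,0)",
}

# Encodings listed explicitly inside two of the compressed formats.
CO_FORMAT_IRREGULAR = {
    "discriminator": "'00'",
    "flow_id": "irregular(4)",
    "sequence_no": "irregular(4)",
    "abc_flag_bits": "irregular(3)",
}

CO_FORMAT_FLAGS_STATIC = {
    "discriminator": "'1'",
    "abc_flag_bits": "static",
}


def effective_encodings(defaults, format_encodings):
    """Start from the defaults, then let the format's own entries win."""
    merged = dict(defaults)
    merged.update(format_encodings)
    return merged


if __name__ == "__main__":
    merged = effective_encodings(DEFAULT_METHODS, CO_FORMAT_IRREGULAR)
    for field in sorted(merged):
        print(field, "::=", merged[field])
```

   Merging co_format_irregular in this way yields irregular(4) for
   flow_id and sequence_no, while co_format_flags_static keeps the
   default static and lsb(2,-3) encodings for those two fields.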
B.7  Control Fields

   One inefficiency in the compression scheme we have produced thus far
   is that it uses two bits to provide the LSB encoded sequence number
   with robustness for the loss of just one packet.  In theory only one
   bit should be needed.  The root of the problem is the unusual
   sequence number that the protocol uses - it counts up in increments
   of three.  In order to encode it at maximum efficiency we need to
   translate this into a field that increments by one each time.  We do
   this using a control field.

   Control fields are extra data that are communicated in the
   compressed packet, which are not direct encodings of fields in the
   uncompressed header.  They can be used to communicate extra
   information in the compressed packet, which allows other fields to
   be compressed more efficiently.

   The control field which we introduce scales the sequence number down
   by a factor of three.  Instead of encoding the original sequence
   number in the compressed packet, we encode the scaled sequence
   number, allowing us to have robustness to the loss of one packet by
   using just one bit of LSB encoding:

   eg_header ===
   {
     uc_format = version_no,     % [ 2 ]
                 type,           % [ 2 ]
                 flow_id,        % [ 4 ]
                 sequence_no,    % [ 4 ]
                 abc_flag_bits,  % [ 3 ]
                 reserved_flag;  % [ 1 ]

     control_fields = scaled_seq_no;

     default_methods =
     {
       version_no    ::= uncompressed_value(2,1);
       type          ::= irregular(2);
       flow_id       ::= static;
       reserved_flag ::= uncompressed_value(1,0);

       % need modulo maths to calculate scaling correctly,
       % due to 4 bit wrap around
       let(scaled_seq_no:uncomp_value ==
           ((mod(15 - sequence_no:uncomp_value, 3) * 16 +
             sequence_no:uncomp_value) / 3));
       scaled_seq_no ::= lsb(1,-1);
     };

     co_format_irregular = discriminator,  % [ 2 ]
                           type,           % [ 2 ]
                           flow_id,        % [ 4 ]
                           scaled_seq_no,  % [ 4 ]
                           abc_flag_bits   % [ 3 ]
     {
       discriminator ::= '00';
       flow_id       ::= irregular(4);  % overrides default
       scaled_seq_no ::= irregular(4);  % overrides default
       abc_flag_bits ::= irregular(3);
     };

     co_format_flags_set = discriminator,  % [ 2 ]
                           type,           % [ 2 ]
                           scaled_seq_no   % [ 1 ]
     {
       discriminator ::= '01';
       abc_flag_bits ::= uncompressed_value(3,7);
     };

     co_format_flags_static = discriminator,  % [ 1 ]
                              type,           % [ 2 ]
                              scaled_seq_no   % [ 1 ]
     {
       discriminator ::= '1';
       abc_flag_bits ::= static;
     };
   };

   Here is some example output:

   Uncompressed header: 0101000100010000
   Compressed header:   000100010001000

   Uncompressed header: 0101000101000000
   Compressed header:   1010 ; 000100010100000

   Uncompressed header: 0111000101110000
   Compressed header:   1110 ; 001100010111000

   Uncompressed header: 0111000110101110
   Compressed header:   01110 ; 001100011010111

   In its final form, we see that this gives us a saving of a further
   bit in most packets, reducing the average size of the flow by around
   20%.
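   The scaling expression in the let statement can be checked
   numerically.  The Python sketch below is an illustration only: the
   function name scaled_seq_no mirrors the control field, and Python's
   % and // operators are assumed to match the mod and / of the
   notation for the non-negative values that occur here.

```python
def scaled_seq_no(sequence_no):
    """Evaluate the let expression for a 4-bit sequence number."""
    # mod(15 - sequence_no, 3) counts how many times the sequence
    # number has wrapped past 16 within one full cycle of 16 packets.
    wraps = (15 - sequence_no) % 3
    # Undo the wrap-around, then divide by the step size of three.
    return (wraps * 16 + sequence_no) // 3


def walk(start=0, count=16, step=3, width=4):
    """Scaled values seen as the sequence number counts up in threes."""
    values, seq = [], start
    for _ in range(count):
        values.append(scaled_seq_no(seq))
        seq = (seq + step) % (1 << width)
    return values


if __name__ == "__main__":
    print(walk())
```

   Starting from a sequence number of zero, the sixteen successive
   sequence numbers map onto the scaled values 0 to 15 in order, so the
   scaled field increments by one per packet and wraps cleanly within
   four bits.  The sequence numbers 1, 4, 7 and 10 used in the example
   headers map to 11, 12, 13 and 14.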
Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2005).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.