Robust Header Compression                                     R. Finking
Internet-Draft                                        Siemens/Roke Manor
Expires: October 1, 2005                                      C. Bormann
                                                 Universitaet Bremen TZI
                                                            G. Pelletier
                                                             Ericsson AB
                                                          March 30, 2005


        Formal Notation for Robust Header Compression (ROHC-FN)
                 draft-ietf-rohc-formal-notation-07.txt

Status of this Memo

   This document is an Internet-Draft and is subject to all provisions
   of Section 3 of RFC 3667.  By submitting this Internet-Draft, each
   author represents that any applicable patent or other IPR claims of
   which he or she is aware have been or will be disclosed, and any of
   which he or she becomes aware will be disclosed, in accordance with
   RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on October 1, 2005.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   This document defines ROHC-FN: a formal notation to unambiguously
   specify header compression field encodings, when defining new
   profiles within the ROHC (RFC 3095) framework.  ROHC-FN offers a


Finking, et al.          Expires October 1, 2005                [Page 1]

Internet-Draft                   ROHC-FN                      March 2005


   library of encoding methods that are often used in ROHC profiles, and
   can thereby help simplify future profile development work.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Overview of ROHC-FN
     3.1   Scope of ROHC-FN
     3.2   Fundamentals of ROHC-FN
       3.2.1   Fields and Encodings
       3.2.2   Structures
     3.3   Example using IPv4
   4.  Normative Definition of ROHC-FN
     4.1   Overall Structure of a Specification
     4.2   Constant Definitions
     4.3   Attributes
       4.3.1   Attribute References
     4.4   Expressions
       4.4.1   Integer Literals
       4.4.2   Boolean Literals
       4.4.3   Boolean Operators
       4.4.4   Integer Operators
       4.4.5   Comparison Operators
     4.5   Comments
     4.6   Encoding Methods Library
       4.6.1   uncompressed_value
       4.6.2   compressed_value
       4.6.3   irregular
       4.6.4   static
       4.6.5   lsb
       4.6.6   crc
     4.7   let Statements
     4.8   Profile-specific Encoding Methods
     4.9   Structures
       4.9.1   "this"
       4.9.2   Simple Structures
       4.9.3   Arguments and Structures
       4.9.4   Multiple Formats
       4.9.5   Control Fields
   5.  Security considerations
   6.  Contributors
   7.  Acknowledgements
   8.  References
     8.1   Normative References
     8.2   Informative References
       Authors' Addresses
   A.  Syntax
     A.1   Reserved Keywords
     A.2   Characters
     A.3   Literals
     A.4   Identifiers
     A.5   Operators
     A.6   Expressions
     A.7   Constants
     A.8   Field Names
     A.9   Attributes
     A.10  Encoding Methods
     A.11  Structures
   B.  Bit-level Worked Example
     B.1   Example Packet Format
     B.2   Initial Encoding
     B.3   Basic Compression
     B.4   Inter-packet compression
     B.5   Multiple Packet Formats
     B.6   Variable Length Discriminators
     B.7   Default encoding
     B.8   Control Fields
       Intellectual Property and Copyright Statements


1.  Introduction

   ROHC-FN is a formal notation designed to help with the definition of
   ROHC [RFC3095] header compression profiles.  ROHC-FN offers a library
   of encoding methods that are often used in ROHC profiles, so new
   profiles can be specified without the need to redefine this library
   from scratch.

   Informally, an encoding method is a function that maps between
   uncompressed data and compressed data.  The simplest encoding methods
   only have one input and one output: the input is an uncompressed
   field and the output is the compressed version of the field.  More
   complex encoding methods can compress multiple fields at the same
   time, e.g.  "list" encoding from [RFC3095], which is designed to
   compress an ordered list of fields.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   o  Profile

      A ROHC [RFC3095] profile is a description of how to compress a
      certain protocol stack over a certain type of link.  Each profile
      is built up of packet formats (defining the bits on the wire)
      along with a set of rules that control compressor and decompressor
      behavior.

   o  Field

      With ROHC-FN, the protocol header to be compressed is divided into
      a set of contiguous bit patterns known as fields.  Note that the
      profile designer decides how the header is divided into fields;
      this division need not be identical to the one given by the
      specification(s) for the protocol header being compressed.

   o  Control field

      Control fields are transmitted from a ROHC compressor to a ROHC
      decompressor, but are not part of the uncompressed header itself.

   o  Encoding method

      Encoding methods are functions that can be applied to compress
      fields in a protocol header.


   o  Library of encoding methods

      The library of encoding methods contains a number of commonly used
      encoding methods for compressing header fields.

3.  Overview of ROHC-FN

   This section gives an overview of ROHC-FN.  It also explains how
   ROHC-FN can be used to specify the compression of header fields as
   part of a ROHC profile.

3.1  Scope of ROHC-FN

   This section describes the scope of ROHC-FN.  It explains how the
   formal notation relates to the ROHC framework and to specific ROHC
   profiles.

   The ROHC framework provides the general principles for performing
   ROHC compression.  It defines the concept of a profile, which makes
   ROHC a general platform for different compression schemes.  It sets
   link layer requirements, and in particular negotiation requirements
   for all ROHC profiles.  It defines a set of common functions such as
   Context Identifiers (CIDs), padding and segmentation.  It also
   defines common packet formats (IR, IR-DYN, Feedback, Short-CID
   expander, etc.), and finally it defines a generic,
   profile-independent feedback mechanism.

   A ROHC profile is a description of how to compress a certain protocol
   stack over a certain type of link.  For example, ROHC profiles are
   available for RTP/UDP/IP and many other protocol stacks.

   At a high level, each ROHC profile is built up of packet formats
   (defining the bits on the wire) along with a set of rules that
   control compressor and decompressor behavior.  The purpose of the
   packet formats is to define how to compress and decompress headers.
   The packet formats define one or more compressed versions of each
   uncompressed header; inversely, the packet formats define how to
   relate a compressed header back to the original uncompressed header.

   The packet formats will typically define compression of headers
   relative to a context of field values from previous headers in a
   flow, improving the overall compression by taking into account
   redundancies between headers of successive packets.  Therefore, in
   addition to the packet formats, a profile has to specify how to
   manage these contexts at the compressor and the decompressor, define
   when and what to send in potential feedback messages from
   decompressor to compressor, outline compression strategy principles
   to make the profile robust against bit errors and dropped packets,
   etc.  All this is needed to ensure that the compressor and
   decompressor contexts are kept synchronised, while still
   facilitating the best possible compression performance.

   ROHC-FN is designed to help in the specification of the packet
   formats used in ROHC profiles.  It offers a library of encoding
   methods for compressing fields, and a mechanism for combining these
   encoding methods to create packet formats tailored to a specific
   protocol stack.  However, the scope of ROHC-FN is limited to
   specifying the packet formats; all the control logic for the profile
   behavior must be defined by other means to form a complete profile
   specification.

3.2  Fundamentals of ROHC-FN

   There are two fundamental elements to the formal notation:

   1.  Fields and their encodings, which define the mapping between a
       field's uncompressed and compressed values.
   2.  Structures, which define lists of uncompressed fields and the
       lists of compressed fields they map onto.

   These two fundamental elements are at the core of the notation and
   are outlined below.

3.2.1  Fields and Encodings

   The creation of bindings between fields and encoding methods is
   indicated as follows:

     field   ::=   encoding_method

   In the above statement, the symbol "::=" means "is encoded as".  The
   statement does not represent an assignment from the right-hand side
   to the left-hand side.  Instead, it is a two-way mapping: a single
   statement represents both the compression and the decompression
   operation, and variables take on values through a process of two-way
   matching.  Two-way matching is a binary operation that attempts to
   make the operands the same (similar to the unification process in
   logic).  The operands represent one unspecified data object, and
   values can be matched from either operand.

   Fields have attributes.  Attributes describe various properties of
   the field, including its length and where it appears in the header.
   For example:

     field:uncomp_length


   indicates how long this field is before it is compressed.

   See Section 4.3 for more details on field attributes.

   An encoding method (including the parameters specified with the
   method) creates a reversible binding between the attributes of a
   field.  At the compressor, a packet format can be used if a set of
   bindings that is successful for all fields can be found.  At the
   decompressor, the operation is reversed using the same bindings and
   the fields are filled according to the specified bindings.

   For example, the 'static' encoding method creates a binding between
   the attribute corresponding to the uncompressed value of the field
   and the attribute corresponding to the value of the field in the
   context.

   o  For the compressor, the 'static' binding is successful when both
      the context value and the uncompressed value are the same.  If the
      two values differ then the binding fails.
   o  For the decompressor, the 'static' binding succeeds for a packet
      type only if a valid context entry containing the value of the
      uncompressed field exists.  Otherwise, the binding will fail.
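
   As an informal illustration (not part of the notation), the two
   cases above can be sketched in Python.  The function names, the
   BindingFailed exception and the use of None to represent a missing
   context entry are assumptions made for this sketch only.

```python
from typing import Optional


class BindingFailed(Exception):
    """Raised when a binding cannot be satisfied for a packet format."""


def static_compress(uncomp_value: int, context_value: Optional[int]) -> None:
    # Compressor side: the binding succeeds only if the context already
    # holds the same value as the uncompressed field.  No bits are sent.
    if context_value is None or context_value != uncomp_value:
        raise BindingFailed("context and uncompressed value differ")


def static_decompress(context_value: Optional[int]) -> int:
    # Decompressor side: the field is filled in from the context, so a
    # valid context entry must exist.
    if context_value is None:
        raise BindingFailed("no valid context entry")
    return context_value
```

   In this sketch a failed binding simply means that the packet format
   using 'static' for that field cannot be chosen for the current
   header.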

3.2.2  Structures

   Structures provide a mechanism for combining fields and their
   encoding methods into larger units.  Structures are defined using the
   "===" symbol.  These can then be used as encoding methods in other
   structures:

     example_structure ===
     {
       uc_format = field_1,
                   field_2,
                        :
                        :
                   field_n;

       control_fields = ctrl_field_1,
                        ctrl_field_2,
                          :
                          :
                        ctrl_field_n;

       default_methods =
       {
         field_a        ::= encoding_method_9;
         field_e        ::= encoding_method_8;
           :                   :
           :                   :
         ctrl_field_3   ::= encoding_method_2;
       };

       co_format_0 = field_a,
                       :
                       :
                     field_b
       {
         field_a      ::= encoding_method_1;
            :                     :
            :                     :
         field_b      ::= encoding_method_2;
         ctrl_field_1 ::= encoding_method_3;
       };

       co_format_1 = field_c,
                       :
                       :
                     field_d
       {
         field_c      ::= encoding_method_4;
            :                     :
            :                     :
         field_d      ::= encoding_method_5;
       };
       :
       :
       co_format_n = field_y,
                       :
                       :
                     field_z
       {
         field_y      ::= encoding_method_foo;
            :                     :
            :                     :
         field_z      ::= encoding_method_bar;
       };
     };

   In the example above, the comma-separated list "uc_format" indicates
   the order of fields in the uncompressed header.  It is followed by
   another comma-separated list, "control_fields", which defines one or
   more control fields.  Finally, a number of packet formats for the
   compressed data follow, each beginning with the reserved prefix
   "co_format".  These also have a field order list, which consists of:


   o  fields that occur in the uncompressed header; or
   o  "control fields", which carry additional information added to the
      compressed packet during compression.

   In the example, packet formats defined by "co_format" also include a
   list of field encodings, which follows immediately after the
   corresponding field order list.  This is typical usage.  A
   "uc_format" may also include a field encodings list, though the one
   in the example does not.  The field encodings list, enclosed in
   braces, contains the encoding methods for the fields in the
   preceding field order list.  Fields that have no encoding method
   defined in the field encodings list are encoded using the default
   encodings specified in "default_methods" (see Section 4.9.4.3).

   Fields from the uncompressed header have the same name as they do in
   the compressed header.  If there are any fields which are present
   exclusively in the compressed header but which do have an
   uncompressed value, they must be declared in the "control_fields"
   section of the structure (see Section 4.9.5 for more details on
   defining control fields).  In the example above, all fields appearing
   in the compressed header are also found in the uncompressed field
   order list, or the control field list.  However, it is possible to
   have fields which appear in neither an uncompressed field order list
   nor the control field list.  Fields which have no "uncompressed"
   value, such as a checksum on the compressed header, fall into this
   category.

3.3  Example using IPv4

   This section gives an overview of how the notation is used by means
   of an example.  The example will develop the formal notation for an
   encoding method capable of compressing a single, well-known header:
   the IPv4 header.

   The first step is to specify the overall structure of the IPv4
   header.  To do this, we use a structure which we will call
   "ipv4_header".  Structures are defined in Section 4.9.  This is
   notated as follows:

     ipv4_header           ===
     {

   The statement above defines the encoding method "ipv4_header" as a
   structure, the definition of which follows the opening brace.

   Definitions within the pair of braces are local to "ipv4_header".
   This scoping mechanism helps to clarify which fields belong to which
   headers; it is also useful when compressing complex protocol stacks
   with several headers and fields, often sharing the same names.

   The next step is to specify the fields contained in the uncompressed
   IPv4 header.  This is accomplished using ROHC-FN as follows:

       uc_format   =   version,         %[  4 ]
                       header_length,   %[  4 ]
                       tos,             %[  6 ]
                       ecn,             %[  2 ]
                       length,          %[ 16 ]
                       id,              %[ 16 ]
                       reserved,        %[  1 ]
                       dont_frag,       %[  1 ]
                       more_fragments,  %[  1 ]
                       offset,          %[ 13 ]
                       ttl,             %[  8 ]
                       protocol,        %[  8 ]
                       checksum,        %[ 16 ]
                       src_addr,        %[ 32 ]
                       dest_addr;       %[ 32 ]

   The numbers in square brackets give the field width in bits.  Note
   that these are mere comments that do not have any formal meaning.
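
   As an informal illustration, the field order list above can be
   applied to a concrete 20-octet header.  The Python sketch below is
   not part of ROHC-FN; the names IPV4_UC_FORMAT and split_fields are
   hypothetical, and the widths are copied from the comments above.

```python
# Field order and widths (in bits), copied from the uc_format list above.
IPV4_UC_FORMAT = [
    ("version", 4), ("header_length", 4), ("tos", 6), ("ecn", 2),
    ("length", 16), ("id", 16), ("reserved", 1), ("dont_frag", 1),
    ("more_fragments", 1), ("offset", 13), ("ttl", 8), ("protocol", 8),
    ("checksum", 16), ("src_addr", 32), ("dest_addr", 32),
]


def split_fields(header: bytes, layout=IPV4_UC_FORMAT) -> dict:
    """Split a header into named unsigned integers, most-significant bit first."""
    total_bits = sum(width for _, width in layout)
    if len(header) * 8 != total_bits:
        raise ValueError("header length does not match the field layout")
    as_int = int.from_bytes(header, "big")
    fields, remaining = {}, total_bits
    for name, width in layout:
        remaining -= width  # bits still to the right of this field
        fields[name] = (as_int >> remaining) & ((1 << width) - 1)
    return fields


# Hypothetical sample header (checksum field left as zero).
sample = bytes.fromhex("45000054abcd400040010000c0a80001c0a80002")
sample_fields = split_fields(sample)
```

   The widths sum to 160 bits, i.e. the 20 octets of an IPv4 header
   without options.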

   The fields contained in the compressed header can then be specified.
   Exactly what appears in this list of fields depends on the encoding
   methods used to encode the uncompressed fields -- it may be possible
   to compress certain fields down to 0 bits, in which case they do not
   need to be sent in the compressed header at all.

       co_format  =  src_addr,        %[ 32 ]
                     dest_addr,       %[ 32 ]
                     length,          %[ 16 ]
                     id,              %[ 16 ]
                     ttl,             %[  8 ]
                     protocol,        %[  8 ]
                     tos,             %[  6 ]
                     ecn,             %[  2 ]
                     dont_frag        %[  1 ]
       {

   Note that the order of the fields in the compressed header is
   independent of the order of the fields in the uncompressed header.

   The next step is to specify the encoding methods for each field in
   the IPv4 header.  These are taken from encoding methods in the
   ROHC-FN library, as well as from additional encoding methods defined
   in the profile specification itself.  Since the intention here is to
   illustrate the use of the notation, rather than to describe the
   optimum method of compressing IPv4 headers, this example uses only
   three predefined encoding methods.

   The "uncompressed_value" encoding method (defined in Section 4.6.1)
   can compress any field whose uncompressed length and value are fixed.
   No compressed bits need to be sent because the uncompressed field can
   be reconstructed using its known size and value.  The
   "uncompressed_value" encoding method is used to compress five fields
   in the IPv4 header, as described below:

         version             ::=   uncompressed_value(4, 4);
         header_length       ::=   uncompressed_value(4, 5);
         reserved            ::=   uncompressed_value(1, 0);
         more_fragments      ::=   uncompressed_value(1, 0);
         offset              ::=   uncompressed_value(13, 0);

   The first parameter indicates the length of the uncompressed field in
   bits, and the second parameter gives its integer value.
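
   The following Python sketch shows one possible reading of this
   behaviour; the normative definition is the one in Section 4.6.1.
   The function names and the BindingFailed exception are assumptions
   made for illustration, and bit strings are modelled as Python
   strings of '0' and '1' characters.

```python
class BindingFailed(Exception):
    """Raised when a field does not match its encoding method."""


def uncompressed_value_compress(field_value: int, length: int, value: int) -> str:
    # The field must hold the fixed value, and that value must fit in
    # the fixed length; otherwise the packet format cannot be used.
    if field_value != value or not 0 <= value < (1 << length):
        raise BindingFailed("field does not hold the expected constant")
    return ""  # zero compressed bits are sent


def uncompressed_value_decompress(length: int, value: int) -> str:
    # The uncompressed bit string is rebuilt from the parameters alone.
    return format(value, "0{}b".format(length))
```

   For example, "uncompressed_value(4, 5)" for "header_length" would
   correspond to uncompressed_value_decompress(4, 5) in this sketch.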

   The "irregular" encoding method (defined in Section 4.6.3) can be
   used to encode any field whose length is fixed, or can be calculated
   using an expression.  It is a general encoding method that can be
   used for fields to which no other encoding method applies.  All of
   the bits in the uncompressed field are present in the compressed
   format as well; hence this encoding does not achieve any compression.

         tos                 ::=   irregular(6);
         ecn                 ::=   irregular(2);
         length              ::=   irregular(16);
         id                  ::=   irregular(16);
         dont_frag           ::=   irregular(1);
         ttl                 ::=   irregular(8);
         protocol            ::=   irregular(8);
         src_addr            ::=   irregular(32);
         dest_addr           ::=   irregular(32);
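
   As an informal Python sketch (the function names are assumptions,
   and the normative definition is the one in Section 4.6.3),
   "irregular" can be pictured as copying the field bits unchanged in
   both directions:

```python
def irregular_compress(uncomp_value: int, length: int) -> str:
    # All 'length' bits of the field are sent as they are.
    if not 0 <= uncomp_value < (1 << length):
        raise ValueError("value does not fit in the given length")
    return format(uncomp_value, "0{}b".format(length))


def irregular_decompress(comp_bits: str, length: int) -> int:
    # The compressed bits are the uncompressed bits.
    if len(comp_bits) != length:
        raise ValueError("unexpected number of compressed bits")
    return int(comp_bits, 2)
```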

   Finally, the third encoding method,
   "inferred_ip_v4_header_checksum", is specific to IPv4 headers:

         checksum            ::=   inferred_ip_v4_header_checksum;
       };
     };

   This is a specific encoding method for calculating the IP checksum
   from the rest of the header values.  Like the "uncompressed_value"
   encoding method, no compressed bits need to be sent, since the field
   value can be reconstructed at the decompressor.

   However, unlike "uncompressed_value", the meaning of
   "inferred_ip_v4_header_checksum" is not defined in the ROHC-FN
   library of encoding methods, nor is it defined by another structure
   elsewhere in the formal notation given in the example above.  Its
   definition can be given either using plain English text or using the
   formal notation as part of the profile definition itself.
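
   For illustration only, such a definition could refer to the standard
   IPv4 header checksum: the 16-bit one's complement of the one's
   complement sum of all 16-bit words in the header, with the checksum
   field itself taken as zero.  The Python sketch below shows that
   computation.  It is an assumption about what the encoding method
   would compute, not a definition taken from this document, and the
   function name is hypothetical.

```python
def ipv4_header_checksum(header: bytes) -> int:
    """Compute the IPv4 header checksum, ignoring the stored checksum."""
    if len(header) < 20 or len(header) % 2:
        raise ValueError("expected an IPv4 header of at least 20 octets")
    words = [int.from_bytes(header[i:i + 2], "big")
             for i in range(0, len(header), 2)]
    words[5] = 0  # octets 10-11 hold the checksum field itself
    total = sum(words)
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

   Because the result depends only on the other header fields, a
   decompressor that has reconstructed those fields can recompute the
   checksum without receiving any bits for it.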

   Finally, the definition of the structure is closed with a closing
   brace.  At this point, the above example has defined the format of
   the compressed IPv4 header, and provided enough information to allow
   an implementation to construct the compressed header from an
   uncompressed header and vice versa.

4.  Normative Definition of ROHC-FN

   This section gives the normative definition of ROHC-FN.  ROHC-FN is a
   referentially transparent, declarative language with no side effects.

4.1  Overall Structure of a Specification

   A ROHC-FN specification consists of a sequence of zero or more
   constant definitions (Section 4.2), an optional global control field
   list (Section 4.9.5) and one or more encoding method definitions,
   given in the form of structures (Section 4.9).

   Structures define an encoding method by giving one or more formats
   for uncompressed packets and one or more formats for compressed
   packets.  These formats are linked by so-called fields, each of which
   describes a certain part of an uncompressed and/or a compressed
   format.

   The properties of a field are specified by assigning it an encoding
   method and/or by use of "let" statements.  Encoding methods can be
   defined in FN using a structure or can be predefined.  Predefined
   encoding methods can be defined in the text accompanying a formal
   specification, or they can be those defined in the present
   document.

   Each encoding method and each constant has an identifier.  All of
   these identifiers have global scope.  It is illegal to have multiple
   instances of the same identifier.  It is also illegal to use any of
   the following as identifiers for encoding methods:

   o  "let", "this"
   o  "control_fields", "default_methods"
   o  "uncomp_hdr_start", "uncomp_length", "uncomp_value"
   o  "comp_hdr_start", "comp_length", "comp_value"
   o  identifiers starting either with "uc_format" or "co_format"

4.2  Constant Definitions

   Constant values can be defined using the "=" operator.  Identifiers
   for constants must be all upper case.  For example:

      SOME_CONSTANT = 3;

   Constants are defined by an expression on the right-hand side of the
   "=" operator.  The expression must yield a constant value.  That is,
   the expression must be one whose terms are all either constants or
   literals and not structure parameters or field attributes (see
   Section 4.4).

   Constants have global scope.  Constants must be defined at the top
   level, outside of any structure definition (note that "=" has a
   different meaning inside a structure; see Section 4.9).  Because the
   FN is referentially transparent, constants are entirely equivalent to
   the value they refer to, and are completely interchangeable with that
   value.  Similarly, since the language has no side effects, a constant
   can never change its value.

4.3  Attributes

   In ROHC-FN, the properties of a field are defined by an encoding
   method.  The encoding method's formal semantics are specified using a
   set of attributes.  This set of attributes entirely characterises the
   relationship between the uncompressed and compressed representation
   of a field.  Both of these representations are bit strings.

   The notation defines six attributes, three for the uncompressed field
   and a corresponding three for the compressed field.  The attributes
   available for each field are as follows:

   uncompressed attributes of a field:
   o  "uncomp_value", "uncomp_length" and "uncomp_hdr_start",

   compressed attributes of a field:
   o  "comp_value", "comp_length" and "comp_hdr_start".

   The two value attributes contain the respective numerical values of
   the field, i.e.  "uncomp_value" gives the numerical value of the
   uncompressed aspect of the field, and the attribute "comp_value"
   gives the numerical value of the compressed aspect of the field.  The
   numerical values are derived by interpreting the bit string in the
   field as an unsigned binary number, most-significant bit first.

   The two length attributes indicate the length in bits of the
   associated bit string; "uncomp_length" for the uncompressed
   representation, and "comp_length" for the compressed representation.

   Finally, the two "hdr_start" attributes indicate the offset in bits
   of the start of the field from the start of the header;
   "uncomp_hdr_start" for the position in the uncompressed header, and
   "comp_hdr_start" for the position of the field in the compressed
   header.
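
   The three uncompressed attributes can be pictured with a short
   Python sketch.  This is an illustration under stated assumptions:
   the header is modelled as a string of '0' and '1' characters, the
   class and function names are hypothetical, and giving a zero-length
   field the value 0 is a choice made only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncompAttributes:
    uncomp_value: int      # bit string read as unsigned, MSB first
    uncomp_length: int     # length of the bit string in bits
    uncomp_hdr_start: int  # offset in bits from the start of the header


def uncomp_attributes(header_bits: str, start: int, length: int) -> UncompAttributes:
    """Derive the attributes of the field at header_bits[start:start+length]."""
    bits = header_bits[start:start + length]
    if start < 0 or len(bits) != length:
        raise ValueError("field lies outside the header")
    value = int(bits, 2) if bits else 0  # zero-length field: assumed value 0
    return UncompAttributes(value, length, start)
```

   The compressed attributes could be derived in the same way from the
   compressed header.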

   Attributes are undefined until they are bound to a value, at which
   point they become defined.  The defined value of an attribute cannot
   be changed; bindings are permanent in the FN.  Defined values are
   required for all compressed attributes of fields which appear in the
   compressed header and for all uncompressed attributes of fields which
   appear in the uncompressed header.  If two conflicting bindings are
   given for a field attribute, then the binding fails along with the
   packet format in which the binding was defined.
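
   A minimal Python sketch of such write-once bindings follows.  The
   class and method names are hypothetical, None stands for
   "undefined", and treating a repeated binding to the same value as
   harmless is an assumption of the sketch rather than a statement
   made by this document.

```python
class BindingFailed(Exception):
    """Raised when two conflicting bindings are given for an attribute."""


class FieldAttributes:
    def __init__(self):
        self._bound = {}  # attribute name -> bound value

    def bind(self, name: str, value: int) -> None:
        # A defined attribute can never change its value.
        if name in self._bound and self._bound[name] != value:
            raise BindingFailed("conflicting binding for " + name)
        self._bound[name] = value

    def get(self, name: str):
        # Returns None while the attribute is still undefined.
        return self._bound.get(name)
```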

   Note that uncompressed attributes do not always reflect an aspect of
   the uncompressed header.  Some fields do not originate from the
   uncompressed header, but are control fields.  In particular note that
   the "uncomp_hdr_start" attribute has no useful meaning if the field
   is a control field (see Section 4.9.5).

4.3.1  Attribute References

   Attributes of a particular field are referred to formally by using
   the field's name followed by a ":" and the attribute's identifier.

   For example:

     ip_seq_number:uncomp_value

   gives the uncompressed value of the ip_seq_number field.  The primary
   reason for referencing attributes is for use in expressions, which
   are explained in the following section.

4.4  Expressions

   ROHC-FN includes the usual infix style of expressions, with
   parentheses "(" and ")" used for grouping.  Expressions can be made
   up of any of the components described in the following subsections.

   In summary, the semantics of expressions are generally as in the C
   programming language, with the following additions and exceptions:


   o  There is no limit on the range of integers.
   o  For modulo, the expression "mod(k,v)" is used instead of C
      language "k % v".  Note that the '%' is a comment character in
      ROHC-FN.
   o  "x ^ y" evaluates to x raised to the power of y.
   o  "log2(w)" evaluates to the smallest integer k where w <= 2^k, i.e.
      it returns the smallest number of bits in which value w can be
      stored.

   Expressions may refer to any of the attributes of each field (as
   described in Section 4.3), and also to any defined constant (see
   Section 4.2).

   If any of the attributes or constants used in the expression are
   undefined, the value of the expression is undefined.  Undefined
   expressions cause the environment (e.g.  the packet format) in which
   they are used to fail if a defined value is required.  Defined values
   are required for all compressed attributes of fields which appear in
   the compressed header and for all uncompressed attributes of fields
   which appear in the uncompressed header.

   Note that expressions cannot be used as encoding methods directly
   because they do not completely characterise a field.  Expressions
   only specify a single value whereas a field is made up of several
   values: its attributes.  For example, the following is illegal:

      tcp_list_length ::= (data_offset + 20) / 4;

   There is only enough information here to define a single attribute of
   "tcp_list_length".  Although this makes no sense formally,
   intuitively, this could be read as defining the "uncomp_value"
   attribute.  However, that would still leave the length of the
   uncompressed field undefined at the decompressor.  Such usage is
   therefore prohibited.

4.4.1  Integer Literals

   Integers can be expressed as decimal values, binary values (prefixed
   by 0b), or hexadecimal values (prefixed by 0x).  Negative integers
   are prefixed by a "-" sign (note that there is no unary minus
   operator).

4.4.2  Boolean Literals

   The boolean literals are "false", which has a value of 0, and "true",
   which has a value of 1.


4.4.3  Boolean Operators

   The following "boolean" operators are available, which take boolean
   arguments and return a boolean result:

   o  &&, for logical "and".  Returns true if both boolean1 and boolean2
      are true.  Returns false otherwise.
   o  ||, for logical "or".  Returns true if at least one of boolean1 or
      boolean2 is true.  Returns false otherwise.
   o  !, for logical not.  Returns true if boolean is false.  Returns
      false otherwise.

4.4.4  Integer Operators

   The following "integer" operators are available, which take integer
   arguments and return an integer result:

   o  ^, for exponentiation.  "x ^ y" returns the value of "x" to the
      power of "y".
   o  *, / for multiplication and division.  "x * y" returns the product
      of "x" and "y".  "x / y" returns the quotient, rounded down to the
      next integer.
   o  +, - for addition and subtraction.  "x + y" returns the sum of "x"
      and "y".  "x - y" returns the difference.
   o  mod(k, v) for modulo.  "mod(x, y)" returns "x" modulo "y", i.e.
      x - y * (x / y).
   o  log2(w) for logarithm to base 2.  "log2(x)" returns the smallest
      integer k where x <= 2^k, i.e. it returns the smallest number of
      bits in which value x can be stored.
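
   As an illustration only, the operators that differ from C can be
   sketched in Python.  The function names below are hypothetical.  The
   sketch assumes that "rounded down" means rounding towards negative
   infinity, and it follows the formal definition of log2 (the smallest
   k where x <= 2^k) for x >= 1.

```python
def fn_div(x, y):
    """Quotient rounded down (floor), as described for "x / y"."""
    return x // y


def fn_mod(x, y):
    """mod(x, y), defined as x - y * (x / y)."""
    return x - y * fn_div(x, y)


def fn_log2(x):
    """Smallest integer k such that x <= 2^k (sketched for x >= 1)."""
    if x < 1:
        raise ValueError("log2 is only sketched here for x >= 1")
    return (x - 1).bit_length()


# Exponentiation "x ^ y" corresponds to Python's x ** y.
print(fn_div(7, 2), fn_mod(7, 3), 2 ** 10, fn_log2(5))  # prints: 3 1 1024 3
```

   Python integers are unbounded, which matches the rule that there is
   no limit on the range of integers.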

4.4.5  Comparison Operators

   The following "comparison" operators are available, which take
   integer arguments and return a boolean result:

   o  ==, !=, for equality and its negative.  "x == y" returns true if x
      is equal to y.  Returns false otherwise.  "x != y" returns true if
      x is not equal to y.  Returns false otherwise.
   o  <, >, for less than and greater than.  "x < y" returns true if x
      is less than y.  Returns false otherwise.  "x > y" returns true if
      x is greater than y.  Returns false otherwise.
   o  >=, <=, for greater than or equal and less than or equal, the
      negations of < and > respectively.  "x >= y" returns false if x
      is less than y.  Returns true otherwise.  "x <= y" returns false
      if x is greater than y.  Returns true otherwise.


4.5  Comments

   Free English text can be inserted into a profile definition to
   explain why something has been done a particular way, to clarify the
   intended meaning of the notation, or to elaborate on some point.  To
   this end comment syntax is provided.

   The FN uses an end of line comment style, which makes use of the "%"
   comment character.  Any text between the "%" character and the end of
   the line has no formal meaning.  For example:

     %-----------------------------------------------------------------
     %    IR-REPLICATE packet formats
     %-----------------------------------------------------------------

     % The following fields are included in all of the IR-REPLICATE
     % packet formats:
     %
     uc_format   =   discriminator,    %[  8 ]
                     tcp.seq_number,   %[ 32 ]
                     tcp.flags.ecn,    %[  2 ]

   Comments do not affect the formal meaning of what is notated, but can
   be used to improve readability.  Their use is optional.

   Comments help to clarify the definition for the reader and may serve
   different purposes for implementers.  They should therefore not be
   treated as less important when they are inserted into the formal
   definition of a profile; they should be consistent with the
   normative part of the profile.

4.6  Encoding Methods Library

   ROHC [RFC3095] contains a number of different techniques for
   compressing header fields (LSB encoding, value encoding, etc.).  Most
   of these techniques are part of the ROHC-FN library so that they can
   be reused when creating new ROHC profiles.  The notation for these is
   described below.  Encoding methods can be defined using structures
   (see Section 4.9).  It is also possible for a profile to define its
   own set of encoding methods using the formal notation or using a
   textual definition.

4.6.1  uncompressed_value

   The "uncompressed_value" encoding method is used to encode header
   fields for which the uncompressed value can be defined using a
   mathematical expression (including constant values):


     field     ::= uncompressed_value(<uncomp_length_expression>,
                                      <uncomp_value_expression>);

   where the value of the "uncomp_length_expression" binds with the
   field's "uncomp_length" attribute, and the value of the
   "uncomp_value_expression" binds with the field's "uncomp_value"
   attribute.  The "comp_length" attribute is bound to zero since the
   field does not appear in the compressed header.  Note however that it
   is still legal to refer to it in a compressed format field order
   list, but it has a length of zero.  The "comp_value" attribute is not
   bound by this encoding method.

   As an example of the usage of "uncompressed_value" encoding, the IPv6
   header version number is a four bit field that always has the value
   6:

     version   ::=   uncompressed_value(4, 6);

   Another example of value encoding, using an expression to calculate
   the length:

     padding ::= uncompressed_value(nbits - 8, 0);

   Here the expression uses a structure parameter, "nbits" (which
   specifies how many significant bits there are in the data) to
   calculate how many pad bits to use.

4.6.2  compressed_value

   The "compressed_value" encoding method is used to define fields in
   the compressed header for which there is no counter-part in the
   uncompressed header.  It can be used to set compressed fields whose
   value can be defined using a mathematical expression (including
   constant values):

     field     ::= compressed_value(<comp_length_expression>,
                                    <comp_value_expression>);

   where the value of the "comp_length_expression" binds with the
   field's "comp_length" attribute, and the value of the
   "comp_value_expression" binds with the field's "comp_value"
   attribute.  The "uncomp_length" attribute is bound to zero since the
   field does not appear in the uncompressed header.  Note however that
   it is still legal to refer to it in an uncompressed format field
   order list, but it has a length of zero.  The "uncomp_value"
   attribute is not bound by this encoding method.

   One possible use of this encoding method is to define padding in the
   compressed header:

     pad_to_octet_boundary      ::=   compressed_value(3, 0);

   A more common use is to define a discriminator field to make it
   possible to differentiate between different packet formats within a
   structure.  For convenience, the notation provides syntax for
   specifying value encoding in the form of a binary string.  The binary
   string to be encoded is simply given in single quotes.  For example:

     discriminator     ::=   '01101';

   This has exactly the same meaning as:

     discriminator     ::=   compressed_value(5, 13);
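
   The equivalence between the two forms can be checked with a short
   Python sketch; the function name is hypothetical.  The length of the
   quoted string gives "comp_length" and its binary value gives
   "comp_value".

```python
def binary_string_to_compressed_value(bits):
    """Return (comp_length, comp_value) for a quoted binary string."""
    if not bits or any(c not in "01" for c in bits):
        raise ValueError("expected a non-empty string of 0s and 1s")
    return len(bits), int(bits, 2)


print(binary_string_to_compressed_value("01101"))  # prints: (5, 13)
```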


4.6.3  irregular

   The "irregular" encoding method is used to encode a field in the
   compressed packet with a bit pattern identical to the original field
   in the uncompressed packet:

     field         ::=   irregular(<expression>);

   where the value of "expression" binds with the "uncomp_length"
   attribute of the field.

   For example, the checksum field of the TCP header is a sixteen-bit
   field that does not follow any pattern:

     tcp_checksum  ::=   irregular(16);

   The expression can be used to derive the length of the field from the
   value of another field, and the length does not have to be constant.

4.6.4  static

   The "static" encoding method compresses a field whose length and
   value are the same as for a previous header in the flow, i.e.  where
   the field completely matches an existing entry in the context:

     field            ::=   static;

   The field's "uncomp_value" and "uncomp_length" attributes bind with
   their respective values in the context.

   Since the field value is the same as a previous field value, the
   entire field can be reconstructed from the context, so it is
   compressed to zero bits and does not appear in the compressed header.

   For example, the source port of the TCP header is a field whose value
   does not change from one packet to the next for a given flow:

     src_port  ::=   static;


4.6.5  lsb

   The Least Significant Bit encoding method, "lsb", compresses a field
   whose value differs by a small amount from the value stored in the
   context.

     field            ::=   lsb(num_lsbs_param, offset_param);

   Here, "num_lsbs_param" is the number of least significant bits to
   use, and "offset_param" is the interpretation interval offset.  The
   parameter "num_lsbs_param" binds with the "comp_length" attribute,
   and the "uncomp_value" attribute binds with (context_value -
   offset_param + comp_value).

   The "lsb" encoding method can compress a field whose value lies
   between (context_value - offset_param) and (context_value -
   offset_param + 2^num_lsbs_param - 1) inclusively.  In particular, if
   offset_param = 0 then the field value can only stay the same or
   increase relative to the previous header in the flow.  If
   offset_param = -1 then it can only increase, whereas if offset_param
   = 2^num_lsbs_param then it can only decrease.
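
   The interval arithmetic can be sketched in Python as follows.  The
   function names are hypothetical.  The decompression step is an
   assumption: it follows the interpretation interval rule of
   [RFC3095], choosing the value in the interval whose low-order bits
   equal the received bits.  Wrap-around at the field width is ignored.

```python
def lsb_interval(context_value, num_lsbs, offset):
    """Inclusive range of values that lsb(num_lsbs, offset) can encode."""
    low = context_value - offset
    return low, low + 2 ** num_lsbs - 1


def lsb_compress(value, context_value, num_lsbs, offset):
    """Return the num_lsbs low-order bits, or None if the method fails."""
    low, high = lsb_interval(context_value, num_lsbs, offset)
    if not low <= value <= high:
        return None
    return value % 2 ** num_lsbs


def lsb_decompress(comp_value, context_value, num_lsbs, offset):
    """Pick the value in the interval whose low-order bits match."""
    low, _ = lsb_interval(context_value, num_lsbs, offset)
    return low + (comp_value - low) % 2 ** num_lsbs


bits = lsb_compress(105, 100, 4, 0)
print(bits, lsb_decompress(bits, 100, 4, 0))  # prints: 9 105
```

   With a context value of 100 and four bits, an offset of 0 gives the
   interval 100 to 115, an offset of -1 gives 101 to 116, and an offset
   of 16 gives 84 to 99.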

   The compressor may not be able to determine the exact context
   value that will be used by the decompressor, since some packets that
   would have updated the context may have been lost or damaged.
   However, from feedback received or by making assumptions, the
   compressor can limit the candidate set of values.  The compressor
   then chooses an encoding such that no matter which context value in
   the candidate set the decompressor uses, the resulting decompression
   is correct.  If that is not possible, the lsb encoding method fails
   (which typically results in a less efficient packet format being
   chosen by the compressor).  As "reasonable" assumptions may not
   always be correct, lsb encoding is intended to be used in conjunction
   with methods that validate the output of the decompression process,
   such as the crc method described in Section 4.6.6.

   The compressed field takes up the specified number of bits in the
   compressed header (i.e.  num_lsbs_param).


   For example, the TCP sequence number:

     tcp_sequence_number   ::=   lsb(14, 8192);

   See the ROHC specification [RFC3095] for additional details on LSB
   encoding, where the parameter "k" corresponds to the parameter
   "num_lsbs_param" and where interpretation interval offset "p"
   corresponds to the parameter "offset_param".

4.6.6  crc

   The "crc" encoding method provides a CRC calculated over a block of
   data.  The block of data is represented using either the
   "uncomp_value" or "comp_value" attribute of a field.  The "crc"
   method takes a number of parameters:

   o  the number of bits for the CRC (num_bits),
   o  the bit-pattern for the polynomial (bit_pattern),
   o  the initial value for the CRC register (initial_value),
   o  the value of the block of data (block_data_value), and
   o  the size in octets of the block of data (block_data_length).

   I.e.:

     field   ::=   crc(num_bits, bit_pattern, initial_value,
                       block_data_value, block_data_length);

   The CRC is calculated in least significant bit (LSB) order.

   The following CRC polynomials are defined in [RFC3095], in Sections
   5.9.1 and 5.9.2:

      8-bit
         C(x) = x^0 + x^1 + x^2 + x^8
         bit_pattern = 0xe0

      7-bit
         C(x) = x^0 + x^1 + x^2 + x^3 + x^6 + x^7
         bit_pattern = 0x79

      3-bit
         C(x) = x^0 + x^1 + x^3
         bit_pattern = 0x06

   For example:

     % 3 bit CRC, C(x) = x^0 + x^1 + x^3
     crc_field ::= crc(3, 0x6, 0xF, this:comp_value, this:comp_length);
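
   A bit-by-bit CRC in LSB order can be sketched in Python as shown
   below.  This is an illustration under stated assumptions and not a
   normative algorithm: the function takes the block of data as a
   Python "bytes" object instead of separate value and length
   parameters, and an initial value wider than the CRC is truncated to
   the CRC width.

```python
def crc(num_bits, bit_pattern, initial_value, data):
    """Bitwise CRC over data (bytes), least significant bit first."""
    mask = (1 << num_bits) - 1
    register = initial_value & mask  # assumption: excess bits are dropped
    for octet in data:
        for i in range(8):
            bit = (octet >> i) & 1
            if (register ^ bit) & 1:
                register = (register >> 1) ^ bit_pattern
            else:
                register >>= 1
    return register & mask


header = bytes([0x45, 0x00, 0x00, 0x1C])  # arbitrary illustrative octets
crc3 = crc(3, 0x06, 0x07, header)
crc8 = crc(8, 0xE0, 0xFF, header)
assert 0 <= crc3 < 8 and 0 <= crc8 < 256
```

   The bit patterns passed in are the LSB-order patterns listed above,
   so no bit reversal is needed inside the loop.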


4.7  let Statements

   A "let" statement is like a miniature encoding method.  Whereas an
   encoding method binds several field attributes at once, a let
   statement typically binds just one of them.  In fact all encoding
   methods can be expressed in terms of a collection of let statements.
   Here is an example let statement which binds the "uncomp_value"
   attribute of a field to 5.

     let(field:uncomp_value == 5);

   Like an encoding method, a let statement can only be successfully
   used in a format if the binding it describes is achievable.  A format
   containing the example let statement above would not be useable if
   the field had also been bound with "uncompressed_value" encoding
   which gave it a different uncompressed value.

   A "let" statement takes a boolean expression as a parameter.  It can
   be used to assert that the expression has a specific value, in order
   to choose a particular packet format from a list of possible formats
   specified in a structure (see Section 4.9), or just to bind an
   expression as in the example above.  The general form of a let
   statement is therefore:

     let(<boolean expression>)

   A "let" statement must only be used inside a field encodings list
   (see Section 4.9).

   There are three possible results when an expression is asserted in a
   let statement:
   o  The boolean expression evaluates to false, in which case the
      packet format which contains the let statement can not be used,
   o  The boolean expression evaluates to true, in which case the packet
      format is useable,
   o  Some or all of the terms in the boolean expression are undefined.
      In this case, the outcome depends on whether there exists a set of
      values that the undefined terms could take that would make the
      expression evaluate to true.  If such values exist, the undefined
      terms become bound by the expression and the packet format can be
      used.

   The first example in this section is an example of the third type of
   result.  Generally speaking a let statement is either being used in
   this fashion (as an assignment) or else it is being used to test if a
   particular packet format is useable, as is the case with the first
   two types of result.
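
   The three outcomes can be sketched in Python for the simplest kind
   of let statement, one that compares a single attribute with a
   constant.  The function name and the dictionary of bindings are
   hypothetical, and no attempt is made to solve general boolean
   expressions.

```python
def let_equals(bindings, attribute, value):
    """Evaluate let(<attribute> == <value>) against a dict of bindings.

    Returns True if the packet format remains useable, False if it
    cannot be used.
    """
    if attribute not in bindings:
        bindings[attribute] = value  # undefined term: becomes bound
        return True
    return bindings[attribute] == value  # defined term: plain test


bindings = {"field:comp_length": 0}
print(let_equals(bindings, "field:uncomp_value", 5))  # True: now bound to 5
print(let_equals(bindings, "field:uncomp_value", 5))  # True: test passes
print(let_equals(bindings, "field:uncomp_value", 6))  # False: format fails
```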

   "let" is a reserved word.


4.8  Profile-specific Encoding Methods

   The library of encoding methods defined by ROHC-FN provides a basic
   and generic set of field encoding methods.  When using ROHC-FN in a
   ROHC profile specification, some additional encodings specific to the
   particular protocol header being compressed may however be needed,
   such as methods that infer the value of a field from other values.
   These methods are specific to the properties of the protocol being
   compressed, and will thus have to be defined within the profile
   specification itself.  Such profile-specific encoding methods,
   defined either in ROHC-FN syntax or rigorously in plain text, can be
   referred to in a ROHC-FN definition of the profile's packet formats in
   the same way as any other method in the ROHC-FN library (see
   Section 4.6).

4.9  Structures

   Structures are used for defining new encoding methods in a formal
   specification.  They compose groups of individual fields into
   contiguous blocks.  Structures can be thought of as compound encoding
   methods; they have names and may have parameters and can be used in
   the same way as any other encoding method.  Since structures can
   contain references to other structures, complicated headers can be
   broken down into manageable pieces.

   This section describes the various features of structures, starting
   out with the simplest.

4.9.1  "this"

   Within a structure it is possible to refer to the field it is
   encoding, using the keyword "this".  This is useful for gaining
   access to the attributes of the field being encoded.  For example it
   is often useful to know the total uncompressed length of the header
   which is being encoded.

4.9.2  Simple Structures

   A structure can be used to specify a single fixed encoding.  This is
   its simplest form.  For example:


     compound_encoding_method ===
     {
       uc_format   =   field_1, %[  4 ]
                       field_2; %[ 12 ]

       co_format   =   field_2, %[  0 ]
                       field_1  %[  4 ]
       {
         field_1   ::=   irregular(4);
         field_2   ::=   uncompressed_value(12, 9);
       };
     };

   The above begins with the structure's identifier,
   "compound_encoding_method".  The identifier is followed by "===",
   which indicates that this is a structure definition.  The definition
   of the structure then follows inside curly braces, "{" and "}".  The
   first item in the definition is the "uc_format" field order list,
   which gives the order of the fields in the uncompressed header.  This
   is followed by the compressed header field order list.  This list is
   in turn followed by the field encodings list for the compressed
   header, which gives the encoding method for each field.  The
   different components of this example are described in more detail
   below.

4.9.2.1  Uncompressed Format

   The uncompressed field order list is defined by "uc_format", which
   specifies the fields of the uncompressed header in the order that
   they appear in the uncompressed header.  In the example, this is
   "field_1" followed by "field_2".  This means that a field being
   encoded by this structure is divided into two subfields, "field_1"
   and "field_2".  The sum of the uncompressed lengths of these two
   fields therefore equals the length of the field being encoded.
   Formally:

     field_1:uncomp_length + field_2:uncomp_length == this:uncomp_length

   In the example we have just two fields but any number of subfields
   may be used.  This relationship applies to however many fields are
   actually used.

   Note that the arrangement of fields specified in the uncompressed
   field order list is up to the notator.  Any arrangement of fields
   that correctly describes the content of the uncompressed header may
   be chosen -- this need not be the same as the one described in the
   specifications for the protocol header being compressed.  However,
   the bits of the uncompressed format must remain in the same order.
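
   The length relationship can be illustrated by a Python sketch that
   splits a header, given as a string of bits, into subfields in
   "uc_format" order.  The function name and the representation of the
   field order list as (name, length) pairs are assumptions made for
   the example.

```python
def split_fields(bits, layout):
    """Split a bit string into named subfields, in uc_format order.

    layout is a list of (name, uncomp_length) pairs.
    """
    if sum(length for _, length in layout) != len(bits):
        raise ValueError("subfield lengths must add up to this:uncomp_length")
    fields, start = {}, 0
    for name, length in layout:
        fields[name] = bits[start:start + length]
        start += length
    return fields


header = "1010" + "000000001001"  # 4 + 12 = 16 bits
print(split_fields(header, [("field_1", 4), ("field_2", 12)]))
# prints: {'field_1': '1010', 'field_2': '000000001001'}
```

   Because the bits are consumed strictly from left to right, any
   choice of subfield boundaries keeps the bits of the uncompressed
   format in their original order.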

   For example, there may be a protocol whose header contains a 16-bit
   sequence number, but whose sessions tend to be short lived.  This
   would mean that the high bits of the sequence number are almost
   always constant.  The "uc_format" could reflect this by splitting the
   original uncompressed field into two fields, one field to represent
   the insignificant almost-always-zero part of the sequence number, and
   a second field to represent the significant part.

   An uncompressed format may contain a field encodings list.  Encoding
   methods specified therein are used whenever a packet with that
   uncompressed format is being encoded.  The encoding of a packet with
   a given uncompressed format can only succeed if all of its encoding
   methods and let statements succeed (see Section 4.7).

   The total length of an uncompressed header must be defined.  The
   length of each of the fields in an uncompressed header must also be
   defined.  This means that the bindings in the "uc_format",
   "co_format" and "default_methods" (see below) field encodings lists
   must between them define the "uncomp_length" attribute of every field
   in an uncompressed header so that there is an unambiguous mapping
   from the bits in the uncompressed header to the fields listed in each
   "uc_format" field order list.

4.9.2.2  Compressed Format

   Similar to the uncompressed field order list, the compressed data
   will appear in the order specified by the compressed field order list
   given for a compressed format.  Each individual field is encoded in
   the manner given for that field in the field encodings list, which is
   in braces and follows immediately after the compressed field order
   list.  The total length of the compressed data will be the total of
   the compressed lengths of all the individual fields.  The annotation
   for these fields indicates that they are zero and 4 bits long, making
   a total of 4 bits.

   Note that the order of the fields specified in a compressed format
   field order list does not have to match the order they appear in the
   "uc_format" field order list.  It may be desirable to reorder the
   fields in the compressed header to align the compressed header to
   an octet boundary, or for other reasons.  In the above example,
   the order is in fact the opposite of that in the uncompressed header.

   The field encodings list specifies that the encoding for "field_1"
   is "irregular", which takes up four bits in both the compressed
   header and uncompressed header.  The encoding for "field_2" is
   "uncompressed_value", which means that the field has a fixed value,
   so it can be compressed to zero bits.  The value it takes is 9, and
   it is 12 bits wide in the uncompressed header.
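
   The effect of this structure on an actual header can be sketched in
   Python.  The sketch assumes that the 16-bit uncompressed header is
   held as an integer with "field_1" in the four most significant bits;
   the function names are hypothetical.

```python
FIELD_2_LENGTH, FIELD_2_VALUE = 12, 9  # from uncompressed_value(12, 9)


def compress(header):
    """Map a 16-bit uncompressed header to the 4-bit compressed header."""
    field_1 = header >> FIELD_2_LENGTH  # irregular(4): sent as-is
    field_2 = header & ((1 << FIELD_2_LENGTH) - 1)
    if field_2 != FIELD_2_VALUE:
        return None  # the encoding method fails for this header
    return field_1


def decompress(compressed):
    """Rebuild the 16-bit header from the 4-bit compressed header."""
    return (compressed << FIELD_2_LENGTH) | FIELD_2_VALUE


print(hex(compress(0xA009)), hex(decompress(0xA)))  # prints: 0xa 0xa009
```

   A header whose low twelve bits are not 9 cannot be compressed with
   this structure, which is why "compress" returns None in that case.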


   Fields like "field_2", which compress to zero bits in length, may be
   omitted from the compressed field order list.  This is because their
   position in the list is not significant.  So, without changing the
   meaning, the above example could be notated as follows:

     compound_encoding_method ===
     {
       uc_format  =   field_1,  %[  4 ]
                      field_2;  %[ 12 ]

       co_format  =   field_1   %[  4 ]
       {
         field_1   ::=   irregular(4);
         field_2   ::=   uncompressed_value(12, 9);
       };
     };

   The total length of a compressed header must be defined.  The length
   of each of the fields in a compressed header must also be defined.
   This means that the bindings in the "uc_format", "co_format" and
   "default_methods" (see below) field encodings lists must between them
   define the "comp_length" attribute of every field in a compressed
   header so that there is an unambiguous mapping from the bits in the
   compressed header to the fields listed in each "co_format" field
   order list.

4.9.3  Arguments and Structures

   Structures may take arguments, which have some control over the
   mapping between compressed and uncompressed fields.  These are
   specified immediately after the structure name, in parentheses, as a
   comma separated list.  For example:

     poor_mans_lsb(variable_length) ===
     {
       uc_format   =   constant_bits,
                       variable_bits;

       co_format   =   variable_bits
       {
         constant_bits  ::=   static;
         variable_bits  ::=   irregular(variable_length);
       };
     };

   As with any encoding method, all arguments are values, rather than
   fields.  Although entire fields cannot be passed as arguments, it is
   possible to pass their attributes instead.


4.9.4  Multiple Formats

   Structures can also define multiple formats for a given header.  This
   allows different compression methods to be used depending on what is
   the most efficient way of compressing a particular header.

   For example, a field may have a fixed value most of the time, but the
   fixed value may occasionally change.  Using a single format for the
   structure, this field would have to be encoded using "irregular" (see
   Section 4.6.3), even though the value only changes rarely.  However,
   by using the structure to define multiple formats, we can provide two
   alternative encodings; one for when the value remains fixed and
   another for when the value changes.

   This is the topic of the following sub-sections.

4.9.4.1  Naming Convention

   When compressed formats are defined, they must be defined using names
   beginning with the reserved prefix "co_format".  Similarly
   uncompressed formats must be defined using names beginning with
   "uc_format".

   Format names must be unique within the structure to which they
   belong.

4.9.4.2  Format Discrimination

   Each of the compressed formats has its own field order list and field
   encodings list.  A compressor may pick any of these alternative
   formats to compress a header, as long as the field encodings it
   employs can be used with the uncompressed header.  For example, the
   compressor could not choose to use a compressed format that had a
   "static" encoding for a field whose value had just changed.

   More formally, the compressor can choose any combination of an
   uncompressed format and a compressed format for which all fields
   "succeed", i.e.  the encoding methods and let-statements succeed (see
   Section 4.7).  If there are multiple successful combinations, the
   compressor can choose any one.  Otherwise if there is no successful
   combination, the encoding method defined by the structure "fails".

   Because the compressor has a choice, it must be possible for the
   decompressor to discriminate between the different packet formats.  A
   simple approach to this problem is for each compressed format to
   include a "discriminator" that uniquely identifies that particular
   "co_format".  A discriminator is a control field; it is not derived
   from any of the uncompressed field values (see Section 4.6.2).
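
   Format selection with a discriminator can be sketched in Python for
   a single 8-bit field with two hypothetical compressed formats: one
   with discriminator '0' and "static" encoding, and one with
   discriminator '1' and "irregular(8)" encoding.  The compressor in
   this sketch prefers the shorter format when both succeed, which is
   one permitted choice among several.

```python
def compress(value, context_value):
    """Return the compressed bit string for an 8-bit field."""
    if value == context_value:
        return "0"  # format with '0' and static: only the discriminator
    return "1" + format(value, "08b")  # format with '1' and irregular(8)


def decompress(bits, context_value):
    """Use the leading discriminator bit to select the format."""
    if bits[0] == "0":
        return context_value
    return int(bits[1:9], 2)


print(compress(7, 7), compress(8, 7))  # prints: 0 100001000
```

   The "static" format fails whenever the value differs from the
   context, so the compressor falls back to the format that carries the
   whole field.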


4.9.4.3  Default Encoding Methods - default_methods

   When using multiple packet formats, default bindings may be specified
   for each field or attribute.  The default encoding methods specify
   the encoding method to use for a field if no encoding method is given
   for that field elsewhere.  This is helpful to keep the definition of
   the packet formats concise, as the same encoding method need not be
   repeated for every format.

   The syntax for specifying default bindings is similar to that used to
   specify a compressed or uncompressed format.  However there is no
   field order list for the default encoding methods, only the field
   encodings list is given.  This is because the field order is
   specified individually for each "co_format" and "uc_format".  For
   example:

     default_methods =
     {
       field_1           ::=   uncompressed_value(4,1);
       field_2           ::=   uncompressed_value(4,2);
       field_3           ::=   lsb(3,-1);
       let(field_4:uncomp_length == 4);
     };

   Here default bindings are specified for fields 1 to 3.  A default
   binding for the "uncomp_length" attribute of field 4 is also
   specified.

   Fields for which there is a default encoding method do not need to be
   specified in the field encodings list of any format that uses the
   default encoding method for that field.  The default encoding method
   for a field may be overridden by specifying explicitly an encoding
   method for that field.  If a default encoding method is not
   overridden, and that encoding method always compresses the field down
   to zero bits, then the field can also be omitted from the compressed
   format field order list, since, like any other zero bit field, its
   position in the field order list is not significant.

   The field encodings list of default_methods may also contain default
   bindings for individual attributes by using "let" statements.  If a
   default binding is given for an individual attribute, that binding
   may be overridden by another binding for that attribute or for the
   field to which it belongs.  The overriding binding may either be
   another let statement, or an encoding method.  Assuming the default
   methods given in the example above, the first three of the following
   four compressed packet formats would override the default binding for
   "field_4:uncomp_length":


     co_format_1 = field_4
     {
       let(field_4:uncomp_length == 3); % set uncomp_length to 3
     };

     co_format_2 = field_4
     {
       field_4           ::=   irregular(3); % set uncomp_length to 3
     };

     co_format_3 = field_4
     {
       field_4           ::=   '1010'; % set uncomp_length to undefined
     };

     co_format_4 = field_4
     {
       let(field_4:uncomp_value == 12); % use default uncomp_length
     };

   It is allowed to override one default binding but still use another.
   Overriding one default binding does not imply that other default
   bindings are also being overridden.  It is also allowed to supply
   default bindings for some but not all fields.

   Note that a structure's default methods are only consulted for packet
   formats which do not already specify an encoding for all of their
   fields.  For the packet formats that do use the default methods, only
   those fields whose encoding methods are not specified are looked up
   in the default methods.
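
   The lookup rule described in this section can be sketched in Python.
   This is an illustrative model only and not part of ROHC-FN: the
   dictionary layout and the names DEFAULTS, field_of and
   resolve_bindings are hypothetical, and encoding methods are held as
   plain strings.  A binding key is a field name for "::=" and a
   (field, attribute) pair for "let".  The case of a format that binds
   a single attribute of a field whose default is a whole encoding
   method is not covered by the text above and is not modelled here.

```python
# Hypothetical model of the default_methods example in this section.
DEFAULTS = {
    "field_1": "uncompressed_value(4,1)",
    "field_2": "uncompressed_value(4,2)",
    "field_3": "lsb(3,-1)",
    ("field_4", "uncomp_length"): "4",
}


def field_of(key):
    """Return the field name that a binding key refers to."""
    return key if isinstance(key, str) else key[0]


def resolve_bindings(format_bindings, defaults=DEFAULTS):
    """Combine a format's own bindings with the defaults it keeps."""
    resolved = dict(format_bindings)
    # Fields that the format binds as a whole with "::=".
    whole_fields = {k for k in format_bindings if isinstance(k, str)}
    for key, method in defaults.items():
        if key in format_bindings:
            continue  # the same field or attribute is bound explicitly
        if field_of(key) in whole_fields:
            continue  # an encoding method overrides attribute defaults
        resolved[key] = method
    return resolved


# co_format_2 overrides the default length; co_format_4 keeps it.
co_format_2 = resolve_bindings({"field_4": "irregular(3)"})
co_format_4 = resolve_bindings({("field_4", "uncomp_value"): "12"})
```

   In this model "co_format_2" loses the default "uncomp_length"
   binding because it supplies an encoding method for "field_4", while
   "co_format_4" keeps it because it binds only "uncomp_value".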

4.9.4.4  Example of Multiple Formats

   Putting this all together, here is a complete example of a structure
   with multiple compressed formats:


     test_multiple_formats  ===
     {
       uc_format   =   field_1,    %[  4 ]
                       field_2,    %[  4 ]
                       field_3;    %[ 24 ]

       default_methods =
       {
         field_1           ::=   static;
         field_2           ::=   uncompressed_value(4, 2);
         field_3           ::=   lsb(4, 0);
       };

       co_format_0   =   discriminator,    %[ 1 ]
                         field_3           %[ 4 ]
       {
         discriminator     ::=   '0';
       };

       co_format_1   =   discriminator,    %[  1 ]
                         field_1,          %[  4 ]
                         field_3           %[ 24 ]
       {
         discriminator     ::=   '1';
         field_1           ::=   irregular(4);
         field_3           ::=   irregular(24);
       };
     };

   Note the following:
   o  "field_1" and "field_3" both have default encoding methods
      specified for them, which are used in "co_format_0" but are
      overridden in "co_format_1"; the default for "field_2", however,
      is not overridden.
   o  "field_1" and "field_2" have default encoding methods which
      compress to zero bits.  When these are used in "co_format_0", the
      field names do not appear in either the field order list or the
      field encodings list.
   o  "field_3" has a default encoding method which does not compress to
      zero bits, so whilst "field_3" is absent from the field encodings
      list of "co_format_0", it still needs to appear in the field order
      list to specify where it goes in the compressed packet.
   o  In the example, all the uncompressed header fields have default
      encoding methods specified for them, but this is not a
      requirement.  It is perfectly allowable to specify default
      encodings for only some, or even none, of the uncompressed header
      fields.
   o  In the example, all the default encoding methods are on fields
      from the uncompressed header, but this is not a requirement.  It
      is also perfectly allowable to specify default encoding methods
      for control fields.

4.9.5  Control Fields

   Control fields are defined using the "control_fields" list.  The
   control fields list specifies all fields that do not appear in the
   uncompressed header but which have an uncompressed value
   (specifically those with a non-zero uncomp_length).  Such fields may
   be used to help compress fields from the uncompressed header more
   efficiently.  A control field could be used to improve efficiency by
   representing some commonality between a number of the uncompressed
   fields, or by representing some information about the flow that is
   not explicitly contained in the protocol headers.

   For example in IP, the behaviour of the IP-ID field in a flow varies
   depending on how the endpoints handle IP-IDs.  Sometimes the
   behaviour is effectively random, sometimes the IP-ID follows a
   predictable sequence, and at other times it stays fixed at zero.  The
   type of IP-ID behaviour is information that is never communicated
   explicitly in the uncompressed header.  However, a ROHC profile can
   still be designed to identify the behaviour and adjust the
   compression strategy accordingly, thereby improving the compression
   performance.  To do so, the profile can introduce an explicit field
   to communicate the IP-ID behaviour in compressed headers; in ROHC-FN
   terms this is done by introducing a control field:

   ipv4 ===
   {
           uc_format =   version,     %[ 4 ]
                         hdr_length,  %[ 4 ]
                         protocol,    %[ 8 ]
                         tos_tc,      %[ 6 ]
                         ip_ecn_flags,%[ 2 ]
                         ttl_hopl,    %[ 8 ]
                         df,          %[ 1 ]
                         mf,          %[ 1 ]
                         rf,          %[ 1 ]
                         frag_offset, %[ 13 ]
                         ip_id,       %[ 16 ]
                         src_addr,    %[ 32 ]
                         dst_addr,    %[ 32 ]
                         checksum,    %[ 16 ]
                         length;      %[ 16 ]

           control_fields  = ip_id_behavior;    %[ 2 ]
                 :
                 :
   };


   The "control_fields" list is equivalent to the "uc_format" field
   order list, but for fields that do not appear in the uncompressed
   header.  That is, it defines fields that have the same properties
   (the same attributes etc.) as fields appearing in the uncompressed
   header.

   Control fields are initialised by using the appropriate encoding
   methods and/or by using let statements.  For example:

   example_struct ===
   {
     uc_format = field_1;

     control_fields = scaled_field;

     co_format = scaled_field
     {
       let(scaled_field:uncomp_value  == field_1:uncomp_value / 8);
       let(scaled_field:uncomp_length == field_1:uncomp_length - 3);
       scaled_field ::= lsb(4, 0);
     };
   };

   This control field is used to scale down a field in the uncompressed
   header by a factor of 8 before encoding it with LSB encoding.
   Scaling it down makes the LSB encoding more efficient.
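
   The effect of the two "let" statements can be illustrated with a
   small Python sketch.  It is not part of the notation: the function
   names, the 16-bit width of field_1 and the sample values are
   assumptions made for illustration, and "/" is taken to mean integer
   division.

```python
def scale_field(uncomp_value, uncomp_length, shift=3):
    """Model the two let statements: divide by 8, drop 3 bits."""
    return uncomp_value // (1 << shift), uncomp_length - shift


def lsb_bits(value, k):
    """Return the k least significant bits of value as a bit string."""
    return format(value & ((1 << k) - 1), "0{}b".format(k))


# A hypothetical 16-bit field_1 that advances in steps of 8.
for field_1 in (800, 808, 816):
    scaled, length = scale_field(field_1, 16)
    print(field_1, scaled, length, lsb_bits(scaled, 4))
```

   With these sample values the scaled field advances by one per
   packet, so four LSBs span sixteen successive packets, whereas four
   LSBs of the unscaled field would span only two.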

   Control fields may also be used with global scope.  In this case
   their declaration must be outside of any structure.  They are then
   visible within any structure thus allowing information to be shared
   between structures directly.

5.  Security considerations

   This draft describes a formal notation similar to ABNF [RFC2234], and
   hence is not believed to raise any security issues.

6.  Contributors

   Although no longer listed as an author, Richard Price did almost all
   of the foundational work on the formal notation and also produced the
   original formal notation internet draft on which this document is
   based.  Many thanks to him for doing that groundwork on which this
   document stands.

7.  Acknowledgements

   A number of important concepts and ideas have been borrowed from ROHC
   [RFC3095].


   Thanks to Lars-Erik Jonsson for his extensive and comprehensive
   review comments and for supplying alternative text to problematic
   parts of the document.

   Thanks to Mark West, Eilert Brinkmann and particularly Kristofer
   Sandlund for their cooperation and feedback from notating the TCP
   profile, and also for their review comments.

   Thanks to Rob Hancock and Stephen McCann for early work on the formal
   notation.  The authors would also like to thank Christian Schmidt,
   Qian Zhang, Hongbin Liao and Max Riegel for their comments and
   valuable input.

   Finally thanks to Caroline Daniels and Alan Finney for doing some
   excellent last minute review work.

8.  References

8.1  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

8.2  Informative References

   [RFC2234]  Crocker, D. and P. Overall, "Augmented BNF for Syntax
              Specifications: ABNF", RFC 2234, November 1997.

   [RFC3095]  Bormann, C., Burmeister, C., Degermark, M., Fukushima, H.,
              Hannu, H., Jonsson, L-E., Hakenberg, R., Koren, T., Le,
              K., Liu, Z., Martensson, A., Miyazaki, A., Svanbro, K.,
              Wiebke, T., Yoshimura, T. and H. Zheng, "RObust Header
              Compression (ROHC): Framework and four profiles: RTP, UDP,
              ESP, and uncompressed", RFC 3095, July 2001.


Authors' Addresses

   Robert Finking
   Siemens/Roke Manor
   Roke Manor Research Ltd.
   Romsey, Hampshire  SO51 0ZN
   UK

   Phone: +44 (0)1794 833189
   Email: robert.finking@roke.co.uk
   URI:   http://www.roke.co.uk


   Carsten Bormann
   Universitaet Bremen TZI
   Postfach 330440
   Bremen  D-28334
   Germany

   Phone: +49 421 218 7024
   Fax:   +49 421 218 7000
   Email: cabo@tzi.org


   Ghyslain Pelletier
   Ericsson AB
   Box 920
   Lulea  SE-971 28
   Sweden

   Phone: +46 (0) 8 404 29 43
   Email: ghyslain.pelletier@ericsson.com

Appendix A.  Syntax

   This section gives a formal definition of the ROHC-FN syntax in ABNF
   (see [RFC2234]).

A.1  Reserved Keywords

   Some keywords are defined and reserved in ROHC-FN.  These keywords
   cannot be reused as identifiers by the notator.

   o  co_format - struct syntax
   o  comp_hdr_start - attribute
   o  comp_length - attribute
   o  comp_value - attribute
   o  compressed_value - primitive encoding method
   o  default_methods - struct syntax
   o  irregular - primitive encoding method
   o  let - primitive encoding method
   o  lsb - primitive encoding method
   o  static - primitive encoding method
   o  uc_format - struct syntax
   o  uncomp_hdr_start - attribute
   o  uncomp_length - attribute
   o  uncomp_value - attribute
   o  uncompressed_value - primitive encoding method

   reserved-word = primitive-encoding-method-name /
   attribute-identifier / struct-reserved-words


A.2  Characters

   Because ABNF [RFC2234] symbols are case insensitive, it is necessary
   to define explicit symbols for each of the lower case characters
   which we use in the reserved words of our grammar.  Fortunately there
   are no fundamental components of the FN syntax which are in upper
   case, otherwise we would have to define each capital letter
   separately also.

   a = %x61

   b = %x62

   c = %x63

   d = %x64

   e = %x65

   f = %x66

   g = %x67

   h = %x68

   i = %x69

   j = %x6a

   k = %x6b

   l = %x6c

   m = %x6d

   n = %x6e

   o = %x6f

   p = %x70

   q = %x71

   r = %x72

   s = %x73

   t = %x74


   u = %x75

   v = %x76

   w = %x77

   x = %x78

   y = %x79

   z = %x7a

   lower-case-letter = %x61-7a ; a-z

   upper-case-letter = %x41-5a ; A-Z

   binary-digit = "0" / "1"

   octal-digit = binary-digit / "2" / "3" / "4" / "5" / "6" / "7"

   decimal-digit = octal-digit / "8" / "9"

   hexadecimal-digit = decimal-digit / %x61-66

   open-bracket = "("

   close-bracket = ")"

   open-brace = "{"

   close-brace = "}"

   equals-sign = "="

   underscore = "_"

   comma = ","

   semi-colon = ";"

   single-quote = "'"

A.3  Literals

   decimal-literal = 1*decimal-digit

   binary-literal = "0".b 1*binary-digit


   octal-literal = "0".o 1*octal-digit

   hexadecimal-literal = "0".x 1*hexadecimal-digit

   numeric-literal = decimal-literal / binary-literal / octal-literal /
   hexadecimal-literal

   boolean-literal = t.r.u.e / f.a.l.s.e
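
   As an informal cross-check of the numeric literal rules, they can be
   approximated with a regular expression in Python.  This sketch
   assumes that '"0".b' denotes the two-character prefix "0b" (and
   likewise for "0o" and "0x"); the names NUMERIC_LITERAL and
   parse_numeric_literal are hypothetical.

```python
import re

# Mirrors decimal-literal, binary-literal, octal-literal and
# hexadecimal-literal; hexadecimal digits a-f are lower case only.
NUMERIC_LITERAL = re.compile(r"0b[01]+|0o[0-7]+|0x[0-9a-f]+|[0-9]+")


def parse_numeric_literal(text):
    """Return the integer value of a numeric literal."""
    if not NUMERIC_LITERAL.fullmatch(text):
        raise ValueError("not a numeric-literal: " + repr(text))
    if text[:2] in ("0b", "0o", "0x"):
        return int(text, 0)  # the base is taken from the prefix
    return int(text, 10)  # a plain run of digits is decimal
```

   Note that under these rules a leading zero does not make a literal
   octal: "012" is the decimal value twelve.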

A.4  Identifiers

   lower-case-identifier = lower-case-letter *(lower-case-letter /
   decimal-digit / underscore)
   ; The original EBNF had "- reserved-word" here, meaning "except
   ; reserved words", but ABNF has no equivalent construct.
   ; Notwithstanding this fact, any automated tool should enforce the
   ; reservation of reserved words in this fashion.

   upper-case-identifier = upper-case-letter *(upper-case-letter /
   decimal-digit / underscore)

A.5  Operators

   exponential-operator = "^"

   multiplicative-operator = "*" / "/"

   additive-operator = "+" / "-"

   unary-minus = "-"

A.6  Expressions

   parenthesised-expression = open-bracket arithmetic-expression
   close-bracket

   primitive-expression = numeric-literal / constant-name /
   field-attribute / parenthesised-expression / (unary-minus
   primitive-expression)

   exponential-expression = primitive-expression *(exponential-operator
   primitive-expression)

   multiplicative-expression = exponential-expression
   *(multiplicative-operator exponential-expression)

   additive-expression = multiplicative-expression *(additive-operator
   multiplicative-expression)


   arithmetic-expression = additive-expression

A.7  Constants

   constant-name = upper-case-identifier

   constant-value = constant-name / arithmetic-expression

   constant-definition = constant-name equals-sign constant-value

A.8  Field Names

   field-name = lower-case-identifier

   annotated-field-name = field-name [ "[" constant "]" ]

A.9  Attributes

   attribute-category = (c.o.m.p) / (u.n.c.o.m.p)

   attribute-name = (l.e.n.g.t.h) / (v.a.l.u.e) /
   (h.d.r.underscore.s.t.a.r.t)

   attribute-identifier = attribute-category underscore attribute-name

   field-attribute = field-name ":" attribute-identifier

A.10  Encoding Methods

   primitive-encoding-method-name =
   (c.o.m.p.r.e.s.s.e.d.underscore.v.a.l.u.e) / (i.r.r.e.g.u.l.a.r) /
   (l.s.b) / (s.t.a.t.i.c) /
   (u.n.c.o.m.p.r.e.s.s.e.d.underscore.v.a.l.u.e)

   uncompressed-value-shorthand = single-quote *binary-digit
   single-quote

   external-encoding-method-name = underscore lower-case-identifier

   non-primitive-encoding-method-name = structure-name /
   external-encoding-method-name

   encoding-method-parameter-list = open-bracket arithmetic-expression
   *(comma arithmetic-expression) close-bracket

   encoding-method = uncompressed-value-shorthand /
   (encoding-method-name [encoding-method-parameter-list])


   field-encoding = field-name "::=" encoding-method

A.11  Structures

   structure-name = lower-case-identifier

   field-order-list = [ annotated-field-name *(comma
   annotated-field-name) ]

   field-encodings-list = open-brace *(field-encoding semi-colon)
   close-brace

   uncompressed-format-prefix = (u.c.underscore.f.o.r.m.a.t)

   uncompressed-format = uncompressed-format-prefix [underscore
   lower-case-identifier] equals-sign field-order-list semi-colon

   compressed-format-prefix = (c.o.underscore.f.o.r.m.a.t)

   compressed-format = compressed-format-prefix [underscore
   lower-case-identifier] equals-sign field-order-list
   field-encodings-list semi-colon

   default-methods-id = (d.e.f.a.u.l.t.underscore.m.e.t.h.o.d.s)

   default-methods = default-methods-id equals-sign field-encodings-list
   semi-colon

   uncompressed-format-list = *uncompressed-format

   compressed-format-list = 1*compressed-format

   structure-body = open-brace uncompressed-format-list
   [default-methods] compressed-format-list close-brace

   structure-definition = structure-name "===" structure-body semi-colon

   struct-reserved-words = uncompressed-format-prefix /
   compressed-format-prefix / default-methods-id

Appendix B.  Bit-level Worked Example

   This section gives a worked example at the bit level, showing how a
   simple profile describes the compression of real data from an
   imaginary protocol header.  The example used has been kept fairly
   simple, whilst still aiming to illustrate some of the intricacies
   that arise in use of the notation.  In particular, fields have been
   kept short to make it possible to read the binary representation of
   the headers by eye, without too much difficulty.

B.1  Example Packet Format

   Our imaginary header is just 16 bits long, and consists of the
   following fields:

   1.  version number - 2 bits
   2.  type - 2 bits
   3.  flow id - 4 bits
   4.  sequence number - 4 bits
   5.  flag bits - 4 bits

   So for example 0101000100010000 indicates a packet with a version
   number of one, a type of one, a flow id of one, a sequence number of
   one, and all flag bits set to zero.
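
   The field layout can be checked with a short Python sketch.  The
   names FIELDS and parse_header are hypothetical; the field names
   follow the notation used in the next section.

```python
# Field names and widths, in header order (16 bits in total).
FIELDS = [("version_no", 2), ("type", 2), ("flow_id", 4),
          ("sequence_no", 4), ("flag_bits", 4)]


def parse_header(bits):
    """Split a 16-character bit string into named integer fields."""
    assert len(bits) == sum(width for _, width in FIELDS)
    values, pos = {}, 0
    for name, width in FIELDS:
        values[name] = int(bits[pos:pos + width], 2)
        pos += width
    return values


header = parse_header("0101000100010000")
```

   Here "header" holds the value one for every field except
   "flag_bits", which is zero.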

B.2  Initial Encoding

   An initial definition based solely on the above information is:

   eg_header ===
   {
     uc_format             =   version_no,    %[ 2 ]
                               type,          %[ 2 ]
                               flow_id,       %[ 4 ]
                               sequence_no,   %[ 4 ]
                               flag_bits;     %[ 4 ]

     co_format_initial     =   version_no,    %[ 2 ]
                               type,          %[ 2 ]
                               flow_id,       %[ 4 ]
                               sequence_no,   %[ 4 ]
                               flag_bits      %[ 4 ]
     {
       version_no          ::=   irregular(2);
       type                ::=   irregular(2);
       flow_id             ::=   irregular(4);
       sequence_no         ::=   irregular(4);
       flag_bits           ::=   irregular(4);
     };
   };

   This defines the packet format nicely, but doesn't actually offer any
   compression.  If we use it to encode the above header, we get:


     Uncompressed header: 0101000100010000
     Compressed header:   0101000100010000

   This is because we have stated that all fields are irregular - i.e.
   we haven't specified anything about their behaviour.

B.3  Basic Compression

   In order to achieve any compression we need to notate more knowledge
   about the header and its behaviour in a flow.  For example, we may
   know the following facts about the header:

   1.  version number - indicates which version of the protocol this is,
       always one for this version of the protocol.
   2.  type - may take any value.
   3.  flow id - may take any value.
   4.  sequence number - may take any value.
   5.  flag bits - contains three flags, a, b and c, each of which may
       be set or clear, and a reserved flag bit, which is always clear
       (i.e. zero).

   We could notate this knowledge as follows:

   eg_header ===
   {
     uc_format             =   version_no,    %[ 2 ]
                               type,          %[ 2 ]
                               flow_id,       %[ 4 ]
                               sequence_no,   %[ 4 ]
                               abc_flag_bits, %[ 3 ]
                               reserved_flag; %[ 1 ]

     co_format_basic       =   version_no,    %[ 0 ]
                               type,          %[ 2 ]
                               flow_id,       %[ 4 ]
                               sequence_no,   %[ 4 ]
                               abc_flag_bits, %[ 3 ]
                               reserved_flag  %[ 0 ]
     {
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   irregular(4);
       sequence_no         ::=   irregular(4);
       abc_flag_bits       ::=   irregular(3);
       reserved_flag       ::=   uncompressed_value(1,0);
     };
   };


   Using this simple scheme, we have successfully encoded the fact that
   one of the fields has a permanently fixed value of one, and therefore
   contains no useful information.  We have also encoded the fact that
   the final flag bit is always zero, which again contains no useful
   information.  Both of these facts have been notated using the
   uncompressed_value encoding method (see Section 4.6.1).

   Note that we could have omitted the "0 bits" fields from the field
   order list of "co_format_basic" if we wished.  The only purpose of
   that list is to indicate the order of the fields in the compressed
   header.  Since zero bit fields don't actually appear, they can be
   omitted.

   Using this new encoding on the above header, we get:

     Uncompressed header: 0101000100010000
     Compressed header:   0100010001000

   This reduces the amount of data we need to transmit by roughly 20%
   (13 bits instead of 16).  However, this encoding fails to take
   advantage of relationships between the value of a field in one packet
   and its value in subsequent packets.  For example, every header in
   the following sequence is compressed by the same amount despite the
   similarities between them:

     Uncompressed header: 0101000100010000
     Compressed header:   0100010001000


     Uncompressed header: 0101000101000000
     Compressed header:   0100010100000


     Uncompressed header: 0111000101110000
     Compressed header:   1100010111000
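
   The three encodings above can be reproduced with a minimal Python
   model of "co_format_basic".  This is a sketch rather than a ROHC-FN
   implementation; UC_FORMAT, FIXED and compress_basic are hypothetical
   names.

```python
UC_FORMAT = [("version_no", 2), ("type", 2), ("flow_id", 4),
             ("sequence_no", 4), ("abc_flag_bits", 3),
             ("reserved_flag", 1)]

# Fields bound with uncompressed_value(length, value) take zero bits.
FIXED = {"version_no": 1, "reserved_flag": 0}


def compress_basic(bits):
    """Encode a 16-bit header string according to co_format_basic."""
    out, pos = [], 0
    for name, width in UC_FORMAT:
        chunk = bits[pos:pos + width]
        pos += width
        if name in FIXED:
            # The value is known to both ends, so nothing is sent.
            if int(chunk, 2) != FIXED[name]:
                raise ValueError(name + " is not its fixed value")
        else:
            out.append(chunk)  # irregular(n): sent unchanged
    return "".join(out)
```

   Every header is reduced by the same three bits, whatever the
   preceding headers contained, which is the limitation addressed in
   the next section.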


B.4  Inter-packet compression

   The profile we have defined so far has not compressed the sequence
   number or flow ID fields at all, since they can take any value.
   However, the values of these fields, and of the flag bits, in one
   header have a very simple relationship to their values in previous
   headers:
   o  the sequence number is unusual: it increases by three each time;
   o  the flow_id stays the same: it always has the same value that it
      did in the previous header in the flow;
   o  the abc_flag_bits stay the same most of the time: they usually
      have the same value that they did in the previous header in the
      flow.


   An obvious way of notating this is as follows:

   % This obvious encoding will not work (correct encoding below)
   eg_header  ===
   {
     uc_format             =   version_no,    %[ 2 ]
                               type,          %[ 2 ]
                               flow_id,       %[ 4 ]
                               sequence_no,   %[ 4 ]
                               abc_flag_bits, %[ 3 ]
                               reserved_flag; %[ 1 ]

     co_format_obvious     =   type,          %[ 2 ]
                               abc_flag_bits  %[ 3 ]
     {
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   static;
       sequence_no         ::=   lsb(0,-3);
       abc_flag_bits       ::=   irregular(3);
       reserved_flag       ::=   uncompressed_value(1,0);
     };
   };

   The dependency on previous packets is notated using the static and
   LSB encoding methods (see Section 4.6.4 and Section 4.6.5
   respectively).

   However there are a few problems with the above notation.  Firstly,
   and most importantly, the flow_id field is notated as "static" which
   means that it doesn't change from packet to packet.  However, the
   notation does not indicate how to communicate the value of the field
   initially.  It's all very well saying "it's the same value as last
   time", but there must have been a first time where we define what
   that value is, so that it can be referred back to.  The above
   notation provides no way of communicating that.  Similarly with the
   sequence number - there needs to be a way of communicating its
   initial value.

   Secondly, the sequence number field is communicated very efficiently
   in zero bits, but it is not at all robust against packet loss.  If a
   packet is lost then there is no way to handle the missing sequence
   number.

   Finally, although the flag bits are usually the same as in the
   previous header in the flow, the profile doesn't make any use of this
   fact; since they are sometimes not the same as those in the previous
   header, it is not safe to say that they are always the same, so
   static encoding can't be used exclusively.  We solve all three of
   these problems below, robustness first since it is simplest, and the
   remainder in the following section.

   When communicating sequence numbers, or any other field encoded with
   LSB encoding, a very important consideration for the notator is how
   robust against packet loss the compressed protocol should be.  This
   will vary a lot from protocol stack to protocol stack.  For example
   RTP has a high setup cost, so the compressed stream needs to be
   robust against fairly high packet loss.  Things are different with
   TCP, where robustness to loss of just a few packets is sufficient.
   For the example protocol we'll assume short, low overhead flows and
   say we need to be robust to the loss of just one packet, which we can
   achieve with two bits of LSB encoding (one bit isn't enough since the
   sequence number increases by three each time - see Section 4.6.5).
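
   The claim about one and two bits can be checked with a small Python
   sketch.  It assumes the interpretation interval of RFC 3095, where
   lsb(k, p) with reference value ref can carry the values ref - p to
   ref - p + 2^k - 1, and that Section 4.6.5 defines the method in the
   same way.  The function names are hypothetical, and the 4-bit
   sequence number is taken to wrap modulo 16.

```python
MOD = 16  # the sequence number is a 4-bit field


def lsb_window(ref, k, p):
    """Values decodable from reference ref with k LSBs and offset p."""
    return [(ref - p + i) % MOD for i in range(1 << k)]


def decode_lsb(ref, received, k, p):
    """Return the window value whose k low bits equal received."""
    for candidate in lsb_window(ref, k, p):
        if candidate & ((1 << k) - 1) == received:
            return candidate
    raise ValueError("no candidate matches")


def tolerates_single_loss(k, p=-3, step=3):
    """True if ref+step and ref+2*step are decodable for every ref."""
    return all(
        (ref + step) % MOD in lsb_window(ref, k, p)
        and (ref + 2 * step) % MOD in lsb_window(ref, k, p)
        for ref in range(MOD)
    )
```

   With k = 2 the window covers ref + 3 to ref + 6, so the value is
   still decodable after the loss of one packet; with k = 1 it covers
   only ref + 3 and ref + 4.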

B.5  Multiple Packet Formats

   To communicate initial values for the sequence number and flow ID
   fields, and to take advantage of the fact that the flag bits are
   usually the same as in the previous header, we need to depart from
   the single packet format encoding we are currently using and instead
   use multiple packet formats:


   eg_header  ===
   {
     uc_format             =   version_no,    %[ 2 ]
                               type,          %[ 2 ]
                               flow_id,       %[ 4 ]
                               sequence_no,   %[ 4 ]
                               abc_flag_bits, %[ 3 ]
                               reserved_flag; %[ 1 ]

     co_format_irregular   =   discriminator,     %[ 1 ]
                               type,              %[ 2 ]
                               flow_id,           %[ 4 ]
                               sequence_no,       %[ 4 ]
                               abc_flag_bits      %[ 3 ]
     {
       discriminator       ::=   '0';
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   irregular(4);
       sequence_no         ::=   irregular(4);
       abc_flag_bits       ::=   irregular(3);
       reserved_flag       ::=   uncompressed_value(1,0);
     };

     co_format_compressed  =   discriminator,       %[ 1 ]
                               type,                %[ 2 ]
                               sequence_no          %[ 2 ]
     {
       discriminator       ::=   '1';
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   static;
       sequence_no         ::=   lsb(2,-3);
       abc_flag_bits       ::=   static;
       reserved_flag       ::=   uncompressed_value(1,0);
     };
   };

   Note that we have had to add a discriminator field, so that the
   decompressor can tell which packet format has been used by the
   compressor.  The format with a static flow ID and LSB encoded
   sequence number is now 5 bits long, a saving of over 60% on the size
   of the single packet format and almost a 70% saving on the size of
   the uncompressed header.  Note that despite having to add the
   discriminator field, this format is still the same size as the
   original incorrect naive notation, because this notation takes
   advantage of the fact that the abc flag bits rarely change.


   However, the original packet format (with an irregular flow ID and
   sequence number) has also grown by one bit due to the addition of the
   discriminator.  An important consideration when creating multiple
   packet formats is whether each format occurs frequently enough that
   the average compressed header length is shorter as a result of its
   usage.  For example, if in fact the flag bits always changed between
   packet headers, the static encoding could never be used; all we would
   have achieved is to lengthen the irregular packet format by one bit.

   Using the above notation, we now get:

     Uncompressed header: 0101000100010000
     Compressed header:   00100010001000


     Uncompressed header: 0101000101000000
     Compressed header:   10100 ; 00100010100000


     Uncompressed header: 0111000101110000
     Compressed header:   11111 ; 01100010111000

   The first header in the stream is compressed the same way as before,
   except that it now has the extra 1 bit discriminator at the start
   (0).  When a second header arrives, with the same flow ID as the
   first and its sequence number three higher, it can now be compressed
   in two possible ways, either using "co_format_compressed" or in the
   same way as previously, using "co_format_irregular".

   Note that we show all possible encodings of a packet as defined by a
   given profile, separated by semi-colons.  Either of the above
   encodings for the packet could be produced by a valid implementation,
   although a good implementation would always aim to make the
   compressed size as small as possible and an optimum implementation
   would pick the encoding which led to the best compression of the
   packet stream (which is not necessarily the smallest encoding for a
   particular packet).
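
   The choice between formats can be sketched in Python as follows.
   The function names are hypothetical, the version number and reserved
   flag are assumed to hold their fixed values, and lsb(2,-3) is
   assumed to carry sequence numbers from three to six above the
   previous one, modulo 16.

```python
def split_header(bits):
    """Return (type, flow_id, sequence_no, abc_flag_bits) strings."""
    return bits[2:4], bits[4:8], bits[8:12], bits[12:15]


def encodings(bits, prev_bits=None):
    """List every encoding of bits permitted by the two formats."""
    typ, flow, seq, abc = split_header(bits)
    result = []
    if prev_bits is not None:
        _, p_flow, p_seq, p_abc = split_header(prev_bits)
        # Distance from the previous sequence number, modulo 16.
        delta = (int(seq, 2) - int(p_seq, 2)) % 16
        if flow == p_flow and abc == p_abc and 3 <= delta <= 6:
            # co_format_compressed: '1', type, two LSBs of sequence_no
            result.append("1" + typ + seq[-2:])
    # co_format_irregular is always available.
    result.append("0" + typ + flow + seq + abc)
    return result
```

   For the first header only the irregular format applies, since there
   is no previous header to refer to.  For the second header, given the
   first as context, both encodings are returned, smallest first.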

B.6  Variable Length Discriminators

   Suppose we do some analysis on flows of our example protocol and
   discover that whilst it is usual for successive packets to have the
   same flags, on the occasions when they don't, the packet is almost
   always a "flags set" packet, in which all three of the abc flags are
   set.  To encode the flow more efficiently a packet format needs to be
   written to reflect this.

   This now gives a total of three packet formats, which means we need
   three discriminators to differentiate between them.  The obvious
   solution here is to increase the number of bits in the discriminator
   from one to two and for example use discriminators 00, 01, and 10.
   However we can do slightly better than this.

   Any set of discriminators in which no discriminator is a prefix of
   another will suffice, so we can use 00, 01 and 1.  If the
   discriminator starts with 1, that's the whole thing.  If it starts
   with 0, the decompressor knows it has to check one more bit to
   determine the packet kind.

   Note that care must be taken when using variable length
   discriminators.  For example it would be erroneous to use 0, 01 and
   10 as discriminators since after reading an initial 0, the
   decompressor would have no way of knowing if the next bit was a
   second bit of discriminator, or the first bit of the next field in
   the packet stream.  0, 10 and 11 however would be OK as the first bit
   again indicates whether or not there are further discriminator bits
   to follow.
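
   The condition described above amounts to requiring that no
   discriminator be a prefix of another.  A minimal Python sketch of
   such a check follows; the function name is hypothetical and the
   check is not something the notation itself provides.

```python
from itertools import permutations


def is_prefix_free(discriminators):
    """True if no discriminator is a prefix of (or equal to) another."""
    return not any(longer.startswith(shorter)
                   for shorter, longer in permutations(discriminators, 2))


print(is_prefix_free(["00", "01", "1"]))   # the set chosen in this section
print(is_prefix_free(["0", "01", "10"]))   # the erroneous set
print(is_prefix_free(["0", "10", "11"]))   # the alternative that is OK
```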

   This gives us the following:

   eg_header  ===
   {
     uc_format             =   version_no,    %[ 2 ]
                               type,          %[ 2 ]
                               flow_id,       %[ 4 ]
                               sequence_no,   %[ 4 ]
                               abc_flag_bits, %[ 3 ]
                               reserved_flag; %[ 1 ]

     co_format_irregular   =   discriminator,     %[ 2 ]
                               type,              %[ 2 ]
                               flow_id,           %[ 4 ]
                               sequence_no,       %[ 4 ]
                               abc_flag_bits      %[ 3 ]
     {
       discriminator       ::=   '00';
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   irregular(4);
       sequence_no         ::=   irregular(4);
       abc_flag_bits       ::=   irregular(3);
       reserved_flag       ::=   uncompressed_value(1,0);
     };

     co_format_flags_set   =   discriminator,     %[ 2 ]
                               type,              %[ 2 ]
                               sequence_no        %[ 2 ]
     {
       discriminator       ::=   '01';
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   static;
       sequence_no         ::=   lsb(2,-3);
       abc_flag_bits       ::=   uncompressed_value(3,7);
       reserved_flag       ::=   uncompressed_value(1,0);
     };

     co_format_flags_static =  discriminator,     %[ 1 ]
                               type,              %[ 2 ]
                               sequence_no        %[ 2 ]
     {
       discriminator       ::=   '1';
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   static;
       sequence_no         ::=   lsb(2,-3);
       abc_flag_bits       ::=   static;
       reserved_flag       ::=   uncompressed_value(1,0);
     };
   };

   Here is some example output:

     Uncompressed header: 0101000100010000
     Compressed header:   000100010001000


     Uncompressed header: 0101000101000000
     Compressed header:   10100 ; 000100010100000


     Uncompressed header: 0111000101110000
     Compressed header:   11111 ; 001100010111000


     Uncompressed header: 0111000110101110
     Compressed header:   011110 ; 001100011010111

   Here we have a very similar sequence to last time, except that there
   is now an extra header on the end which has the flag bits set.  The
   encoding for the first header in the stream is now one bit larger,
   as are the irregular encodings of the headers that follow it.  The
   short encodings of the next two headers are no larger than before,
   since that packet format has not grown, thanks to the use of
   variable length discriminators.  Finally, the header that comes
   through with all the flag bits set can be encoded in just six bits,
   only one bit more than the most common packet format.  Without the
   extra packet format, this last header would have to be encoded using
   the longest packet format and would have taken up 14 bits.  Six bits
   instead of 14 represents a saving of almost 60% (8/14, about 57%)
   for this kind of packet.
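
   To make the decompressor's side of this concrete, the Python sketch
   below identifies the packet format from its variable length
   discriminator and slices out the remaining fields.  The field order
   lists are copied from the three formats above; the table layout and
   function names are assumptions of the sketch, and it does not
   attempt to reconstruct the uncompressed header.

```python
# Field order lists copied from the three formats: (name, width in bits),
# keyed by discriminator. Names and structure are for illustration only.
FORMATS = {
    "00": ("co_format_irregular",
           [("type", 2), ("flow_id", 4),
            ("sequence_no", 4), ("abc_flag_bits", 3)]),
    "01": ("co_format_flags_set", [("type", 2), ("sequence_no", 2)]),
    "1": ("co_format_flags_static", [("type", 2), ("sequence_no", 2)]),
}


def split_compressed(packet):
    """Identify the format from its discriminator and slice out the fields."""
    for disc, (name, layout) in FORMATS.items():
        # The discriminators are prefix-free, so at most one can match.
        if packet.startswith(disc):
            pos, fields = len(disc), {}
            for field, width in layout:
                fields[field] = packet[pos:pos + width]
                pos += width
            if pos != len(packet):
                raise ValueError("unexpected length for " + name)
            return name, fields
    raise ValueError("no discriminator matches")


def total_bits(disc):
    """Size of a whole compressed header in the format with this discriminator."""
    return len(disc) + sum(width for _, width in FORMATS[disc][1])


name, fields = split_compressed("10100")
# Saving for a "flags set" header: 14 bits without the extra format.
saving = (14 - total_bits("01")) / 14
```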

B.7  Default encoding

   There is some redundancy in the notation used so far.  For a number
   of fields, the same encoding method is used several times in
   different formats, but the field encoding is redefined explicitly
   each time.  If the encoding for any of these fields changed in the
   future (e.g., if the reserved flag took on some new role), then every
   packet format would have to be modified to reflect this change.

   This problem can be avoided by specifying default encoding methods
   for these fields.  Doing so also leads to a more concisely notated
   profile:

   eg_header  ===
   {
     uc_format     =   version_no,    %[ 2 ]
                       type,          %[ 2 ]
                       flow_id,       %[ 4 ]
                       sequence_no,   %[ 4 ]
                       abc_flag_bits, %[ 3 ]
                       reserved_flag; %[ 1 ]

     default_methods =
     {
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   static;
       sequence_no         ::=   lsb(2,-3);
       reserved_flag       ::=   uncompressed_value(1,0);
     };

     co_format_irregular   =   discriminator,     %[ 2 ]
                               type,              %[ 2 ]
                               flow_id,           %[ 4 ]
                               sequence_no,       %[ 4 ]
                               abc_flag_bits      %[ 3 ]
     {
       discriminator       ::=   '00';
       flow_id             ::=   irregular(4);  % overrides default
       sequence_no         ::=   irregular(4);  % overrides default
       abc_flag_bits       ::=   irregular(3);
     };

     co_format_flags_set   =   discriminator,     %[ 2 ]
                               type,              %[ 2 ]
                               sequence_no        %[ 2 ]
     {
       discriminator       ::=   '01';
       abc_flag_bits       ::=   uncompressed_value(3,7);
     };

     co_format_flags_static   =   discriminator,     %[ 1 ]
                                  type,              %[ 2 ]
                                  sequence_no        %[ 2 ]
     {
       discriminator       ::=   '1';
       abc_flag_bits       ::=   static;
     };
   };

   The above profile behaves in exactly the same way as the one notated
   previously, since it has the same meaning.  Note that the purposes
   behind the different formats become clearer with the default encoding
   methods factored out; all that remains are the encodings which are
   specific to each format.  Note also that default encoding methods
   which compress down to zero bits have become completely implicit.
   For example, the compressed formats mention "version_no" neither in
   their field order lists (no need, it's zero bits long) nor in their
   field encoding lists (no need, it's specified in the default
   encoding methods).
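
   One way to picture how the defaults combine with each format is as a
   table lookup in which a format's own bindings take precedence.  The
   Python sketch below models this with dictionaries, keeping the
   encoding methods as plain strings; the data structures and the
   function name are assumptions of the sketch, not part of ROHC-FN.

```python
# Sketch only: the encoding methods are kept as plain strings.
DEFAULT_METHODS = {
    "version_no": "uncompressed_value(2,1)",
    "type": "irregular(2)",
    "flow_id": "static",
    "sequence_no": "lsb(2,-3)",
    "reserved_flag": "uncompressed_value(1,0)",
}

FORMAT_METHODS = {
    "co_format_irregular": {
        "discriminator": "'00'",
        "flow_id": "irregular(4)",       # overrides default
        "sequence_no": "irregular(4)",   # overrides default
        "abc_flag_bits": "irregular(3)",
    },
    "co_format_flags_set": {
        "discriminator": "'01'",
        "abc_flag_bits": "uncompressed_value(3,7)",
    },
    "co_format_flags_static": {
        "discriminator": "'1'",
        "abc_flag_bits": "static",
    },
}


def effective_methods(format_name):
    """Defaults first, then the format's own bindings take precedence."""
    merged = dict(DEFAULT_METHODS)
    merged.update(FORMAT_METHODS[format_name])
    return merged


for field, method in sorted(effective_methods("co_format_flags_static").items()):
    print(field, "::=", method)
```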

B.8  Control Fields

   One inefficiency in the compression scheme we have produced thus far
   is that it uses two bits to provide the LSB-encoded sequence number
   with robustness against the loss of just one packet.  In theory,
   only one bit should be needed.  The root of the problem is the
   unusual sequence number that the protocol uses: it counts up in
   increments of three.  In order to encode it at maximum efficiency we
   need to translate this into a field that increments by one each
   time.  We do this using a control field.

   Control fields are extra data communicated in the compressed packet
   that are not direct encodings of fields in the uncompressed header.
   They carry additional information that allows other fields to be
   compressed more efficiently.

   The control field which we introduce scales the sequence number down
   by a factor of three.  Instead of encoding the original sequence
   number in the compressed packet, we encode the scaled sequence
   number, allowing us to have robustness to the loss of one packet by
   using just one bit of LSB encoding:

   eg_header  ===
   {
     uc_format     =   version_no,    %[ 2 ]
                       type,          %[ 2 ]
                       flow_id,       %[ 4 ]
                       sequence_no,   %[ 4 ]
                       abc_flag_bits, %[ 3 ]
                       reserved_flag; %[ 1 ]

     control_fields = scaled_seq_no;

     default_methods       =
     {
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   static;
       reserved_flag       ::=   uncompressed_value(1,0);

       % need modulo maths to calculate scaling correctly,
       % due to 4 bit wrap around
       let(scaled_seq_no:uncomp_value
             == ((mod(15 - sequence_no:uncomp_value, 3) * 16
                  + sequence_no:uncomp_value) / 3));
       scaled_seq_no       ::=   lsb(1,-1);
     };

     co_format_irregular   =   discriminator,     %[ 2 ]
                               type,              %[ 2 ]
                               flow_id,           %[ 4 ]
                               scaled_seq_no,     %[ 4 ]
                               abc_flag_bits      %[ 3 ]
     {
       discriminator       ::=   '00';
       flow_id             ::=   irregular(4);  % overrides default
       scaled_seq_no       ::=   irregular(4);  % overrides default
       abc_flag_bits       ::=   irregular(3);
     };

     co_format_flags_set   =   discriminator,     %[ 2 ]
                               type,              %[ 2 ]
                               scaled_seq_no      %[ 1 ]
     {
       discriminator       ::=   '01';
       abc_flag_bits       ::=   uncompressed_value(3,7);
     };

     co_format_flags_static   =   discriminator,     %[ 1 ]
                                  type,              %[ 2 ]
                                  scaled_seq_no      %[ 1 ]
     {
       discriminator       ::=   '1';
       abc_flag_bits       ::=   static;
     };
   };
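
   The effect of the scaling expression can be checked with a short
   Python sketch.  It transcribes the "let" expression using integer
   arithmetic and walks a four-bit sequence number that steps by three,
   as the example protocol does; the function names are hypothetical.

```python
def scaled_seq_no(sequence_no):
    # ((mod(15 - s, 3) * 16 + s) / 3); the numerator is always a multiple
    # of three, so the integer division is exact.
    return (((15 - sequence_no) % 3) * 16 + sequence_no) // 3


def walk(start, count):
    """Yield (sequence_no, scaled_seq_no) for successive headers."""
    seq = start
    for _ in range(count):
        yield seq, scaled_seq_no(seq)
        seq = (seq + 3) % 16   # four-bit counter stepping by three


pairs = list(walk(1, 17))
for seq, scaled in pairs:
    print(format(seq, "04b"), "->", format(scaled, "04b"))
```

   Each step of three in "sequence_no", including the steps that wrap
   around, becomes a step of one (modulo 16) in "scaled_seq_no".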

   Here is some example output:

     Uncompressed header: 0101000100010000
     Compressed header:   000100011011000


     Uncompressed header: 0101000101000000
     Compressed header:   1010 ; 000100011100000


     Uncompressed header: 0111000101110000
     Compressed header:   1111 ; 001100011101000


     Uncompressed header: 0111000110101110
     Compressed header:   01110 ; 001100011110111

   In its final form, the profile saves a further bit in most packets,
   reducing the size of the most common packet by 20% (from five bits
   to four).  Assuming the bulk of the flow is made up of
   "co_format_flags_static" headers, the mean size of the headers in the
   compressed flow is just over a quarter of their size in an
   uncompressed flow.
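
   The figures above can be reproduced with a small Python sketch of
   the two short formats.  It assumes that lsb(1,-1) accepts a scaled
   sequence number one or two above the reference value, omits the
   checks on "version_no" and "reserved_flag" for brevity, and uses
   hypothetical function names throughout.

```python
def scaled(sequence_no):
    """Scaled sequence number, per the let() expression in default_methods."""
    return (((15 - sequence_no) % 3) * 16 + sequence_no) // 3


def split(header):
    """Pick out the fields used below from a 16-bit header bit string."""
    return {"type": header[2:4], "flow_id": header[4:8],
            "seq": int(header[8:12], 2), "abc": header[12:15]}


def short_encoding(header, previous):
    """Encode with one of the two short formats, or return None."""
    cur, ref = split(header), split(previous)
    step = (scaled(cur["seq"]) - scaled(ref["seq"])) % 16
    # Assumed lsb(1,-1) window: one or two above the reference value.
    if cur["flow_id"] != ref["flow_id"] or step not in (1, 2):
        return None
    lsb = str(scaled(cur["seq"]) & 1)
    if cur["abc"] == ref["abc"]:
        return "1" + cur["type"] + lsb       # co_format_flags_static
    if cur["abc"] == "111":
        return "01" + cur["type"] + lsb      # co_format_flags_set
    return None


static = short_encoding("0101000101000000", "0101000100010000")
flags_set = short_encoding("0111000110101110", "0111000101110000")
reduction = 1 - len(static) / 5      # against the five-bit format of B.6
ratio = len(static) / 16             # against the uncompressed header
```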


Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Copyright Statement

   Copyright (C) The Internet Society (2005).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.

