Network Working Group                  Richard Price, Siemens/Roke Manor 
INTERNET-DRAFT                       Abigail Surtees, Siemens/Roke Manor 
Expires: April 2003                      Mark A West, Siemens/Roke Manor 
 
                                                        October 28, 2002 
 
 
               Protocol-Enabled BNF-Based LanguagE (PEBBLE) 
                 <draft-ietf-rohc-formal-notation-00.txt> 
                                     
    
Status of this memo 
 
   This document is an Internet-Draft and is in full conformance with 
   all provisions of Section 10 of RFC2026. 
    
   Internet-Drafts are working documents of the Internet Engineering 
   Task Force (IETF), its areas, and its working groups. Note that other 
   groups may also distribute working documents as Internet-Drafts. 
    
   Internet-Drafts are draft documents valid for a maximum of six months 
   and may be updated, replaced, or obsoleted by other documents at any 
   time. It is inappropriate to use Internet-Drafts as reference 
   material or cite them other than as "work in progress". 
    
   The list of current Internet-Drafts can be accessed at 
   http://www.ietf.org/ietf/lid-abstracts.txt 
    
   The list of Internet-Draft Shadow Directories can be accessed at 
   http://www.ietf.org/shadow.html 
    
   This document is a submission of the IETF ROHC WG. Comments should be 
   directed to its mailing list, rohc@ietf.org. 
    
    
Abstract 
    
   This draft defines PEBBLE: a formal notation for describing the 
   behavior of protocols or protocol stacks. The notation is designed to 
   provide information useful in "protocol-aware" compression 
   algorithms, for example the header compression profiles compatible 
   with the Robust Header Compression (ROHC) framework. 
    
    
Price et al.                                                    [Page 1]  

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
Table of contents 
   
   1.  Introduction..................................................2 
   2.  Terminology...................................................2 
   3.  Overview of PEBBLE............................................3 
   4.  Normative definition of PEBBLE................................7 
   5.  Improving readability.........................................15 
   6.  Pre-defined rules and integer operators.......................17 
   7.  Library of commonly used rules................................22 
   8.  Security considerations.......................................34 
   9.  Acknowledgements..............................................34 
   10. Authors' addresses............................................34 
   11. References....................................................35 
   Appendix A. ABNF description of PEBBLE............................35 
    
    
1.  Introduction 
    
   This document defines PEBBLE: a version of the well-known Backus-Naur 
   Form (BNF) notation which can be used to describe the behavior of 
   protocols or protocol stacks. It is intended to simplify the creation 
   of new compression profiles to fit within the ROHC [RFC-3095] 
   framework. 
    
   The notation in itself does not define the "bits on the wire" of the 
   compressed data. To achieve this an encoding algorithm must be chosen 
   that can map the uncompressed packets onto a set of compressed 
   packets and vice versa. The specification of such an encoder is 
   beyond the scope of this draft: each ROHC profile created using 
   PEBBLE can select its own based on the requirements of the profile.  
    
    
2.  Terminology 
    
   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 
   document are to be interpreted as described in RFC-2119 [RFC-2119]. 
    
   BNF (Backus Naur Form) 
    
     BNF is a notation used to describe the syntax of many well-known  
     protocols. 
    
   Control field 
    
     A control field is a field described by PEBBLE but which is not  
     part of the protocol header itself. For example, PEBBLE may define  
     a checksum field over the header to ensure robustness against bit  
     errors and dropped packets. 
    
    
Price et al.                                                    [Page 2] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   Encoding algorithm 
    
     An encoding algorithm defines a reversible mapping between a  
     set of uncompressed packets and a set of compressed packets.  
     Protocol-aware encoding algorithms can make use of the  
     information provided in the PEBBLE description of a protocol. 
    
   Field 
    
     The PEBBLE description of a protocol divides the protocol into a  
     set of contiguous bit patterns known as fields. 
    
   Label 
    
     Each field can be assigned a "label name" which allows it to be  
     referenced by different PEBBLE rules. 
    
   Library of rules 
    
     The library of rules contains a number of commonly used rules for  
     describing the behavior of a protocol. More rules can be added to  
     the library as and when they are needed. 
    
   Pre-defined rules 
    
     Four pre-defined rules are provided as standard by PEBBLE; these  
     can be combined to create new rules with more complex behavior. 
    
   Profile 
     
     A ROHC [RFC-3095] profile is a description of how to compress a  
     certain protocol stack over a certain type of link. Each profile  
     includes an encoding algorithm to compress the headers and a state  
     machine to control the actions of each endpoint. 
    
   Rule 
    
     Rules are the basis for the PEBBLE description of a protocol. Each  
     rule defines the behavior of one or more fields in the protocol  
     header. Rules can be taken from the set of pre-defined rules or can  
     be defined in terms of existing rules. 
    
    
3.  Overview of PEBBLE 
    
   This chapter gives an overview of the PEBBLE notation, including its 
   relation to existing versions of BNF such as "Augmented BNF" (ABNF) 
   [RFC-2234]. 
    
    
Price et al.                                                    [Page 3] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
3.1. What is BNF? 
    
   BNF (Backus Naur Form) is a notation used to describe the syntax of 
   many well-known protocols, for example SIP [RFC-3261] and RTSP [RFC-
   2960]. All versions of BNF are based around the concept of a "BNF 
   rule". Each BNF rule describes the syntax of a portion of the 
   protocol, and can be defined in terms of existing BNF rules which 
   makes it easy to build new descriptions out of existing ones (e.g. 
   when adding new layers to a protocol stack). The primary or "top-
   level" BNF rule describes the syntax of the entire protocol stack in 
   question. 
    
   When the BNF rule which describes the syntax of a certain protocol 
   stack is applied to a packet that may or may not contain a header 
   from the protocol stack, there are two possible outcomes: 
    
   1. The packet successfully matches the BNF rule, indicating that it  
      contains a valid header from the corresponding protocol stack. 
    
   2. The packet fails to match the BNF rule, indicating that it does  
      not contain a header from the corresponding protocol stack. 
    
   So each BNF rule can be considered to be an operator, whose input is 
   a packet and whose output is a Boolean value specifying whether the 
   packet contains a header from the chosen protocol stack. 
    
   In order to create new BNF rules from existing rules, it is necessary 
   to provide at least one pre-defined BNF rule as part of the notation. 
   Different variants of BNF offer different sets of pre-defined BNF 
   rules from which new rules can be constructed. For example, the 
   variant of BNF known as "Augmented BNF" [RFC-2234] includes the 
   following pre-defined rules: 
    
   rule_1 rule_2 ... rule_n               Concatenation of n BNF rules 
    
   rule_1 | rule_2 | ... | rule_n         Choice between n BNF rules 
    
   "%d" m-n                               Character with ASCII code  
                                          between m and n inclusive 
    
   [rule]                                 Optional BNF rule 
    
   x*y rule                               List of between x and y  
                                          occurrences of a BNF rule 
    
   Note that each of the pre-defined BNF rules have operands that can 
   affect the outcome of the rule. For example, "x*y rule" matches n 
   consecutive instances of an existing BNF rule where n lies between 
   the integers x and y inclusive. So if the BNF rule "foo" matches the 
   bit patterns "00" or "11" then the BNF rule 1*2(foo) will match the 
   bit patterns "00", "11", "0000", "0011", "1100" or "1111". 

 
Price et al.                                                    [Page 4] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   New BNF rules can be created by assigning a name for the rule, 
   followed by the symbol "::=" and an expression which evaluates to the 
   new rule and is written in terms of existing rules. For example: 
    
   bar     ::=   1*2(foo) 
    
   The following BNF description illustrates this process in more 
   detail. Using the pre-defined BNF rules from ABNF [RFC-2234], a new 
   BNF rule "list" is created which describes a list of arbitrary 
   positive or negative integers: 
    
   list    ::=   1*(number) 
    
   number  ::=   ["+" | "-"] 1*(digit) 
    
   digit   ::=   %d48-57 
    
3.2. What is PEBBLE? 
    
   PEBBLE is a version of BNF designed to be useful when creating 
   "protocol-aware" compression algorithms. Knowledge of the protocol to 
   be compressed typically improves the compression ratio and reduces 
   the processing and memory overhead compared to the generic "protocol-
   unaware" case. 
    
   Note that PEBBLE describes the behavior of the uncompressed headers 
   only, so descriptions can be written without requiring any knowledge 
   of the algorithm that will generate the actual compressed data. In 
   particular the notation is designed to be both human-readable and 
   machine-readable. If only one protocol stack needs to be compressed, 
   a suitable encoder can be designed by hand based on the PEBBLE 
   description of the header. Conversely, it is possible to implement a 
   protocol-independent encoder that can take an arbitrary PEBBLE 
   description and automatically generate a set of compressed headers 
   for the relevant protocol. 
    
   The PEBBLE notation is a version of BNF with four pre-defined rules: 
   FIELD, DISTRIBUTION, MAP and CONCATENATION (see Chapter 6 for a 
   detailed description of these rules). PEBBLE also includes the 
   following unique features not found in simpler versions of BNF: 
    
   1. As well as describing the set of valid headers for the chosen  
      protocol, PEBBLE also outputs some additional information that may  
      be useful to encoders. 
    
   2. New rules can be created with operands that modify the outcome of  
      the rule. 
    
   Each of these features is discussed in more detail in the following 
   sections. 
    

Price et al.                                                    [Page 5] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
3.3. Useful information for encoders 
    
   The rules defined by PEBBLE behave in the same manner as ordinary BNF 
   rules: the input is a packet (and optionally some additional operands 
   that affect the outcome of the rule), and the output is a Boolean 
   value specifying whether the packet contains a valid header from the 
   chosen protocol stack. 
    
   However the rules provided by PEBBLE also output some additional 
   information that may be useful to encoders. As an example, for each 
   header that matches a certain PEBBLE rule, the rule also outputs the 
   probability that the header will occur in a typical data flow. This 
   extra information allows an encoding algorithm to encode more common 
   headers using fewer bits than headers expected to occur rarely. 
    
   The following diagram shows the inputs and outputs for a PEBBLE rule: 
    
    +-----------------------+                 +-----------------------+ 
    |  packet to be tested  |   PEBBLE rule   | valid/invalid header? | 
    +-----------------------+ --------------> +-----------------------+ 
    | operands for the rule |                 |  useful information   | 
    +-----------------------+                 +-----------------------+ 
    
              Figure 1: Inputs and outputs for a PEBBLE rule 
    
   More generally, the aim of the PEBBLE notation is to provide 
   sufficient information to support a wide variety of encoders. 
   Particular encoders are of course at liberty to ignore this 
   information if they wish. If it is not clear whether or not a piece 
   of information will be useful, it is generally considered that it 
   should be included since it is easier to ignore unnecessary 
   information than to infer missing information. 
    
3.4. Creating new rules in PEBBLE 
    
   All versions of BNF allow new rules to be defined in terms of 
   existing rules. However, most variants of BNF only allow simple rules 
   with no operands to be defined in terms of existing rules; BNF rules 
   with one or more operands must be pre-defined by hand. 
    
   Since the PEBBLE notation provides a large library of rules for 
   describing new protocols, it is useful to extend the mechanism for 
   creating new rules so that it can handle rules with operands. As a 
   result the description of rules in the library is greatly simplified. 
    
   For example, suppose that a rule RANGE(#n,#v,#k) has already been 
   defined which describes a single n-bit field that can take values 
   from v to v+k-1 inclusive. 
    
   The RANGE rule can be used to create a new rule called VALUE with two 
   integer operands n and v: 

 
Price et al.                                                    [Page 6] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   VALUE(#n,#v)   ::=   RANGE(n,v,1) 
    
   The VALUE rule describes an n-bit field that always takes the value 
   v. In turn, this rule is useful for describing fields such as the IP 
   Version: 
    
   IPv4_Version   ::=   VALUE(4,4) 
    
   The mechanism for defining new rules in terms of existing rules forms 
   the basis of PEBBLE, and is discussed in detail in the following 
   chapter. 
    
    
4.  Normative definition of PEBBLE 
    
   This chapter defines the PEBBLE notation for describing the behavior 
   of protocols or protocol stacks. 
    
4.1. Inputs and outputs of a rule 
    
   As explained in Section 3.1, a rule can be considered to be an 
   operator which takes a packet as its input and returns a Boolean 
   value specifying whether or not the packet contains a valid header. 
   In PEBBLE each rule can also take one or more additional operands 
   that modify its behavior, and can return extra information that may 
   be of use to certain encoders. 
    
   As well as the operands declared explicitly when the rule is created, 
   there are three "implicit operands" common to all rules. Implicit 
   operands do not need to be declared explicitly because they are 
   always required by a rule. The first implicit operand is the packet 
   to be tested for a valid header; the other two implicit operands (the 
   context and the list of labels) are described in Section 4.4 and 
   Section 4.5 respectively. 
    
   As well as the Boolean value indicating whether the packet contains a 
   valid header or not, each rule provides three outputs that may be 
   useful to encoders. The first is the probability that the header will 
   occur in a typical flow, which is discussed in greater detail in 
   Section 4.3. The other two outputs (an updated context and an updated 
   list of labels) are described in Section 4.4 and Section 4.5 
   respectively. Note that if the packet does not contain a valid header 
   then the remaining three outputs are not provided. 
    
   In summary, a complete list of the inputs and outputs for each rule 
   is given below: 
    
   Inputs:   Packet to be tested for a valid header 
             Operands explicitly declared by the rule 
             Length and value for fields in the context 
             List of labels and their values 

 
Price et al.                                                    [Page 7] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   Outputs:  Boolean value stating if the packet contains a valid header 
             Probability value that header will occur in a typical flow 
             Updated length and value for fields in the context 
             Updated list of labels and their values 
    
4.2. Creating new PEBBLE rules 
    
   In PEBBLE the behavior of a protocol is defined by a single "primary" 
   rule, which may itself be defined in terms of simpler rules. For 
   example: 
    
   IPv6_Header   ::=   Version 
                       Traffic_Class 
                       ECT_Flag 
                       CE_Flag 
                       Flow_Label 
                       Payload_Length 
                       Next_Header 
                       Hop_Limit 
                       Source_Address 
                       Destination_Address 
    
   The general format for creating a new rule in terms of existing rules 
   is as follows: 
    
              new_rule whitespace "::=" whitespace expression 
    
   The term "new_rule" describes the name of the new rule together with 
   the names of any operands needed by the rule. The term "expression" 
   refers to an expression which evaluates to the desired behavior of 
   the new rule. The expression can include one or more existing rules 
   and integer operators, as well as any of the operands declared for 
   the new rule. 
    
   For example, using the RANGE rule discussed previously it is possible 
   to create a new IRREGULAR rule that describes a single n-bit field 
   capable of taking all values from 0 to 2^n-1 inclusive: 
    
   IRREGULAR(#n)   ::=   RANGE(n,0,2^n) 
    
   The syntax of each of these terms is described in detail in the 
   following sections. 
    
4.2.1. Syntax of rule names and operands 
    
   When creating a new rule in PEBBLE it is necessary to give a name for 
   the new rule, and to declare any operands that are required by the 
   rule. 
    

Price et al.                                                    [Page 8] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   The name of the new rule can be any combination of letters, digits or 
   the symbol "_", provided that it doesn't coincide with the name of an 
   existing rule or integer operator. Names are case-sensitive. 
    
   The name of the new rule is followed by a list of the operands 
   expected by the rule (enclosed in parentheses and separated by 
   commas). A name is assigned to each operand so that it can be used in 
   the expression which defines the new rule (see Section 4.2.2 for more 
   on this). Operand names are also a combination of letters, digits and 
   the symbol "_". Operand names within the definition of a rule should 
   be unique, and should not overlap with the names of any existing 
   rules or integer operators. However, the operand names from one rule 
   can be reused when defining a different rule. 
    
   Three types of operand are available: rules, text names and (signed) 
   integers. Text name operands are preceded by a "$" whilst integer 
   operands are preceded by a "#". If no symbol precedes the operand 
   then it is assumed to be a rule operand. For example: 
    
   RANGE(#n,#v,#k) 
   IRREGULAR(#n) 
   INFERRED_OFFSET(#n,$base) 
   LOCAL($label,rule) 
    
   It is also possible to declare a list of operands by appending a pair 
   of square braces to the operand name. If the square braces enclose 
   the name of a previously declared integer operand k then the list of 
   operands must contain entries from 0 to k-1 inclusive. 
    
   For example, RULE(#n,#list[n]) ::= ... creates a new rule which 
   requires a single integer operand n followed by n integer operands 
   list[0] to list[n-1]. 
    
4.2.2. Syntax of expressions 
    
   The behavior of a new rule is defined by giving an expression in 
   terms of one or more existing rules and integer operators. For 
   example, using the IRREGULAR rule defined above it is possible to 
   create the more complex UNCOMPRESSIBLE rule as follows: 
    
   UNCOMPRESSIBLE($label,#d,#m,#p)   ::=   IRREGULAR(floor(label,d)*m+p) 
    
   If the overall expression includes one or more existing rules then 
   values must be given for any operands needed by the rules. These 
   values can in turn be defined using expressions. For example, when 
   defining the UNCOMPRESSIBLE rule in terms of the IRREGULAR rule the 
   expression "floor(label,d)*m+p" gives the value of the operand n 
   needed by the IRREGULAR rule. 
    
   Expressions can include any of the operands that have been declared 
   for the new rule. If the operand is a list then it must be followed 

 
Price et al.                                                    [Page 9] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   by an integer expression (enclosed in square braces) that specifies 
   which value from the list is to be used. 
    
   For example, the following BRANCH rule applies one of n existing 
   rules depending on the value contained in the operand "indicator": 
    
   BRANCH(#indicator,#n,rule[n])   ::=   rule[indicator] 
    
   The set of integer operators available to PEBBLE is described in 
   Section 6.1, and the set of four pre-defined rules is described in 
   Section 6.2. 
    
   For simplicity, PEBBLE does not currently provide any operators which 
   generate new text names from existing ones. The only way to specify a 
   text name is to write the name explicitly or to use a text name 
   operand. 
    
4.2.3. Generating lists of expressions 
    
   In some cases it is useful to specify a list of one or more values in 
   a single step. This can be accomplished by specifying an expression 
   in terms of one or more indices, which are then declared together 
   with their range. The list of values is generated by allowing each of 
   the indices to take every value in the specified range. The 
   expression for generating a list of values is enclosed in square 
   braces, and the indices are declared using the keywords "for" and 
   "in" as illustrated below: 
    
   [2*j for j in 0..5] 
   [rule[j] for j in 0..4] 
   [k-j for j in 0..3, k in 0..3] 
    
   The lower bound for the range is inclusive whereas the upper bound is 
   exclusive, so the expression "for j in 0..5" declares that j takes 
   the values 0, 1, 2, 3 and 4.  
    
   Suppose that the n indices are j[0],...,j[n-1]. Note that the range 
   of the ith index j[i] can be defined in terms of any of the indices 
   j[0] to j[i-1]. For example: 
    
   [k for j in 0..4, k in 0..j] 
    
   The expression to be evaluated can include any of the n indices, so 
   it can be considered to be a function of j[0],...,j[n-1]. Denote the 
   output of this function by list_item[j[0],...,j[n-1]]. 
    
   The function is then evaluated by replacing each index j[i] with all 
   of the integers within the range of the index. The final output of 
   the list expression is equivalent to writing a list of these values 
   separated by commas. 
    

Price et al.                                                   [Page 10] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   To define the order in which each item appears in the list, it is 
   specified that list_item[j[0],...,j[n-1]] precedes 
   list_item[k[0],...,k[n-1]] whenever there is an integer i such that 
   j[0] = k[0], j[1] = k[1] up to j[i-1] = k[i-1], and j[i] < k[i]. 
    
   If any of the indices is given a range containing 0 values then the 
   resulting list must be empty. 
    
   Some example lists of expressions are given below, using both the 
   square brace form and an expanded form that evaluates to give the 
   same result: 
    
   Square brace form:                   Equivalent expanded form: 
    
   [2*j for j in 0..5]                  0,2,4,6,8 
   [rule[j] for j in 0..4]              rule[0],rule[1],rule[2],rule[3] 
   [k-j for j in 0..3, k in 0..3]       0,1,2,-1,0,1,-2,-1,0 
   [k for j in 0..4, k in 0..j]         0,0,1,0,1,2 
    
4.3. Probability distribution and robustness 
    
   One of the pieces of information provided by PEBBLE is a probability 
   distribution for the headers in the chosen protocol. For a given 
   protocol header, the PEBBLE rule describing the protocol outputs the 
   probability that the header will occur in a typical flow. This 
   information can be useful to encoders, as more common headers can be 
   encoded using fewer bits than headers expected to occur rarely. 
    
   Note that the choice of how to partition a set of headers into flows 
   must be made when writing the description of the protocol, and the 
   resulting description will reflect this choice. For example, if flow 
   classification is expected to be performed using IP addresses and 
   ports then these fields will take a fixed value with probability 100% 
   in the PEBBLE description of the protocol. 
    
   The probability distribution should take into account not only the 
   behavior of the headers themselves but also the characteristics of 
   the link over which they are transmitted. For example, if the link 
   can suffer from dropped or misordered packets then the behavior of 
   certain header fields may change as a result. In particular a 
   sequence number field that increases monotonically over a reliable 
   link may occasionally jump forwards in the presence of packet drops. 
    
   In general a wide range of possible error conditions may be 
   introduced by the link: dropped, misordered or duplicated packets, 
   isolated bit errors, fades etc. From the perspective of PEBBLE 
   however, all error conditions can be classed as either "expected 
   errors" or "unexpected errors". 
    
   An expected error occurs regularly in the header flow, and hence is 
   taken into account in the PEBBLE description of the protocol. As a 

 
Price et al.                                                   [Page 11] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   result, an encoding algorithm based on the protocol description will 
   be robust against the error (i.e. headers subject to the error can 
   still be successfully decoded). ROHC profiles typically class packet 
   drops or misordering up to a certain limit as expected errors. 
    
   An unexpected error only occurs for a negligible proportion of 
   headers in the flow, and so does not need to be taken into account 
   when describing the behavior of the protocol. As a result, an 
   encoding algorithm based on the protocol description may fail to 
   decode any headers subject to an unexpected error. ROHC profiles 
   typically class bit errors and excessive packet drops or misordering 
   as unexpected errors. 
    
4.4. Context values 
    
   The headers generated by a typical protocol stack will change over 
   time, so the behavior of one header often depends on the behavior of 
   previous headers output by the stack. For example, the RTP Sequence 
   Number field increases by 1 for each consecutive header in an RTP 
   stream. 
    
   PEBBLE takes into account the dependency between successive headers 
   by storing a "context" of field values from each header. These values 
   can then be used when describing the next header in the flow. The 
   context is accessed using the integer operators context_length(#n) 
   and context_value(#n), which return the length and value of the nth 
   field in the context. 
    
   Rules can also update the context by defining updated_length(#n) and 
   updated_value(#n), which specify the updated length and value of the 
   nth field in the context. These updated lists of values are used as 
   the context for the next header in the flow. Updated values for the 
   context are defined as part of the FIELD rule (see Chapter 6 for 
   further details). 
    
   Note that as discussed in Section 4.3, when a header flow contains an 
   unexpected error then it is acceptable for an encoder to fail to 
   reconstruct the relevant header. However, it is not acceptable for 
   the decoding of subsequent valid headers to be adversely affected. In 
   particular, it is necessary to identify the headers that fail to 
   successfully decompress due to an unexpected error so that corruption 
   of the context can be prevented. 
    
   As part of the description of a header, PEBBLE can define the length 
   of a checksum that should be provided over the original uncompressed 
   header in order to detect the occurrence of unexpected errors and 
   prevent them from mistakenly being used to update the context. This 
   situation is illustrated in Figure 2: 
    
    
Price et al.                                                   [Page 12] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
                                      Checksum failure 
    +--------------+ +--------------+ ================ +--------------+ 
   -| Valid header |-| Valid header |-|Invalid header|-| Valid header |- 
    +--------------+ +--------------+ ================ +--------------+ 
          |  |             |  |                              |  | 
   >------+  +------>------+  +--------------->--------------+  +------> 
                 context                   context 
    
         Figure 2: Preventing accidental corruption of the context 
    
   The process for creating the checksum is described in greater detail 
   in Section 4.6.  
    
4.5. Labels 
    
   When describing the behavior of a protocol stack it is often the case 
   that two fields in separate parts of the header have related 
   behavior. For example, the Protocol field in an IP header is related 
   to the type of header that follows. 
    
   A standard version of BNF can capture this dependency by creating a 
   new BNF rule to describe each possible case. As an example, suppose 
   that a 1-bit flag field is followed by some intermediate fields that 
   behave independently from the flag, and then by some fields whose 
   behavior changes depending on whether the flag field takes value 0 or 
   value 1: 
    
   header   ::=   case_0 | case_1 
    
   case_0   ::=   flag_is_0 
                  intermediate_fields 
                  dependent_behavior_0 
    
   case_1   ::=   flag_is_1 
                  intermediate_fields 
                  dependent_behavior_1 
    
   The drawback of this technique is that a new BNF rule must be created 
   to capture each dependent behavior, which can result in extremely 
   complex BNF descriptions when the number of dependencies is high.
    
   An alternative technique for capturing the above field dependency 
   would be to label the flag field with a unique name, and then 
   reference this name when defining the behavior of the dependent 
   fields. For example: 
    
   header   ::=   flag_field              ; store field value as "flag" 
                  intermediate_fields 
                  dependent_fields        ; define as function of "flag" 
    

Price et al.                                                   [Page 13] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   PEBBLE allows field values to be stored for later reference by means 
   of a label. A label name can be any combination of letters, digits or 
   the symbol "_", provided that it doesn't coincide with the name of a 
   rule, integer operator, or any of the operand names used when 
   defining new rules. 
    
   Each label can contain an integer value or can be undefined. When a 
   rule is invoked it is passed a list of the labels which are currently 
   defined together with their integer values. This list is treated as 
   an implicit operand for the rule: it can alter the behavior of the 
   rule, but since the list of labels is always supplied it doesn't need 
   to be included when the operands for the rule are declared. 
    
   Each rule also returns a list of labels together with their integer 
   values. The input and output lists may be different (in particular 
   the output list may include integer values for labels that were 
   undefined in the input list, or vice versa). 
    
   Label names can be used in integer expressions, in which case they 
   are replaced by the input label value for the rule using the integer 
   expression as one of its operands. 
    
4.6. Control fields 
    
   A control field is a field whose value is created by an encoder and 
   not taken from the uncompressed protocol header itself. New control 
   fields are defined in PEBBLE by means of three well-known labels: 
   choice, msn and checksum. If these labels occur in an integer 
   expression but their values are undefined, the encoder chooses a 
   value for the label as follows: 
    
   The "choice" label is assigned a value chosen by the particular 
   implementation of an encoder. Different implementations may choose 
   different values for the control field, allowing a certain amount of 
   implementation flexibility when programming an encoder that makes use 
   of the PEBBLE notation. For example, an implementation that wishes to 
   obtain the highest possible compression efficiency may choose to try 
   several different values for the control field (seeking to maximize 
   the overall compression ratio for the header flow), whereas an 
   implementation that wishes to reduce its processing requirements may 
   use a simpler mechanism for choosing the value of the control field. 
    
   The "msn" label is assigned the Master Sequence Number (MSN). The MSN 
   is a counter which returns the position of the header in the flow: if 
   the kth header in the flow is currently being described then the msn 
   label is assigned the value k-1. 
    
   The "checksum" label is assigned a checksum over the header to 
   prevent unexpected errors from corrupting the context. A sufficiently 
   long checksum should be provided to ensure the probability that an 
   unexpected error will not be spotted is negligible. The algorithm for 

 
Price et al.                                                   [Page 14] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   calculating the checksum is chosen by the encoder and hence is beyond 
   the scope of PEBBLE. 
    
   Note that once the three labels have been assigned values they behave 
   in exactly the same manner as other labels. In particular the encoder 
   cannot change the value for the label until a rule sets the label to 
   be undefined. 
    
    
5.  Improving readability 
    
   The following chapter defines some extensions to the PEBBLE notation 
   that are designed to offer more convenient alternatives for existing 
   constructs. Whilst the extensions do not affect the expressive power 
   of PEBBLE, they are considered useful for improving its overall 
   readability. 
    
5.1. Shorthand form of operators 
    
   The standard form of the rules and integer operators can be 
   cumbersome (particularly for simple operators), so it is possible to 
   define operators using a special shorthand form. 
    
   In the shorthand form of an operator the operands are given without 
   parentheses or commas, but are instead separated by a well-known 
   symbol such as "+", "-", "<", "|", <whitespace> etc. For example: 
    
       CONCATENATION(foo,bar) can be rewritten as foo whitespace bar 
    
   Using operators in shorthand form can lead to ambiguity because the 
   operands are not necessarily enclosed in parentheses. For example the 
   expression a+b*c could be interpreted as a+(b*c) or as (a+b)*c. To 
   resolve this ambiguity it is necessary to define a precedence order 
   for operators in shorthand form. The operator with the highest 
   precedence takes the shortest possible expressions as its operands. 
   The operator with the second-highest precedence takes the shortest 
   possible expressions given that the first criterion holds, and so on. 
   Since the operator "*" has a higher precedence than "+", the 
   expression a+b*c must be interpreted as a+(b*c). 
    
   Given two operators with the same precedence, the leftmost operator 
   takes the shorter expressions as its operands. So 2^3^4 is 
   interpreted as (2^3)^4. 
    
   The operators that can be expressed in shorthand form are defined in 
   Chapter 6 together with their precedence. 
    
5.2. Lists of undefined size 
    
   For notational convenience it is possible to specify a list of 
   operands by appending an empty pair of square braces to the operand 

 
Price et al.                                                   [Page 15] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   name. In this case the number of entries in the list is not 
   specified. Instead, when the rule is encountered in an expression the 
   number of entries in the list is derived by counting the total number 
   of operands that follow the rule. For example, the operands for the 
   CONCATENATION rule are declared as follows: 
    
   CONCATENATION(rule[]) 
    
   If the term CONCATENATION(foo,bar) is encountered in an expression 
   then the list rule[] must have two entries, with rule[0] = foo and 
   rule[1] = bar. 
    
   If more than one list of operands is created with an unspecified size 
   then the size of each list must be equal. For example: 
    
   INFERRED_TRANSLATE(#n,#a[],#b[]) 
    
   If the term INFERRED_TRANSLATE(1,2,3,4,5) is encountered in an 
   expression then each of the lists a[] and b[] must contain two items, 
   where a[0] = 2, a[1] = 3, b[0] = 4 and b[1] = 5. If 
   INFERRED_TRANSLATE has an even number of operands then the rule must 
   fail. 
    
   If a new rule declares a list of undefined size then the well-known 
   operand "end" is provided to access the length of the list in 
   expressions. For example: 
    
   PICK(rule[],#p[])   ::=   CHOICE(field_value) 
                             LOCAL(field_value,rule[field_value]) 
                             DISTRIBUTION([p[j] for j in 0..end]) 
    
   In order to avoid forward referencing, all lists of operands with 
   undefined size must be declared after the single operands and the 
   lists of operands with known size. So defining a rule 
   INFERRED_TRANSLATE(#a[],#b[],#n) would be incorrect. 
    
5.3. Local definitions 
    
   To improve the readability of the notation, the definition of a new 
   rule can be given in terms of one or more "local definitions". For 
   example: 
    
   UNCOMPRESSIBLE($label,#d,#m,#p)   ::=   IRREGULAR(field_length) 
    
   where field_length is (floor(label,d)*m+p) 
    
   The keyword "where" indicates that a list of local definitions 
   follows the description of a new rule. Each local definition is given 
   a name, which can be any combination of letters, digits or the symbol 
   "_" provided that it doesn't coincide with the name of a rule, 
   integer operator, label, or any of the operands or local definitions 

 
Price et al.                                                   [Page 16] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   already specified for the rule. Note however that the names of local 
   definitions from one rule can be reused when making local definitions 
   for a different rule. 
    
   The name of each local definition is followed by the keyword "is" and 
   then by an arbitrary expression. The expression can include any of 
   the operands for the new rule and any labels defined for the rule. 
   Note that if the expression contains labels then its value may change 
   depending on where it occurs in the description of the new rule. In 
   particular, if a local definition appears several times when creating 
   a new rule then its value should be recalculated each time it occurs. 
    
    
6.  Pre-defined rules and integer operators 
    
   The following chapter defines a number of integer operators and rules 
   that can be used when creating new PEBBLE rules as per Chapter 4. 
    
6.1. Integer operators 
    
   This section provides a set of integer operators for use when 
   creating new rules. The following operators are currently available: 
    
   Integers              Integers can be expressed as decimal values,  
                         binary values prefixed by 0b, or hex values  
                         prefixed by 0x. Negative integers are prefixed  
                         by a "-" sign. 
    
   Label names           Label names can be used in integer expressions,  
                         where they are replaced by the input label 
                         value for the rule which called the integer 
                         expression. 
    
   NULL                  The keyword "NULL" can be used to specify an 
                         undefined integer value. This is useful for 
                         deleting the value contained within a label. 
    
   Parentheses           Parentheses can be used to enclose expressions,  
                         which can be useful in conjunction with 
                         operators in shorthand form. 
    
   Arithmetic            The operators +, -, *, / and ^ can be used.  
                         Note that k/v is undefined if it does not  
                         evaluate to an integer. 
    
   Percentages           A value can be expressed as a percentage with  
                         up to 2 decimal places (i.e. abc.xy%). This is 
                         interpreted as an integer from 0 to 10000 
                         inclusive (v% is interpreted as the integer 
                         v*100, so in particular 0% is interpreted as 0 
                         and 100% is interpreted as 10000). 

 
Price et al.                                                   [Page 17] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   Inequalities          Inequalities (<, =<, ==, !=, >=, >) return 1 if  
                         they are true and 0 if they are false. 
    
   floor(#k,#v)          Returns the largest integer not larger than k/v 
                         (undefined when v = 0). 
    
   mod(#k,#v)            Returns k-v*floor(k,v). 
    
   log2(#v)              Returns the smallest integer k where v =< 2^k. 
    
   min(#k[])             Returns j such that k[j] is minimal (taking the 
                         smallest value of j in the event of a tie). 
    
   checksum16(#n)        Returns the integer value of an IP checksum  
                         over the last n bits in the packet. 
    
   switch(#j,#k[])       Returns k[j]. 
    
   byte_swap(#n,#k)      Returns the byte-swapped value of k when  
                         treated as an n-bit field (undefined if n is 
                         not a multiple of 8). 
    
   permutation(#n,#k,#x) Checks if the integer x is a concatenation of  
                         the integers 0 to n-1 (each interpreted as a  
                         bit pattern of length k) in any order. Returns  
                         1 if this is the case and 0 otherwise. 
    
   context_length(#n)    Returns the length of the nth field in the  
                         context. 
                          
   context_value(#n)     Returns the value of the nth field in the  
                         context. 
    
   Note that if any of the operands are undefined, the value returned by 
   the operator is also undefined. 
    
   The precedence of each operator in shorthand form is given below 
   (higher precedence first): 
    
   Integers, percentages 
   ^ 
   *, / 
   +, - 
   <, =<, ==, !=, >=, > 
    
6.2. Pre-defined rules 
    
   The following sections describe the four pre-defined rules provided 
   by PEBBLE. 
    
    
Price et al.                                                   [Page 18] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
6.2.1. FIELD 
    
   The FIELD rule indicates the existence of a new field to be described 
   by the PEBBLE notation. Note that the actual behaviour of the field 
   is defined separately (for example by using the DISTRIBUTION rule). 
    
   The syntax of the FIELD rule is given below: 
    
   FIELD(#n) 
    
   The operand n specifies the number of bits in the field. The FIELD 
   rule uses the well-known label "remaining_data" to determine the 
   position of the field in the packet. The remaining_data label 
   specifies the start of the field as an bit offset relative to the end 
   of the packet (so if remaining_data is 16 then the rule FIELD(16) 
   refers to the last 16 bits in the packet), as illustrated in Figure 
   3. The integer value of the field can then be determined, taking MSBs 
   to precede LSBs. If n or remaining_data are undefined, or if 
   remaining_data < n, then the FIELD rule must fail. 
    
                         <-------------- remaining_data ---------------> 
   +---------------------+-------------+-------------------------------+ 
   |                     | n-bit field |                               | 
   +---------------------+-------------+-------------------------------+ 
    
   <----------------------- header and payload ------------------------> 
    
      Figure 3: The location of the n-bit field described by FIELD(n)  
    
   If n = 0 then the integer value of the field is always taken to be 0.  
    
   If the FIELD rule successfully locates a field, in the list of labels 
   output by the FIELD rule the value of remaining_data is set to n less 
   than its input value. Also, if the well-known label "field_value" is 
   undefined in the input labels then for the output labels it is set 
   equal to the integer value of the field. If field_value already has 
   an integer value however, the value is left unchanged in the output 
   list of labels and the value of remaining_data is not modified. 
    
   The FIELD rule may also provide an updated value for one context 
   entry: the location of the context entry to be updated is given by 
   the well-known label "index". The FIELD rule first checks the well-
   known label "no_update" to determine whether an updated context value 
   must be provided. If no_update = 1 then the values for 
   updated_length(index) and updated_value(index) are set equal to 
   context_length(index) and context_value(index) respectively. 
   Otherwise they are set equal to the length and value of the field. 
   The output value for index is set to 1 higher than its input value. 
    
   All other labels are left unchanged by the FIELD rule, and the 
   probability value output by the FIELD rule is always 100%. 

 
Price et al.                                                   [Page 19] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
6.2.2. DISTRIBUTION 
    
   The DISTRIBUTION rule gives the probability distribution of the well-
   known label "field_value". The syntax of the DISTRIBUTION rule is 
   given below: 
    
   DISTRIBUTION(#p[]) 
    
   The rule has a list of k operands p[0],...,p[k-1] specifying the 
   probability distribution of the label. If probability[i] is the 
   chance that the label field_value will take value i then: 
    
   probability[i]   =           p[i]             for i from 0 to k-1 
                        -------------------- 
                        p[0]+p[1]+...+p[k-1] 
    
   The probability that the label will take a value outside the range 0 
   to k-1 is zero. 
    
   The DISTRIBUTION rule is successful provided that all of the k 
   operands are positive integers and that the input value for the well-
   known label field_value lies between 0 and k-1 inclusive. If this is 
   the case then the value of the label field_value and the probability 
   distribution can be passed to an encoder that requires this 
   information. The output value for the label field_value is undefined; 
   all other labels are left unchanged by the rule. 
    
   The DISTRIBUTION rule does not make use of the context and does not 
   define any updated context entries. 
    
6.2.3. MAP 
    
   The MAP rule updates the value contained in a label. The syntax of 
   the rule is given below: 
    
   $label = #old_value ? #new_value 
    
   Note that the syntax is in shorthand form, using the separators "=" 
   and "?". 
    
   The MAP rule checks whether the input value for the specified label 
   is equal to old_value. If this is the case then the output value for 
   the label is set equal to new_value. If it is not the case then the 
   MAP rule fails. All other labels are left unchanged by the rule. 
    
   The expressions for old_value and new_value can contain any label 
   names except for the label that is being updated by the MAP rule. 
    
   The MAP rule does not define any updated context entries, and the 
   probability value output by the MAP rule is always 100%. 
    

Price et al.                                                   [Page 20] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
6.2.4. CONCATENATION 
    
   The CONCATENATION rule matches a header which is the concatenation of 
   the headers matched by n existing rules. The syntax of the 
   CONCATENATION rule is given below: 
    
   CONCATENATION(rule[]) 
    
   The CONCATENATION rule also has the following shorthand form: 
    
   rule[0] whitespace ... whitespace rule[n-1] 
    
   The shorthand version of the CONCATENATION rule has the lowest 
   possible precedence (lower than any integer operators). 
    
   The operands for the CONCATENATION rule consist of n existing rules 
   denoted rule[0],...,rule[n-1]. The CONCATENATION rule applies each of 
   the existing rules to the input packet, and is considered to match 
   only if all of the n existing rules match. 
    
   When the CONCATENATION rule is invoked it is passed an input list of 
   labels and their corresponding integer values. This list is passed 
   unmodified to rule[0], which returns a (possibly updated) list of 
   labels to the CONCATENATION rule. The output list from rule[i-1] is 
   then used as the input list for rule[i] for all i from 1 to n-1 
   inclusive. Finally the output list from rule[n-1] is used unmodified 
   as the output list for the CONCATENATION rule. This situation is 
   illustrated in Figure 4: 
    
                   |                                 ^ 
            Labels |                                 | Labels 
                   v                                 | 
                 +-------------------------------------+ 
                 | CONCATENATION(rule_1,rule_2,rule_3) | 
                 +-------------------------------------+ 
                   | +-------------+ +-------------+ ^ 
            Labels | |             | |             | | Labels 
                   v |             v |             v | 
               +--------+      +--------+      +--------+ 
               | rule_1 |      | rule_2 |      | rule_3 | 
               +--------+      +--------+      +--------+ 
    
                  Figure 4: Passing labels between rules 
    
   Note that this avoids the need for an implementation to store more 
   than one copy of each label. 
    
   Any updated context entries defined by the n existing rules are 
   passed unmodified as the output of the CONCATENATION rule (unless two 
   existing rules define contradictory values for the same context 
   entry, in which case the CONCATENATION rule fails). Note that the 

 
Price et al.                                                   [Page 21] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   updated entries for the context do not overwrite the original context 
   entries while the current header is being parsed: instead they are 
   buffered and only used to replace the original context entries for 
   the next header in the flow. 
    
   The probability value output by the CONCATENATION rule is the product 
   of the probabilities output by each of the n existing rules (i.e. the 
   probabilities are assumed to be independent). 
    
    
7.  Library of commonly used rules 
    
   The ROHC [RFC-3095] standard contains a number of different 
   techniques for compressing header fields (LSB encoding, scaled 
   timestamp encoding, list-based compression etc.). Each of these 
   techniques can be added to a library of rules describing several 
   common field behaviors. The rules contained in the library can be 
   concatenated to give a description of a complete protocol stack. For 
   example, a rule describing the behavior of an IPv6 header can be 
   constructed from the rules available in the library as shown below: 
    
   IPv6_Header           ::=       Version 
                                   Traffic_Class 
                                   ECT_Flag 
                                   CE_Flag 
                                   Flow_Label 
                                   Payload_Length 
                                   Next_Header 
                                   Hop_Limit 
                                   Source_Address 
                                   Destination_Address 
    
   Version               ::=       VALUE(4,6) 
    
   Traffic_Class         ::=       STATIC 
    
   ECT_Flag              ::=       STATIC 
    
   CE_Flag               ::=       VALUE(1,0) 99% | VALUE(1,1) 1% 
    
   Flow_Label            ::=       STATIC 
    
   Payload_Length        ::=       INFERRED_SIZE(16,288) 
    
   Next_Header           ::=       LABEL(8,next_header) 
    
   Hop_Limit             ::=       STATIC 
    
   Source_Address        ::=       STATIC 
    
   Destination_Address   ::=       STATIC 

 
Price et al.                                                   [Page 22] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   The rules in the library are specified in terms of the four pre-
   defined rules, so new rules can be added to the library as and when 
   they are needed. 
    
7.1. Numeric rules 
 
   This section defines the simplest set of rules that are typically 
   used to handle numeric fields. 
    
7.1.1. RANGE 
    
   The RANGE rule describes an n-bit field that takes a known range of 
   values. The definition of the rule is given below: 
    
   RANGE(#n,#v,#k)   ::=   FIELD(n) 
                           tmp = NULL ? field_value 
                           field_value = tmp ? mod(tmp,k) 
                           tmp = mod(mod(field_value-v,k)+v,2^n) ? NULL 
                           DISTRIBUTION([1 for j in 0..k]) 
    
   The RANGE rule matches whenever the next n bits in the header (i.e. 
   the n bits following the well-known label "remaining_data") lie 
   between v and v+k-1 inclusive. The RANGE rule also assigns a 
   probability distribution to the field, specifying that each of the k 
   possible field values occurs with equal probability. 
    
   The remaining rules in this section are all defined in terms of the 
   RANGE rule. 
    
7.1.2. VALUE 
    
   The VALUE rule describes an n-bit field that takes a known value v. 
   The definition of the rule is given below: 
    
   VALUE(#n,#v)   ::=   RANGE(n,v,1) 
    
    
7.1.3. IRREGULAR 
    
   The IRREGULAR rule describes an n-bit field that behaves randomly 
   (i.e. each of the 2^n possible bit patterns occurs with equal 
   probability). The definition of the rule is given below: 
    
   IRREGULAR(#n)   ::=   RANGE(n,0,2^n) 
    
7.1.4. IRREGULAR_PADDED 
    
   The IRREGULAR_PADDED rule describes an n-bit field with k least 
   significant bits that behave randomly and with n-k most significant 
   bits that are always zero. The definition of the rule is given below: 
    

Price et al.                                                   [Page 23] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   IRREGULAR_PADDED(#n,#k)   ::=   RANGE(n,0,2^k) 
    
7.1.5. STATIC 
    
   The STATIC rule describes a field with the same length and value as 
   the corresponding field stored in the context. The definition of the 
   rule is given below: 
    
   STATIC   ::=   RANGE(context_length(index),context_value(index),1) 
    
   Note that the rule is defined in terms of the integer operators 
   context_length(#n) and context_value(#n), which respectively return 
   the length and value of the nth field stored in the context. 
    
7.1.6. LSB 
    
   The LSB rule describes a field whose value differs by a small amount 
   from the field value stored in the context. The definition of the 
   rule is given below: 
    
   LSB(#k,#p) ::= RANGE(context_length(index),context_value(index)- 
                   p,2^k) 
    
   The LSB rule can be applied when the field value lies between 
   (context_value-p) and (context_value-p+2^k-1) inclusive. In 
   particular, if p = 0 then the field value can only stay the same or 
   increase relative to the previous header in the flow. If p = -1 then 
   it can only increase, whereas if p = 2^k then it can only decrease. 
    
7.2. Labeling rules 
    
   This section lists some general rules that can be used to define new 
   labels in terms of existing labels. 
    
7.2.1. LABEL 
    
   The LABEL rule assigns a label name to an n-bit field. The definition 
   of the rule is given below: 
    
   LABEL(#n,$label)   ::=   FIELD(n) 
                            label = NULL ? field_value 
                            field_value = label ? NULL 
    
   The LABEL rule can be used in conjunction with the NEXT_FIELD rule 
   specified below to change the order in which fields are defined in a 
   PEBBLE description. This is particularly important for capturing 
   dependencies between fields that occur in different parts of the 
   header. 
    
    
Price et al.                                                   [Page 24] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
7.2.2. NEXT_FIELD 
    
   The NEXT_FIELD rule indicates that the next field to be defined is 
   not the next field in the header, but instead is a value contained in 
   a label. The definition of the rule is given below: 
    
   NEXT_FIELD($label)   ::=   field_value = NULL ? label 
                              label = field_value ? NULL 
    
   The NEXT_FIELD rule copies the value contained in the named label to 
   the well-known label field_value. When a FIELD rule is next 
   encountered it will use this value rather than taking a field value 
   from the header. 
    
7.2.3. LOCAL 
    
   The LOCAL rule temporarily deletes the value of a label. The rule 
   takes as its operands the name of a label and an existing rule: the 
   named label is set to "undefined" and then the existing rule is 
   invoked. The original label value is restored in the output of the 
   LOCAL rule. 
    
   The definition of the rule is given below: 
    
   LOCAL($label,rule)   ::=   CALL(label,label,rule) 
    
   Note that the definition of the LOCAL rule is given in terms of 
   another rule CALL which is defined as follows: 
    
   CALL(#k,$label,rule)   ::=   label = k ? NULL 
                                rule 
                                label = NULL ? k 
    
   The LOCAL rule can be useful when defining a new rule in terms of an 
   existing rule, both of which use the same name for a label. It 
   prevents the case whereby two values are assigned to the same label 
   at the same time (which results in the rule failing to match any 
   header). 
    
7.3. INFERRED rules 
    
   The INFERRED rules describe fields whose values can be inferred from 
   other fields in the header. The following INFERRED rules are 
   available: 
    
7.3.1. INFERRED_TRANSLATE 
    
   The INFERRED_TRANSLATE rule translates a field value under a certain 
   mapping. The definition of the rule is given below: 
    
    
Price et al.                                                   [Page 25] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   INFERRED_TRANSLATE(#n,#a[],#b[]) ::= FIELD(n) 
                tmp = NULL ? field_value 
                field_value = tmp ? b[min([tmp!=a[j] for j in 0..end])] 
                tmp = a[min([field_value!=b[j] for j in 0..end])] ? NULL 
    
   The first operand specifies the length of the field before the 
   translation. This is followed by additional lists of integers, 
   respectively giving the value of the field before and after it is 
   translated. For example: 
    
   GRE_protocol   ::=   INFERRED_TRANSLATE(16,0x86dd,0x0800,41,4)  
    
   The GRE Protocol field behaves in the same manner as the Next Header 
   field in other extension headers, except that it indicates that the 
   subsequent header is IPv6 or IPv4 using the values 0x86dd and 0x0800 
   instead of 41 and 4. The INFERRED_TRANSLATE rule can translate the 
   values required by the GRE Protocol field into the standard values, 
   whose behavior can then be described by another rule such as LIST. 
    
7.3.2. INFERRED_SIZE 
    
   The INFERRED_SIZE rule infers the value of a field from the total 
   amount of remaining data. The definition of the rule is given below: 
    
   INFERRED_SIZE(#n,#p) ::= VALUE(n,(remaining_data-n-p)/8) 
    
   The first operand specifies the length of the field in bits, and the 
   second operand specifies an offset that will be subtracted from the 
   total data length when deriving the value of the INFERRED_SIZE field. 
    
7.3.3. INFERRED_OFFSET 
    
   The INFERRED_OFFSET rule describes a field that takes a known offset 
   relative to a certain base value. The definition of the rule is given 
   below: 
    
   INFERRED_OFFSET(#n,$base) ::= FIELD(n) 
                           tmp = NULL ? field_value 
                           field_value = tmp ? mod(field_value-base,2^n) 
                           tmp = mod(field_value+base,2^n) ? NULL 
    
   The first operand defines the length of the field. The second operand 
   gives the name of a label where the base value can be found. The 
   value of this label is defined by another rule (e.g. the LABEL rule). 
    
   The INFERRED_OFFSET rule subtracts the base value from the field 
   value. The resulting offset is then placed in the well-known label 
   field_value so that it can be described by the next rule. For 
   example, a typical sequence number can be described as follows: 
    
    
Price et al.                                                   [Page 26] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   sequence_number   ::=   INFERRED_OFFSET(32,msn) 
                     
                           STATIC 99% |  
                           LSB(8,-1) 0.7% |  
                           LSB(16,-1) 0.2% |  
                           IRREGULAR(32) 0.1% 
    
   In this case the offset value is expected to change rarely and only 
   by small amounts, and hence it is defined using the STATIC and LSB 
   rules. 
    
7.3.4. INFERRED_SEQUENCE 
    
   The INFERRED_SEQUENCE rule describes a field that takes a known 
   offset relative to a base value, in the same way as INFERRED_OFFSET. 
   However, INFERRED_SEQUENCE additionally defines a new base value 
   equal to the value of the field. This is useful for describing 
   sequences of incrementing values. 
    
   The definition of the rule is given below: 
    
   INFERRED_SEQUENCE(#n,$base) ::= FIELD(n) 
                                   tmp = NULL ? field_value 
                                   field_value = tmp ? mod(tmp-base,2^n) 
                                   base = mod(tmp-field_value,2^n) ? tmp 
                                   tmp = base ? NULL 
    
   The first operand defines the length of the field. The second operand 
   gives the name of a label where the base value can be found. The 
   value of this label is defined by another rule (e.g. the LABEL rule). 
    
   The INFERRED_SEQUENCE rule subtracts the base value from the field 
   value. The resulting offset is then placed in the well-known label 
   field_value so that it can be described by the next rule. 
   Additionally, the base label is updated to contain the original value 
   of the field being described. For example, a sequence of values such 
   as 10, 20, 30 can be described as follows: 
    
   incrementing_sequence   ::=   LABEL(16,base_label) 
    
                                 INFERRED_SEQUENCE(16,base_label) 
                                 STATIC 90% | IRREGULAR(16) 10% 
                                          ; describes (2nd - 1st) entry 
    
                                 INFERRED_SEQUENCE(16,base_label) 
                                 STATIC 90% | IRREGULAR(16) 10% 
                                          ; describes (3rd - 2nd) entry 
    
                                 NEXT_FIELD(base_label) 
                                 STATIC 90% | IRREGULAR(16) 10% 
                                          ; describes 3rd entry 

 
Price et al.                                                   [Page 27] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
7.3.5. INFERRED_IP_CHECKSUM 
    
   The INFERRED_IP_CHECKSUM rule is used to describe the IP checksum 
   field. The definition of the rule is given below: 
    
   INFERRED_IP_CHECKSUM   ::=   tmp = NULL ? remaining_data 
                                remaining_data = tmp ? tmp+80 
                                FIELD(16) 
                                remaining_data = tmp+96 ? tmp 
                                field_value = checksum16(tmp) ? NULL 
                                tmp = remaining_data ? NULL 
    
   The rule matches whenever the 2-byte field at an offset of 10 bytes 
   from the current position in the packet contains a 16-bit one's 
   complement checksum over the remaining data.
    
7.4. Control field rules 
    
   This section provides a selection of rules for defining new control 
   fields. 
    
7.4.1. CHOICE 
    
   The CHOICE rule creates a label containing an implementation-specific 
   value. The definition of the rule is given below: 
    
   CHOICE($label)   ::=   label = NULL ? choice 
                          choice = label ? NULL 
    
7.4.2. PICK 
    
   The PICK rule allows an implementation to pick one rule from a list 
   of existing rules. The definition of the rule is given below: 
    
   PICK(rule[],#p[])   ::=   CHOICE(field_value) 
                             LOCAL(field_value,rule[field_value]) 
                             DISTRIBUTION([p[j] for j in 0..end]) 
    
   The operands for the PICK rule include a list of existing rules, 
   followed by a list of integers specifying the relative probability 
   with which each rule will be applied in a typical header flow. 
    
   The PICK rule also has the following shorthand form using the symbol 
   "|": 
    
   rule[0] whitespace p[0] | ... | rule[n-1] whitespace p[n-1] 
    
   The precedence for this shorthand form is higher than that for the 
   CONCATENATION rule, but lower than the precedence of any integer 
   operators. 
    

Price et al.                                                   [Page 28] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
7.4.3. FORMAT 
    
   The FORMAT rule describes a field which displays the same behavior as 
   the corresponding field in the previous header. The definition of the 
   rule is given below: 
    
   FORMAT(rule[])   ::=   CHOICE(field_value) 
                          LOCAL(field_value,rule[field_value]) 
                          STATIC 
    
   The FORMAT rule is followed by a list of existing rules, each 
   describing a different behavior for the field. The rule that is 
   chosen is the same rule as used for the preceding header. 
    
   For example: 
    
   IP_ID   ::=   FORMAT(SEQUENTIAL_IP_ID,RANDOM_IP_ID)  
    
   Two behaviors for the IP-ID are provided: one for an IP-ID that 
   increases sequentially, and one for a randomly behaving IP-ID. 
    
7.4.4. NBO 
    
   The NBO rule describes a field that may be in reverse byte order from 
   the rest of the data, as is sometimes seen with the IP-ID. 
    
   The definition of the rule is given below: 
    
   NBO(#n)   ::=   LABEL(n,base_field) 
                   CHOICE(nbo_flag) 
                   nbo_field = NULL ? switch(nbo_flag, 
                    byte_swap(n,base_field),base_field) 
                   base_field = switch(nbo_flag, 
                    byte_swap(n,nbo_field),nbo_field) ? NULL 
    
   The rule examines the n-bit field (where n is typically 16 or 32) and 
   generates a label nbo_flag indicating whether or not the field is in 
   network byte order. Determining whether a field is in network byte 
   order is a complex task (typically requiring the encoder to examine a 
   history of field values) so the value of the nbo_flag label is left 
   as a choice for the implementation. 
    
   If the field is in network_byte_order then the nbo_flag is set to 1 
   and a label nbo_field is created containing the original field value. 
   If the field is not in network byte order then the nbo_flag is set to 
   0 and the label nbo_field is given the byte-swapped value of the 
   original field. For example: 
    
   IP_ID   ::=   NBO(16)                           ; Checks byte order 
                 NEXT_FIELD(nbo_flag)              ; Describes nbo_flag 
                 STATIC 99% | IRREGULAR(1) 1% 

 
Price et al.                                                   [Page 29] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
                 NEXT_FIELD(nbo_field)             ; Describes nbo_field 
                 INFERRED_OFFSET(16,msn) 
                 STATIC 80% | LSB(5,-1) 15% | IRREGULAR(16) 5% 
    
7.4.5. SCALE 
    
   The SCALE rule describes a field which changes by a regular (usually 
   large) amount in consecutive headers. The definition of the rule is 
   given below: 
    
   SCALE(#n) ::= LABEL(n,base_field) 
                 CHOICE(scale) 
                 scaled_field = NULL ? floor(base_field,scale) 
                 remainder_field = NULL ? mod(base_field,scale) 
                 base_field = scale*scaled_field+remainder_field ? NULL 
    
   The number of bits specified are examined and a suitable scale factor 
   is chosen (typically using a history of field values). The original 
   field value is then mapped to two values: the scaled value obtained 
   by dividing the field by the chosen scale factor, and the remainder 
   value from this division. These two values plus the scale factor 
   itself are sufficient to reconstruct the original field value. 
            
7.4.6. CHECKSUM 
    
   The CHECKSUM rule describes a checksum calculated across the header. 
   The size of the checksum can be altered depending on the 
   characteristics of the link over which the protocol is to be 
   transmitted. 
    
   The definition of the rule is given below: 
    
   CHECKSUM(#n)   ::=   NEXT_FIELD(checksum) 
                        IRREGULAR(n) 
    
   Note that it is possible to offer different checksum lengths, so 
   extra checksum bits can be allocated to protect important headers. 
   This is illustrated in the example below: 
    
   checksum_field   ::=   CHECKSUM(3) 99% | CHECKSUM(7) 1% 
    
7.5. List rules 
    
   The behavior of many protocol fields and headers can be described by 
   a single rule built up from lists of simpler rules. This section 
   defines a set of rules that can be used to describe the behavior of 
   these lists. 
    
    
Price et al.                                                   [Page 30] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
7.5.1. LIST 
    
   The LIST rule describes a list of items that do not necessarily occur 
   in the same order for every header. Example applications for the LIST 
   rule include TCP options and TCP SACK blocks. 
    
   The definition of the rule is given below: 
    
   LIST($label,#d,#m,#p,rule[]) ::= start = NULL ? remaining_data 
                           order = NULL ? 0 
                           presence = NULL ? 0 
                           CONCATENATION([LIST_ITEM for j in 0..end]) 
                           start = switch(permutation(end,k,order), 
                            -1,remaining_data+floor(label,d)*m+p) ? NULL 
    
   where k is log2(end) 
    
         LIST_ITEM is CHOICE(next_rule) 
                      CHOICE(optional) 
                      rule[next_rule] 
    
                      tmp = NULL ? order 
                      order = tmp ? tmp*2^k+next_rule 
                      next_rule = mod(order,2^k) ? NULL 
                      tmp = floor(order,2^k) ? NULL 
    
                      tmp = NULL ? presence 
                      presence = tmp ? tmp*2+optional 
                      optional = mod(presence,2) ? NULL 
                      tmp = floor(presence,2) ? NULL 
    
   The first operand gives the name of a label, whilst the next three 
   operands specify a divisor, multiplier and offset which scale the 
   value of the label so that it specifies the exact size of the list in 
   bits. 
    
   These operands are followed by a set of n existing rules that can be 
   used to describe individual items in the list. 
    
   Once the total list size is reached, the LIST rule outputs well-known 
   labels "order" and "presence". The order consists of a string of n * 
   k-bit entries (where k is the least number of bits required to index 
   each of the n entries). The order list contains the indices in the 
   order in which they occurred. The presence data is a list a 1-bit 
   flags, one per entry in the list, which are set to "1" to indicate 
   that the list item is present or "0" if not. These flags occur in the 
   order in which the entries appear in the LIST rule. 
    
   It is expected that the OPTIONAL rule (see Section 7.6.1) might be 
   used within LIST. The well-known label "optional" is thus set 


Price et al.                                                   [Page 31] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   automatically within the LIST rule, so that it can be used in 
   conjunction with the OPTIONAL rule as illustrated below: 
    
   TCP_Options   ::=   LIST(tcp_data_offset,1,32,0,   
                        OPTIONAL(TCP_SACK),   
                        OPTIONAL(TCP_TIMESTAMP),   
                        OPTIONAL(TCP_END),   
                        OPTIONAL(TCP_GENERIC)) 
    
                       STATIC 99.9% |             ; TCP Options order 
                       IRREGULAR(8) 0.1% 
    
                       STATIC 75% |               ; TCP Options presence 
                       IRREGULAR(4) 25% 
    
   The LIST rule implicitly describes the behavior of the label that is 
   specified in the first operand, so there is no need to explicitly 
   describe this label using a NEXT_FIELD rule. 
    
7.5.2. LIST_N 
    
   The LIST_N rule is similar to the LIST rule, with the addition that 
   each of the rules comprising the list can be applied multiple times. 
    
   The definition of the rule is given below: 
    
   LIST_N($label,#d,#m,#p,rule[],#number[]) ::=  
          LIST(label,d,m,p,[rule[j] for j in 0..end, k in 0..number[j]]) 
    
   LIST_N is a useful shorthand in the case that a rule in the list may 
   occur a large number of times. For parsing purposes, a count of 
   number[k] is equivalent to having written out rule[k] a total of 
   number[k] times. 
    
   This form of specification is chosen over a truly open list (i.e. one 
   allowing an unbounded number of repeats) since it is clear that it 
   should always be possible to set an upper bound, and moreover this 
   information may be useful to certain encoders. For example: 
    
   TCP_Options   ::=   LIST_N(tcp_data_offset,1,32,0, 
                        OPTIONAL(TCP_SACK), 
                        OPTIONAL(TCP_TIMESTAMP), 
                        OPTIONAL(TCP_END), 
                        OPTIONAL(TCP_PAD), 
                        OPTIONAL(TCP_GENERIC), 
                        1,1,1,32,3) 
    
7.6. Miscellaneous rules 
    
   This section introduces some miscellaneous rules that can be used to 
   describe fields in a protocol header. 

 
Price et al.                                                   [Page 32] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
7.6.1. OPTIONAL 
    
   The OPTIONAL rule is used to describe fields that are optionally 
   present in the header. The definition of the rule is given below: 
    
   OPTIONAL(rule)   ::=   CONCATENATION([rule for j in 0..optional-1]) 
    
   The OPTIONAL rule uses the well-known label "optional" to specify 
   whether or not the optional field is present in the header. The value 
   of the label can be set by another rule such as LABEL or LIST. For 
   example: 
    
   GRE_Key_Flag   ::=   LABEL(1,optional) 
    
   GRE_Key        ::=   OPTIONAL(Key_Present) 
    
   In this case the Key_Present rule is called to describe the GRE_Key 
   field, but only if the GRE_Key Flag field is set to 1. 
    
7.6.2. UNCOMPRESSIBLE 
    
   The UNCOMPRESSIBLE rule is a special case of the IRREGULAR rule, 
   where the length of the field can be derived from the value contained 
   in a label. The definition of the rule is given below: 
    
   UNCOMPRESSIBLE($label,#d,#m,#p)   ::=   IRREGULAR(floor(label,d)*m+p) 
    
   The first operand gives the name of the label, whilst the next three 
   operands specify a divisor, multiplier and offset which scale the 
   value of the label so that it specifies the exact size of the 
   UNCOMPRESSIBLE field in bits. 
    
   The UNCOMPRESSIBLE rule is typically used in conjunction with the 
   LABEL rule. 
    
7.6.3. SKIP 
    
   The SKIP rule skips past a certain field in the protocol header 
   without defining its behavior. The definition of the rule is given 
   below: 
    
   SKIP(#n)   ::=   tmp = NULL ? remaining_data 
                    remaining_data = tmp ? tmp+n 
                    tmp = remaining_data-n ? NULL 
    
7.6.4. NO_UPDATE 
    
   The NO_UPDATE rule behaves as the rule specified by its operand, with 
   the exception that it does not update the context. This is useful 
   when a field exhibits an unusual behavior for one header and then 
   reverts back to its original behavior in subsequent headers. 

 
Price et al.                                                   [Page 33] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   The definition of the rule is given below: 
    
   NO_UPDATE(rule)   ::=   no_update = NULL ? 1 
                           rule 
                           no_update = 1 ? NULL 
    
   An example of the NO_UPDATE rule in use is given below: 
    
   TCP_Window   ::=   STATIC 99% |  
                      NO_UPDATE(LSB(11,2048)) 0.9% |  
                      IRREGULAR(16) 0.1% 
    
   In the above example the NO_UPDATE rule is applied to the TCP Window 
   field. The field can behave as defined by the LSB rule, but when this 
   behavior occurs it does not update the context. 
    
    
8.  Security considerations 
    
   This draft describes an abstract notation similar to ABNF [RFC-2234], 
   and hence is not believed to raise any security issues. 
    
    
9.  Acknowledgements 
    
   A number of important concepts and ideas have been borrowed from ROHC 
   [RFC-3095]. Updates to the LIST rules owe much to discussions with 
   Qian Zhang and Hongbin Liao. 
    
   Thanks to Paul Ollis for field labeling; and to Rob Hancock and 
   Stephen McCann for putting up with the authors' arguments and making 
   helpful suggestions, frequently against the tide! 
    
   The authors would also like to thank Carsten Bormann, Christian 
   Schmidt, Max Riegel and Lars-Erik Jonsson for their comments and 
   encouragement. We haven't always agreed, but the arguments have been 
   fun! 
    
    
10. Authors' addresses 
    
   Richard Price         Tel: +44 1794 833681 
   Email:                richard.price@roke.co.uk 
    
   Abigail Surtees       Tel: +44 1794 833131 
   Email:                abigail.surtees@roke.co.uk 
    
   Mark A West           Tel: +44 1794 833311 
   Email:                mark.a.west@roke.co.uk 
    
    
Price et al.                                                   [Page 34] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   Roke Manor Research Ltd 
   Romsey, Hants, SO51 0ZN 
   United Kingdom 
   http://www.roke.co.uk 
    
    
11. References 
 
   [RFC-2026]  "The Internet Standards Process - Revision 3", Scott 
               Bradner, RFC 2026, Internet Engineering Task Force,  
               October 1996 
    
   [RFC-2119]  "Key words for use in RFCs to Indicate Requirement 
               Levels", Scott Bradner, RFC 2119, Internet Engineering  
               Task Force, March 1997 
    
   [RFC-2234]  "Augmented BNF for Syntax Specifications: ABNF",   
               D. Crocker and P. Overell, RFC 2234, Internet Engineering  
               Task Force, November 1997 
    
   [RFC-2960]  "Stream Control Transmission Protocol", Stewart et al,  
               RFC 2960, Internet Engineering Task Force, October 2000 
    
   [RFC-3095]  "RObust Header Compression (ROHC)", Carsten Bormann et 
               al, RFC3095, Internet Engineering Task Force, July 2001 
    
   [RFC-3261]  "SIP: Session Initiation Protocol", J. Rosenberg et al,  
               RFC 3261, Internet Engineering Task Force, June 2002 
    
    
Appendix A. ABNF description of PEBBLE 
    
   definition          ::=   new_rule ws "::=" ws expression  
                              [ws "where" 1*(ws local_definition)] 
    
   new_rule            ::=   name [ws] ["(" operand 
                              *("," [ws] operand) ")"] 
    
   name                ::=   *(letter | digit | "_") 
    
   letter              ::=   %x41-5a | %x61-7a 
    
   digit               ::=   %x30-39 
    
   operand             ::=   [type] name ["[" [name] "]"] 
    
   type                ::=   "#" | "$" 
    
   expression          ::=   operator | reference | repetition 
    
   operator            ::=   longhand_form | shorthand_form 

 
Price et al.                                                   [Page 35] 

INTERNET-DRAFT                   PEBBLE                 October 28, 2002 
 
 
   longhand_form       ::=   name [ws] ["(" expression 
                              *("," [ws] expression) ")"] 
    
   shorthand_form      ::=   [expression] *(reserved 
                              expression) [reserved] 
    
   reserved            ::=   1*(%x00-ff) 
    
   reference           ::=   name ["[" expression "]"] 
    
   repetition          ::=   "[" expression ws "for" ws index 
                              *("," [ws] index "]" 
    
   index               ::=   name ws "in" ws expression ".." expression 
    
   local_definition    ::=   name ws "is" ws expression 


Price et al.                                                   [Page 36]