Network Working Group                  Richard Price, Siemens/Roke Manor
INTERNET-DRAFT                        Robert Hancock, Siemens/Roke Manor
Expires: January 2002                 Stephen McCann, Siemens/Roke Manor
                                         Mark A West, Siemens/Roke Manor
                                     Abigail Surtees, Siemens/Roke Manor
                                          Paul Ollis, Siemens/Roke Manor
                                                            9 July, 2001

                      TCP/IP Compression for ROHC

Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of [RFC-2026].

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

This document is a submission to the IETF ROHC WG. Comments should be
directed to the mailing list of ROHC, rohc@cdt.luth.se.

Abstract

This draft describes a ROHC profile for the robust compression of
TCP/IP.

The RObust Header Compression [ROHC] scheme is designed to compress
packet headers over error prone channels. It is built around an
extensible core framework that can be tailored to compress new
protocol stacks by adding additional ROHC profiles.

The new profile for TCP/IP compression is provided by the Efficient
Protocol Independent Compression (EPIC-LITE) scheme.

Price et al.                                                    [PAGE 1]
INTERNET-DRAFT        TCP/IP Compression for ROHC          9 July, 2001

Table of contents

1.  Introduction
2.  Terminology
3.  The EPIC-LITE scheme for generating new ROHC profiles
    3.1.  Structure of the EPIC-LITE compressed headers
    3.2.  Compression and decompression procedures
    3.3.  Language for creating new ROHC profiles
    3.4.  Huffman compression
4.  Overview of the input language for EPIC-LITE
    4.1.  Information stored at compressor and decompressor
    4.2.  Control fields
5.  Packet types available to EPIC-LITE
    5.1.  CO packet
    5.2.  IR-DYN packet
    5.3.  IR packet
6.  Library of EPIC-LITE encoding methods
    6.1.  STATIC
    6.2.  IRREGULAR
    6.3.  VALUE
    6.4.  LSB
    6.5.  UNCOMPRESSED
    6.6.  INFERRED
    6.7.  OPTIONAL
    6.8.  CONTEXT
    6.9.  LIST
    6.10. Flag encoding methods
    6.11. FORMAT
    6.12. CRC
7.  Creating a new ROHC profile
    7.1.  Profile identifier
    7.2.  Maximum number of header formats
    7.3.  Control of header alignment
    7.4.  Compressed header formats
8.  EPIC-LITE state machine
    8.1.  Compression and decompression states
    8.2.  Compressor states
    8.3.  Decompressor states
9.  ROHC Profile for compression of TCP/IP
10. Security considerations
11. Acknowledgements
12. References
13. Authors' addresses
Appendix A.  EPIC-LITE compressor and decompressor
    A.1.  Compressor
    A.2.  Decompressor
    A.3.  Offline processing
    A.4.  Library of encoding methods
    A.5.  BNF description of the input language
Appendix B.  Extensibility
    B.1.  Other protocol stacks
    B.2.  New library encoding methods
    B.3.  Learning version of EPIC-LITE

1. Introduction

This document describes a method for compressing TCP/IP headers
within the [ROHC] framework.

The first part of the draft describes the Efficient Protocol
Independent Compression (EPIC-LITE) scheme for generating new ROHC
profiles. This scheme takes as its input a list of fields in the
protocol stack to be compressed, and for each field a choice of one
or more compression techniques. Using this input EPIC-LITE derives a
set of compressed header formats that can be used to quickly and
efficiently compress and decompress headers.
The profiles generated by EPIC-LITE have comparable efficiency to the
RTP profile presented in [ROHC]. Moreover we make no IPR claims on
the material presented in this draft.

A companion draft [EPIC] describes the delta changes required to
implement an alternative version of the scheme. The EPIC scheme
compresses headers with greater efficiency than EPIC-LITE: in fact
for the chosen level of robustness the compression ratio is provably
optimal (i.e. the compressed headers are the smallest size possible).

The second part of the draft uses EPIC-LITE to generate a ROHC
profile for the robust compression of TCP/IP.

Chapter 2 explains some of the terminology used in the draft.
Chapter 3 gives an overview of the EPIC-LITE scheme.
Chapter 4 describes the language used by EPIC-LITE to create new
profiles.
Chapter 5 lists the packet types available to EPIC-LITE.
Chapter 6 considers the basic techniques available in the EPIC-LITE
library of compression routines.
Chapter 7 specifies the parameters used to define a [ROHC] profile.
Chapter 8 contains the state machine that controls the EPIC-LITE
compressor and decompressor.
Chapter 9 lists an EPIC-LITE profile for the compression of TCP/IP.
Appendix A gives a normative description of EPIC-LITE in pseudocode.
Appendix B considers the extensibility of the scheme.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC-2119].

Profile

A [ROHC] profile is a description of how to compress a certain
protocol stack over a certain type of link. Each profile includes one
or more sets of compressed header formats and a state machine to
control the compressor and the decompressor.

Context

The context is an array storing one or more previous values of each
field in the uncompressed header.
The compressor and decompressor both maintain a copy of the context,
and fields can be compressed relative to their stored values for
better compression efficiency.

Compressed header format

A compressed header format describes how to compress each field in
the chosen protocol stack. It consists of two parts: a bit pattern to
indicate to the decompressor which format is being used, followed by
a list of the compressed versions of each field.

Encoding method

An encoding method is a procedure for compressing fields. Examples
include STATIC encoding (field is the same as the context), INFERRED
encoding (field is calculated at the decompressor) and IRREGULAR
encoding (field must be transmitted in full).

Indicator flags

Each EPIC-LITE compressed packet contains a set of indicator flags.
The flags are placed at the front of the packet as a single bit
pattern, and indicate to the decompressor exactly which encoding
method has been applied to which field.

Set of compressed header formats

A complete set of compressed header formats uses up all of the
indicator bit patterns available at the start of the compressed
header. A profile may have several sets of compressed header formats
available, but only one set can be in use at a given time.

Library of encoding methods

The library of encoding methods contains a number of commonly used
procedures that can be called to compress fields in the chosen
protocol stack. More encoding methods can be added to the library if
they are needed.

Input language

EPIC-LITE offers a simple input language (2 commands) that can be
used to create new ROHC profiles. The input language assigns one or
more encoding methods to each field in the chosen protocol stack.

Control field

The term 'control field' refers to any field passed between the
compressor and decompressor, that is not part of the uncompressed
header.
An example of a control field is the header checksum calculated by
EPIC-LITE over the uncompressed header to ensure robustness against
bit errors and dropped packets.

3. The EPIC-LITE scheme for generating new ROHC profiles

This chapter outlines the EPIC-LITE scheme that has been used to
generate a [ROHC] profile for TCP/IP compression.

3.1. Structure of the EPIC-LITE compressed headers

The compressed headers generated by EPIC-LITE are modeled closely on
[RFC-1144] for simplicity and ease of parsing. Each compressed header
is divided into two distinct parts: the indicator flags and the
compressed fields as illustrated below:

+---+---+---+---+--------------------+--------------------+---
| 0 | 1 | 1 | 0 | Compressed Field 1 | Compressed Field 2 |...
+---+---+---+---+--------------------+--------------------+---
 \             /  \                                          /
  \           /    \                                        /
 Indicator Flags             Compressed Fields

    Figure 1 : Structure of an EPIC-LITE compressed header

The indicator flags specify which compressed fields are present in
the header, whilst the compressed fields contain enough information
to transmit each field from the compressor to the decompressor. This
information might be the entire uncompressed field, it might be LSBs
(Least Significant Bits) of the uncompressed field etc.

Note that for simplicity EPIC-LITE always places the indicator flags
at the front of the compressed header followed by each complete
compressed field in turn. EPIC-LITE never splits the compressed
fields up across multiple parts of the header. As for [RFC-1144] the
compressed fields are in reverse order compared to the uncompressed
header (this is a useful trick to speed up parsing at the
decompressor).

Unlike other compression schemes, the header formats used by
EPIC-LITE are not designed by hand but instead are generated
automatically using a special algorithm. This means that EPIC-LITE
can be applied to any protocol stack provided that it has been
correctly programmed using the input language described in
Section 3.3.

3.2. Compression and decompression procedures

Figure 2a illustrates the processing which is done by EPIC-LITE for
each header to be compressed and decompressed. Note that references
are given to pseudocode in Appendix A which describes each of the
stages explicitly.

          | Uncompressed packet   Uncompressed packet ^
          v                                           |
+-------------------+                       +-------------------+
|   Selecting the   |                       | Verifying correct |
|      context      |                       |   decompression   |
|  (Section A.1.1)  |                       |  (Section A.2.5)  |
+-------------------+                       +-------------------+
          |                                           |
          | Uncompressed packet   Uncompressed packet |
          v Context                                   |
+-------------------+                       +-------------------+
|    Running the    |                       |   Decompressing   |
|   state machine   |                       |    the fields     |
|  (Section A.1.2)  |                       |  (Section A.2.4)  |
+-------------------+                       +-------------------+
          | Set of header formats       Chosen format ^
          | Uncompressed packet     Compressed fields |
          v Context                           Context |
+-------------------+                       +-------------------+
|    Compressing    |                       |    Reading the    |
|    the fields     |                       |  indicator flags  |
|  (Section A.1.3)  |                       |  (Section A.2.3)  |
+-------------------+                       +-------------------+
          |                                           ^
          | Chosen format           Compressed header |
          v Compressed fields                 Context |
+-------------------+                       +-------------------+
|  Determining the  |                       |    Running the    |
|  indicator flags  |                       |   state machine   |
|  (Section A.1.4)  |                       |  (Section A.2.2)  |
+-------------------+                       +-------------------+
          |                                           ^
          | Compressed header       Compressed header |
          v                                   Context |
+-------------------+                       +-------------------+
| Encapsulating in  |                       |Decapsulating from |
|    ROHC packet    |   ---------------->   |    ROHC packet    |
|  (Section A.1.5)  |         ROHC          |  (Section A.2.1)  |
+-------------------+        packet         +-------------------+

       Figure 2a : EPIC-LITE compression/decompression

Each of these steps is described in more detail below.

At the compressor:

Step 1: The input to the compressor is simply an uncompressed packet
(with known length). In order to compress the packet it is first
necessary to classify it and choose the context relative to which the
packet will be compressed. If no suitable context is available then
an existing context must be overwritten.

Step 2: Once the context has been chosen the compressor knows which
[ROHC] profile will be used to compress the packet. In particular it
can run the state machine that determines which set of compressed
header formats will be used to compress the packet.

Step 3: Given the uncompressed packet and a set of compressed header
formats, the compressor can choose a header format to robustly carry
this information to the decompressor using as few bits as possible.
Note that EPIC-LITE chooses the header format simultaneously with
compressing the header to improve the overall speed of compression.

Step 4: Each compressed header format has a unique set of indicator
flags that communicate to the decompressor which format is in use.
The compressor determines these indicator flags and appends them to
the front of the compressed fields to create a compressed header.

Step 5: Once the compressed header has been calculated, the
compressor encapsulates it within a ROHC packet by adding 0 or more
octets of context identifier together with any padding and
segmentation that is required.

At the decompressor:

Step 1: The input to the decompressor is a ROHC compressed packet.
From this packet the decompressor can determine the attached context
identifier, which in turn specifies the context relative to which the
packet should be decompressed.

Step 2: Once the context has been identified, the decompressor can
run the state machine to determine whether the packet may be
decompressed.
Step 3: The decompressor then reads the indicator flags in the header
to determine which compressed header format has been used. This
allows the compressed value of each field to be extracted.

Step 4: Using the compressed value of each field and the context, the
decompressor can apply the encoding methods to reconstruct the
uncompressed packet. Note that fields are decompressed in reverse
order to compression (this ensures that fields which are inferred
from other field values are reconstructed correctly).

Step 5: Finally, the decompressor verifies that correct decompression
has occurred by applying the header checksum. If the packet is
successfully verified then it can be forwarded.

3.3. Language for creating new ROHC profiles

EPIC-LITE is a protocol-independent compression scheme because it can
generate new compressed header formats automatically using a special
algorithm. In order for EPIC-LITE to compress a new protocol stack
however, it must first be given a description of how the stack
behaves.

EPIC-LITE includes a simple language for the fast creation of new
compression schemes. The language is designed to be easy to use
without detailed knowledge of the mathematics underlying EPIC-LITE.
The only information required to create a new [ROHC] profile using
EPIC-LITE is a description of how the chosen protocol stack behaves.

EPIC-LITE converts the input code into a set of compressed header
formats that can be used by a [ROHC] compressor and decompressor. The
language has a procedure-based structure, which makes it easy to
build new profiles out of existing ones (e.g. when adding new layers
to a protocol stack).

Figure 2b describes the process of building a set of compressed
header formats from the input code, which is done once only and the
results stored at the compressor and decompressor. References are
given to pseudocode in Appendix A which describes the various stages
explicitly.
+-----------------------+
| Input Stage 1:        |
| Resolve input         |
| into set of compressed|
| header formats        |
| (Section A.3.1)       |
+-----------------------+
            |
            v
+-----------------------+
| Input Stage 2:        |
| Run Huffman algorithm |
| to generate flags     |
| (Section A.3.2)       |
+-----------------------+

  Figure 2b : Building EPIC-LITE compressed header formats

Note that since the EPIC-LITE compressed header formats can be
generated offline, the fact that profiles are specified using an
input language does not affect the processing requirements of
compression and decompression.

3.4. Huffman compression

Huffman compression [HUFF] is a well known technique used in many
popular compression schemes. EPIC-LITE uses ordinary Huffman
compression to generate a new set of compressed header formats.

The basic Huffman algorithm is designed to compress an arbitrary
stream of characters from a character set such as ASCII. The idea is
to create a new set of compressed characters, where each normal
character maps onto a compressed character and vice versa. Common
characters are given shorter compressed equivalents than rarely used
characters, reducing the average size of the data stream.

EPIC-LITE uses Huffman compression to generate the indicator flags
for each compressed header format. Each format is treated as one
character in the Huffman character set, so more common compressed
header formats are indicated using fewer bits than rare header
formats. The most commonly used header format is often indicated by
the presence of a single "0" flag at the front of the compressed
header.

The following chapters describe the mechanisms of EPIC-LITE in
greater detail.

4. Overview of the input language for EPIC-LITE

This chapter describes the input language provided by EPIC-LITE for
the creation of new [ROHC] profiles.
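Before the language itself is described, the Huffman step of
Section 3.4 can be illustrated with a minimal sketch. It is not the
normative algorithm of Appendix A.3.2: the function name
huffman_flags, the format names F0 to F3 and their probabilities are
hypothetical, and the choice to give the "0" bit to the more probable
branch is an assumption made so that the commonest format ends up
with the single "0" flag mentioned above.

```python
import heapq
from itertools import count


def huffman_flags(format_probs):
    """Map each header format name to a prefix-free indicator bit string.

    format_probs: dict of format name -> probability (any positive scale).
    """
    if len(format_probs) == 1:
        # A single format needs no indicator bits at all.
        return {name: "" for name in format_probs}

    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(p, next(tiebreak), {name: ""})
            for name, p in format_probs.items()]
    heapq.heapify(heap)

    while len(heap) > 1:
        # Merge the two least probable subtrees.
        p_light, _, light = heapq.heappop(heap)
        p_heavy, _, heavy = heapq.heappop(heap)
        # Assumption: the more probable branch takes "0", the other "1".
        merged = {name: "1" + code for name, code in light.items()}
        merged.update({name: "0" + code for name, code in heavy.items()})
        heapq.heappush(heap, (p_light + p_heavy, next(tiebreak), merged))

    return heap[0][2]


if __name__ == "__main__":
    # Hypothetical usage frequencies for four header formats.
    flags = huffman_flags({"F0": 0.90, "F1": 0.06, "F2": 0.03, "F3": 0.01})
    for name in sorted(flags):
        print(name, flags[name])
```

Because Huffman coding depends only on the relative weights, the
sketch gives the same flags whether or not the probabilities sum to
exactly 100%.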
The language is designed to be flexible but at the same time easy to
use without detailed knowledge of the mathematics underlying
EPIC-LITE. The only requirement for writing an efficient EPIC-LITE
profile is a description of how the relevant protocol stack behaves.

The language has just two commands:

   encode:  Used to assign one or more encoding methods to a field
   method:  Used to create new encoding methods from existing ones

EPIC-LITE also contains a library of basic encoding methods (STATIC
compression, LSB compression etc.).

The purpose of the input language is to assign one or more encoding
methods to each field in the chosen protocol stack. This is
accomplished using the "encode" command as follows:

   "encode" "as" {"or" }

Each instance of calls an encoding method that converts a certain
number of uncompressed bits into a certain number of compressed bits.

Encoding methods can be taken from the standard library or can be
defined from existing encoding methods by using the following
command:

   "method" {} "end_method"

Note that is simply an instance of the "encode" command described
previously.

An example of the "method" command is given below. A new encoding
method known as IPv6-ENCODING is constructed from the basic encoding
methods available in the library. This new method is designed to
compress an entire IPv6 header.

   method IPv6-ENCODING

      encode Version             as STATIC-KNOWN(4,6)
      encode Traffic_Class       as STATIC 99.9% C
                                 or IRREGULAR(6) 0.1%
      encode ECT_Flag            as STATIC 99.9% C
                                 or IRREGULAR(1) 0.1%
      encode CE_Flag             as VALUE(1,0) 99% C
                                 or VALUE(1,1) 1%
      encode Flow_Label          as STATIC-UNKNOWN(20)
      encode Payload_Length      as INFERRED-SIZE(16,288)
      encode Next_Header         as INFERRED(8)
      encode Hop_Limit           as STATIC 99% C
                                 or IRREGULAR(8) 1%
      encode Source_Address      as STATIC-UNKNOWN(128)
      encode Destination_Address as STATIC-UNKNOWN(128)

   end_method

Note that the normative version of IPv6-ENCODING can be found in
Chapter 9.
Also note that the names given to each field are not important (as in
fact EPIC-LITE replaces the field names with integer values for ease
of parsing).

For each field IPv6-ENCODING includes a list of possible encoding
methods. If more than one method is assigned to a field, the name of
the encoding method includes the probability that it will be used to
encode the field in question. This is very important since EPIC-LITE
ensures that common encoding methods require fewer bits to carry the
compressed data than rarely used encoding methods.

For optimal compression, the probability should equal the percentage
of time for which the encoding method is selected to compress the
field. Note that there is no requirement for probabilities to add up
to exactly 100%, as EPIC-LITE will automatically scale the
probabilities by a constant factor if they do not.

Note also that the input code is designed to be both human-readable
and machine-readable. If only one protocol stack needs to be
compressed, the input code can simply be converted by hand directly
to an implementation language. However, since the input language
provides a complete description of the protocol stack to be
compressed, it is possible to compress headers using only the
information contained in the input code and without any additional
knowledge of the protocol stack. This means that it is possible to
implement a protocol-independent compressor that can download a new
[ROHC] profile in the form of input code and immediately use it to
compress headers. Appendix B gives more information on this
"learning" version of EPIC-LITE.

4.1. Information stored at compressor and decompressor

Any ROHC compressor maintains a number of contexts as described in
Section 5.1.3 of [ROHC].
Each context at the compressor and decompressor includes the
following:

   Compression profile:   Compressed header formats
                          State machine

   Field values:          One or more previously processed headers

The compression profile describes how to compress a certain protocol
stack over a certain type of link. It includes the profile parameters
that describe the set of compressed header formats (as discussed in
Chapter 7) and additionally records the current state of the state
machine (as discussed in Chapter 8).

The compressor also stores one or more sets of field values from
previously processed headers. Each new header can be compressed
relative to these field values to improve the compression ratio.

For the profiles generated using EPIC-LITE, the compressor and
decompressor maintain a context value for every field specified in
the input code. This value is taken from the last header to be
successfully compressed or decompressed.

Furthermore, in order to provide robustness the compressor can
maintain more than one context value for each field. These values
represent the r most likely candidate values for the context at the
decompressor, given that bit errors and dropped packets may prevent
the compressor from being 100% certain exactly which values are
contained in the decompressor context.

EPIC-LITE ensures that the compressed header contains enough
information so that the uncompressed header can be extracted no
matter which one of the compressor context values is actually stored
at the decompressor. The only problem arises if the decompressor has
a context value that does not belong to the set of values stored at
the compressor; this situation is detected by a checksum over the
uncompressed header and the packet is discarded at the decompressor.

If more than one value for a field is stored in the compressor
context, EPIC-LITE restricts the available encoding methods for the
field. STATIC encoding will not be available, and LSB encoding will
only be allowed if sufficient LSBs are sent to infer the field
regardless of the precise value stored in the decompressor context.

The following definitions are made:

   context[current_field, i]   The ith value of current_field stored
                               in the context at the compressor

   context[current_field]      The value of current_field stored in
                               the context at the decompressor

Note that the rules for extracting fields from the uncompressed
header and updating the context values are given in Appendix A.

The number of context values per field to be stored at the compressor
is implementation-specific. Storing more values reduces the chance
that the decompressor will have a context value different from any of
the values stored at the compressor (which could cause the packet to
be decompressed incorrectly). The trade-off is that the compressed
header will be larger because it must contain enough information to
decompress relative to any of the candidate context values.

As an example, an implementation may choose to store the last r
values of each field in the compressor context. In this case
robustness is guaranteed against up to r - 1 consecutive dropped
packets between the compressor and the decompressor.

4.2. Control fields

A control field is a field that is passed from the compressor to the
decompressor, but which is not present in the uncompressed header.
Control fields are used to communicate additional information that
might be useful to the decompressor: for example a checksum over the
uncompressed header to ensure correct decompression has occurred.

Control fields are created by certain encoding methods (as described
in Chapter 6). When a new control field is created it is appended to
a heap containing all of the currently outstanding control fields.

Control fields are also used by certain encoding methods: when an
encoding method requires a control field it simply takes the top
field value from the heap.
Note that the use of a heap avoids the need to give each control
field an explicit name.

5. Packet types available to EPIC-LITE

In addition to the standard CO (compressed) packets, the [ROHC]
framework contains two special packet types designed to help
synchronize the context at the compressor and decompressor.

An IR (Initialization and Refresh) packet associates a context with a
certain ROHC profile, and transmits the value of all fields including
those which remain constant throughout the lifetime of the context.

An IR-DYN (Dynamic Initialization and Refresh) packet associates a
context with a ROHC profile, and additionally transmits the value of
any fields except those that remain constant for the lifetime of the
context. An IR-DYN packet cannot be used to completely initialize a
new context, but it is usually smaller than a full IR packet.

[ROHC] also defines a general compressed packet that can be used to
encapsulate CO, IR and IR-DYN packets. The general packet format
includes a CID (Context Identifier) to indicate the context to which
the compressed packet belongs. It also includes a packet type
indicator to specify whether the packet is a feedback, initialization
or general compressed packet, whether it is segmented, and whether it
contains padding.

The following packet type indicators are reserved in the overall
[ROHC] framework:

   1110:      Padding or Add-CID octet
   11110:     Feedback
   11111000:  IR-DYN packet
   1111110:   IR packet
   1111111:   Segment

Any packet types not indicated by the bit pattern 111XXXXX can be
used by individual [ROHC] profiles such as the TCP/IP profile.

5.1. CO packet

The compressed (CO) packet type is the basic compressed packet
offered by EPIC-LITE. CO packets can be used to transmit data between
the compressor and decompressor with a high level of efficiency, and
can cope with most irregularities in the packet stream.
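The packet type indicators reserved by the [ROHC] framework, as
listed in Chapter 5, can be recognised from the leading bits of a
single octet. The following is a minimal sketch only: the function
name classify_first_octet and the returned labels are hypothetical,
and it looks at one octet in isolation without handling CIDs,
segmentation or padding.

```python
def classify_first_octet(octet):
    """Classify one octet by the reserved ROHC packet type indicators."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in the range 0..255")

    if octet >> 4 == 0b1110:       # 1110xxxx
        return "padding-or-add-cid"
    if octet >> 3 == 0b11110:      # 11110xxx
        return "feedback"
    if octet == 0b11111000:        # 11111000
        return "ir-dyn"
    if octet >> 1 == 0b1111110:    # 1111110x
        return "ir"
    if octet >> 1 == 0b1111111:    # 1111111x
        return "segment"
    if octet >> 5 == 0b111:        # other 111xxxxx patterns, not listed
        return "unassigned"
    # Anything not matching 111xxxxx is left to the profile,
    # e.g. an EPIC-LITE CO packet.
    return "profile-specific"


if __name__ == "__main__":
    for value in (0xE3, 0xF0, 0xF8, 0xFD, 0xFF, 0x45):
        print(format(value, "08b"), classify_first_octet(value))
```

The checks run from the shortest indicator to the longest, so each
octet matches at most one reserved pattern before falling through to
the profile-specific case.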
The location of an EPIC-LITE CO packet within the general ROHC packet is shown below: 0 7 --- --- --- --- --- --- --- --- | Add-CID octet | if for CID 1-15 and small CIDs +---+--- --- --- ---+--- --- ---+ | EPIC-LITE CO packet | 1 octet +---+--- ---+---+---+--- --- ---+ | | / 0, 1, or 2 octets of CID / 1 or 2 octets if large CIDs | | +---+---+---+---+---+---+---+---+ / EPIC-LITE CO packet / variable +---+---+---+---+---+---+---+---+ Figure 3 : Format of CO packet generated by EPIC-LITE Note that CO packets are decompressed relative to the context stored at the decompressor. If the compressor has not yet initialized this context, or suspects that it has become invalidated, then a CO packet cannot be sent. Price et al. [PAGE 13] INTERNET-DRAFT TCP/IP Compression for ROHC 9 July, 2001 5.2. IR-DYN packet The structure of the IR-DYN packet used by EPIC-LITE is shown below: 0 1 2 3 4 5 6 7 --- --- --- --- --- --- --- --- : Add-CID octet : if for CID 1-15 and small CIDs +---+---+---+---+---+---+---+---+ | 1 1 1 1 1 0 0 | 0 | IR-DYN type octet +---+---+---+---+---+---+---+---+ : : / 0-2 octets of CID / 1-2 octets if for large CIDs : : +---+---+---+---+---+---+---+---+ | Profile | 1 octet +---+---+---+---+---+---+---+---+ | CRC | 1 octet +---+---+---+---+---+---+---+---+ | | / EPIC-LITE IR-DYN packet / variable length | | +---+---+---+---+---+---+---+---+ Figure 4 : Format of IR-DYN packet generated by EPIC-LITE The Profile field associates the context with a certain profile. It transmits the 8 least significant bits of the EPIC-LITE profile_identifier parameter described in Section 7.1. Furthermore, the polynomial used to calculate the CRC is defined in Section 6.12. 5.3. 
IR packet The structure of the IR packet used by EPIC-LITE is shown below: 0 1 2 3 4 5 6 7 --- --- --- --- --- --- --- --- : Add-CID octet : if for CID 1-15 and small CIDs +---+---+---+---+---+---+---+---+ | 1 1 1 1 1 1 0 | D | IR type octet +---+---+---+---+---+---+---+---+ : : / 0-2 octets of CID / 1-2 octets if for large CIDs : : +---+---+---+---+---+---+---+---+ | Profile | 1 octet +---+---+---+---+---+---+---+---+ | CRC | 1 octet +---+---+---+---+---+---+---+---+ | | / EPIC-LITE IR packet / variable length | | +---+---+---+---+---+---+---+---+ Figure 5 : Format of IR packet generated by EPIC-LITE Price et al. [PAGE 14] INTERNET-DRAFT TCP/IP Compression for ROHC 9 July, 2001 Note that the D bit is currently always set to 1 (as specified in [ROHC]), since the IR packet generated by EPIC-LITE always compresses every field in the header. A version of the IR packet that only compresses static fields may be introduced in future. 6. Library of EPIC-LITE encoding methods The [ROHC] standard contains a number of different encoding methods (LSB encoding, scaled timestamp encoding, list-based compression etc.) for compressing header fields. EPIC-LITE treats these encoding methods as library functions to be called by the input language when they are needed. The following library contains all of the basic encoding methods required for robust and efficient TCP/IP compression. Moreover new encoding methods can be added to the library as and when they are needed. Note that this chapter contains an informative description only. The normative pseudocode description of every encoding method can be found in Appendix A.4. Each of the encoding methods may have one or more parameters of the following type: A non-negative integer (specified as decimal, binary or hex). Binary values are prefixed by 0b and hex values are preceded by 0x. 
    An integer (positive or negative)
    A non-negative integer used to indicate the length of the field being compressed
    A probability expressed as a percentage with at most 2 decimal places
    The name of another encoding method including all parameters

The BNF description of every parameter may be found in Appendix A.5.

6.1. STATIC

BNF notation: STATIC

The STATIC encoding method can be used when the header field does not change relative to the context. If a field is STATIC then no information concerning the field need be transmitted in the compressed header.

The only parameter for the STATIC encoding method is a probability that indicates how often the encoding will be used. Encoding methods with high probability values require fewer bits in the compressed header than encoding methods that are allocated low probability values. In general the probability should reflect as accurately as possible the chance that the field will be encoded as STATIC.

The STATIC encoding has two variants that are useful for packet classification:

6.1.1. STATIC-KNOWN

BNF notation: STATIC-KNOWN(,)

The STATIC-KNOWN encoding method is a special version of STATIC encoding that can be used when a field always takes one well-known value. The length and integer value of the field are given as parameters after the encoding.

The STATIC-KNOWN command does not need a probability parameter as it is automatically used to compress the field with 100% certainty. If the STATIC-KNOWN encoding method is assigned to a field using the "encode" command then the field MUST NOT be assigned any other encoding methods. Note that this is true for all fields that are not given probability parameters.

Since the STATIC-KNOWN fields take only one constant value for any packet, the compressor MAY wish to use them for profile selection.
It is not possible to compress a certain header using a certain ROHC profile unless the STATIC-KNOWN fields in the header take the values specified in the profile. Conversely, if every STATIC-KNOWN field in the header takes the specified value then it is likely that the ROHC profile can be used to successfully compress the header. More detail on profile selection can be found in Appendix A.1.1.

6.1.2. STATIC-UNKNOWN

BNF notation: STATIC-UNKNOWN()

The STATIC-UNKNOWN encoding method is a special version of STATIC encoding that can be used when a field always takes one value. Unlike the STATIC-KNOWN encoding method this value is not known a-priori and must be transmitted to the decompressor using an IR packet. For this reason, only the length of the field (not its value) is given as a parameter.

Since the STATIC-UNKNOWN fields indicate the flow to which a packet belongs, the compressor MAY wish to use them for context selection. It is not possible to compress a certain header using a certain ROHC context unless the STATIC-UNKNOWN fields in the header take the values specified in the context. Conversely, if every STATIC-UNKNOWN field in the header takes the value specified in the context then it is likely that the ROHC context can be used to successfully compress the header. More detail on context selection can be found in Appendix A.1.1.

6.2. IRREGULAR

BNF notation: IRREGULAR()

The IRREGULAR encoding method is used when the field cannot be compressed relative to the context, and hence must be transmitted in full in the compressed header. The IRREGULAR encoding method is followed by a length parameter to indicate the length of the field in bits.

6.3. VALUE

BNF notation: VALUE(,)

The VALUE encoding method can be used to transmit one particular value for a field. It is followed by parameters to indicate the length and integer value of the field.
Note that the VALUE encoding method is similar to STATIC-KNOWN encoding except that the field does not have to be compressed using VALUE encoding for 100% of the time. Consequently an additional probability parameter is included to give the percentage of time for which the field is expected to be compressed using VALUE encoding.

6.4. LSB

BNF notation: LSB(,)

The LSB encoding method compresses the field by transmitting only its LSBs (Least Significant Bits). The first parameter indicates the number of LSBs to transmit in the compressed header. The second parameter is the offset of the LSBs: it describes whether the decompressor should interpret the LSBs as increasing or decreasing the field value contained in its context.

To illustrate how the second parameter works, suppose that k LSBs are transmitted with offset p. The decompressor uses these LSBs to replace the k LSBs of the field stored in context[current_field], and then adds or subtracts multiples of 2^k so that the new field value lies between context[current_field] - p and context[current_field] - p + 2^k - 1.

In particular, if p = 0 then the field value can only stay the same or increase. If p = -1 then it can only increase, whereas if p = 2^k then it can only decrease.

Recall that for robustness the compressor can store r values for each field in its context. If this is the case then enough LSBs are transmitted so that the decompressor can reconstruct the correct field value, no matter which of the r values it has stored in its context. This is equivalent to Window-based LSB encoding as described in [ROHC].

A modified version of LSB encoding is given below:

6.4.1. LSB-PADDED

BNF notation: LSB-PADDED(,)

The LSB-PADDED encoding method compresses any field that is padded with initial zeroes. The encoding transmits a certain number of LSBs (Least Significant Bits) of the field.
The first parameter gives the overall length of the field, whilst the next parameter specifies the number of LSBs to be transmitted in the compressed header. The bits not transmitted are all taken to be 0 by the decompressor.

The LSB-PADDED encoding method is useful for compressing fields that take small integer values with a high probability. Note that unlike conventional LSB encoding, it is not reliant on the context since the bits not transmitted are always taken to be 0.

6.5. UNCOMPRESSED

BNF notation: UNCOMPRESSED(,,,)

The UNCOMPRESSED encoding method transmits a field uncompressed without alteration. All uncompressed fields are transmitted as-is at the end of the compressed header.

The UNCOMPRESSED encoding method differs from the IRREGULAR encoding method in that the size of the field is not fixed, but instead is specified by a control field. The first parameter gives the length n of the control field: UNCOMPRESSED encoding obtains this control field simply by removing the first n bits from the control field heap.

The next three parameters specify a divisor, multiplier and offset for the control field. These parameters scale the value of the control field so that it specifies the exact size of the UNCOMPRESSED field in bits. If the parameters are d, m and p respectively then:

    size of UNCOMPRESSED field = floor(control field value / d) * m + p

UNCOMPRESSED encoding is usually used in conjunction with INFERRED encoding, which writes to the control field heap as explained below:

6.6. INFERRED

BNF notation: INFERRED(,)

The INFERRED encoding method is used to infer the value of a field from other fields in the uncompressed packet. Since INFERRED encoding exploits the redundancy between fields in the uncompressed header, it does not refer to the context and does not require any bits to be transmitted in the compressed packet.
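The size rule for UNCOMPRESSED fields in Section 6.5 can be illustrated with a short Python sketch. It is not part of the normative pseudocode in Appendix A.4: the function names are hypothetical, the control field heap is modelled as a plain list of bits, and reading the n bits as an unsigned integer with the most significant bit first is an assumption.

```python
def take_control_field(heap, n):
    """Remove the first n bits from the heap and return them as an integer.

    Assumption: the heap is a list of 0/1 values and the bits are read
    most significant bit first.
    """
    if n > len(heap):
        raise ValueError("control field heap holds fewer than n bits")
    value = 0
    for bit in heap[:n]:
        value = (value << 1) | bit
    del heap[:n]  # the control field is consumed
    return value


def uncompressed_size_bits(control_value, d, m, p):
    """size = floor(control field value / d) * m + p, in bits."""
    return (control_value // d) * m + p


# Hypothetical parameters: a 4-bit control field counting 32-bit words.
heap = [0, 0, 1, 1, 1, 0]
words = take_control_field(heap, 4)              # 0b0011 = 3
size = uncompressed_size_bits(words, 1, 32, 0)   # 3 * 32 = 96 bits
```

With d = 1, m = 32 and p = 0 the control field simply counts 32-bit words; the floor only matters when the divisor d is greater than 1.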
The basic version of the INFERRED encoding method infers the field value directly from the control data. The first parameter specifies the number of bits which are copied to the control field heap at the compressor (and extracted from the control field heap at the decompressor).

For example, the Authentication Data in the AH header can be compressed as follows:

    encode AH_Length as INFERRED(8)
    ...
    ...
    encode Auth_Data as UNCOMPRESSED(8,32,1,64)

At the compressor the AH Length field is copied to the control data, where it is used to determine the size of the Auth Data field. This is sent uncompressed at the end of the compressed packet. At the decompressor the Auth Data field is retrieved, and its size is used to infer the value of the AH Length field.

The second parameter specifies the position at which the field is inserted into the control data heap. This is useful for controlling the order in which control fields are extracted from the heap. The second parameter can be omitted, in which case it is taken to be 0 (and hence the control field is inserted at the front of the heap where it will be the first to be extracted).

As the inference rule is often more complicated than simply copying the field value directly, the following modified versions of INFERRED encoding are available:

6.6.1. INFERRED-TRANSLATE

BNF notation: INFERRED-TRANSLATE(,, {,} )

The INFERRED-TRANSLATE encoding method infers the field value from a specified field, but additionally translates it under a certain mapping.

As with the basic INFERRED encoding, the first two parameters specify the length of the field and the bit offset to which it is copied in the control field heap. These parameters are followed by additional pairs of integers, representing the field value before and after it is translated.
For example:

    encode GRE_Protocol as INFERRED-TRANSLATE(16,4, 41,0x86DD,4,0x0800)

The GRE Protocol field behaves in the same manner as the Next Header field in other extension headers, except that it indicates that the subsequent header is IPv6 or IPv4 using the values 0x86DD and 0x0800 instead of 41 and 4. The INFERRED-TRANSLATE encoding method can convert the standard values (as provided by LIST compression defined in Section 6.9) into the values required by the GRE Protocol field.

6.6.2. INFERRED-SIZE

BNF notation: INFERRED-SIZE(,)

The INFERRED-SIZE encoding method infers the value of a field from the size of the uncompressed packet. The first parameter specifies the uncompressed field length in bits, and the second parameter specifies the offset of the uncompressed packet size.

If the INFERRED-SIZE field value is v, the offset is p and the total packet length after (but not including) the INFERRED-SIZE field is L then the following equation applies:

    L = 8 * v + p

6.6.3. INFERRED-OFFSET

BNF notation: INFERRED-OFFSET(,)

The INFERRED-OFFSET encoding method compresses a field that usually has a constant offset relative to a certain base field.

The first parameter describes the length of the field to be compressed, whilst the second parameter is an indicator flag. If the flag is set to 1 then the base field is copied from the control field heap, but if it is set to 0 or omitted then the base field is assumed to be a special control field called the Master Sequence Number (MSN). This field is stored separately from the control field heap, and increases by 1 for each subsequent packet.

The encoding subtracts the base field from the field to be compressed, takes the results modulo the field length, and replaces the field value by these "offset" bits in the uncompressed header. The offset bits are then compressed by the next encoding method in the input code.
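The offset computation described above can be sketched in Python as follows. This is an illustration only, not the normative pseudocode of Appendix A.4: the function names are hypothetical, and "modulo the field length" is read here as arithmetic modulo 2^n for an n-bit field, which is an assumption.

```python
def offset_bits(field, base, length_bits):
    """Compressor side: replace the field by (field - base) mod 2^n."""
    return (field - base) % (1 << length_bits)


def restore_field(offset, base, length_bits):
    """Decompressor side: add the base field back, again mod 2^n."""
    return (base + offset) % (1 << length_bits)


# A 32-bit field that tracks the MSN with a constant offset.
first = offset_bits(1000, 7, 32)    # 993
second = offset_bits(1001, 8, 32)   # 993 again, so the offset is unchanged
# Wrap-around: the field is numerically smaller than the base.
wrapped = offset_bits(5, 10, 32)    # 2**32 - 5
```

When the field advances in step with the base field, the offset bits do not change from one header to the next, so the encoding method applied to them can usually treat them as static.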
For example, a typical sequence number can be compressed as follows:

    encode Seq_Number as INFERRED-OFFSET(32)
    encode Seq_Number.Offset as STATIC 99% C
                             or LSB(8,-1) 0.7%
                             or LSB(16,-1) 0.2% C
                             or IRREGULAR(32) 0.1%

In this case the offset field is expected to change rarely and only by small amounts, and hence it is compressed using LSB encoding.

Note that the name given to the offset bits does not affect how EPIC-LITE compresses the header, but for readability they are usually given the same name as the original field with an additional ".Offset" suffix.

6.6.4. INFERRED-SCALED

BNF notation: INFERRED-SCALED()

The INFERRED-SCALED encoding method compresses a field that usually increases by a fixed value for consecutive headers.

Note that this encoding is a more powerful version of INFERRED-OFFSET encoding, with two additional variables called scale and NBO. Each time the base field increases by 1, the field marked as INFERRED-SCALED increases by the value contained in scale (for INFERRED-OFFSET the scale factor is always 1). Additionally, NBO specifies whether the Network Byte Order of the field should be reversed before the scale factor is added.

The precise value of scale and NBO does not affect interoperability (since there is always a suitable value for the offset bits given any choice of scale and NBO), and so the decision on when to change the scale factor and the Network Byte Order is implementation-specific.

Once the scale, NBO and offset bits have been determined, they are used to replace the original field in the uncompressed header (and subsequently compressed by the next encoding methods encountered).
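The relationship between the scale factor and the offset bits can be sketched in Python. The sketch is an illustration under stated assumptions rather than the normative definition: the function names are hypothetical, arithmetic is taken modulo 2^n for an n-bit field, and the NBO byte-order reversal is deliberately left out.

```python
def scaled_offset(field, base, scale, length_bits):
    """Compressor side: offset = (field - scale * base) mod 2^n."""
    return (field - scale * base) % (1 << length_bits)


def scaled_restore(offset, base, scale, length_bits):
    """Decompressor side: field = (scale * base + offset) mod 2^n."""
    return (scale * base + offset) % (1 << length_bits)


# A 32-bit field growing by 1460 per packet while the base grows by 1.
a = scaled_offset(100000, 10, 1460, 32)   # 85400
b = scaled_offset(101460, 11, 1460, 32)   # 85400, offset stays constant
# A poorly chosen scale still round-trips; the offset just changes more.
c = scaled_offset(101460, 11, 1, 32)
```

This is consistent with the statement that the choice of scale does not affect interoperability: any scale yields offset bits from which the field can be restored, and a well-chosen scale merely makes those bits cheaper to compress.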
For example the TCP Sequence Number can be compressed as follows:

    encode Seq_Number as INFERRED-SCALED(32)
    encode Seq_Number.Scale as LSB-PADDED(32,0) 99% C
                            or LSB-PADDED(32,16) 0.9% C
                            or LSB-PADDED(32,32) 0.1%
    encode Seq_Number.NBO as VALUE(1,0) 100%
    encode Seq_Number.Offset as STATIC 77% C
                             or LSB(11,-1) 22% C
                             or LSB(16,-2049) 0.9% C
                             or IRREGULAR(32) 0.1%

6.6.5. INFERRED-IP-CHECKSUM

BNF notation: INFERRED-IP-CHECKSUM

The INFERRED-IP-CHECKSUM encoding method compresses the IPv4 checksum only. The field value is discarded at the compressor and recalculated at the decompressor.

Note that this inference rule can be safely applied to the IPv4 checksum field since it is already recalculated on a per-hop basis. However a similar rule SHOULD NOT be used to infer the TCP checksum as this would violate the end-to-end integrity of the TCP packet.

In the input code, the INFERRED-IP-CHECKSUM encoding method MUST be applied before any other fields in the IPv4 header are compressed. This ensures that it will be decompressed last, when the inference rule can be applied successfully since the IPv4 header is available.

6.6.6. INFERRED-ESP-PADDING

BNF notation: INFERRED-ESP-PADDING()

The INFERRED-ESP-PADDING encoding method infers the value of the ESP padding field when it takes the form of an increasing sequence of integers. The parameter specifies another encoding method that is used to compress the remainder of the ESP trailer.

6.7. OPTIONAL

BNF notation: OPTIONAL()

The OPTIONAL encoding method is used to compress fields that are optionally present in the uncompressed header.

OPTIONAL encoding requires a 1 bit indicator flag to specify whether or not the optional field is present in the uncompressed header. This flag is extracted from the control field heap. The value of the flag is appended to the heap by another encoding method (such as INFERRED or LIST).
For example:

    encode GRE_Key_Flag as INFERRED(1,2)
    encode GRE_Key as OPTIONAL(KEY-ENCODING)

In this case the encoding method KEY-ENCODING is called to compress the GRE Key field, but only if the Key Flag is set to 1. If the Key Flag is set to 0 (indicating that the GRE Key is not present) then no action is taken.

6.8. CONTEXT

BNF notation: CONTEXT(,)

The CONTEXT encoding method is used to store multiple copies of the same field in the context. This encoding method is useful when compressing fields that take a small number of values with high probability, but when these values are not known a-priori.

CONTEXT encoding can also be applied to larger fields: even the entire TCP header. This can be very useful when multiple TCP flows are sent to the same IP address, as a single [ROHC] context can be used to compress the packets in all of the TCP flows.

The first parameter specifies the encoding method that should be used to compress the field. The second parameter specifies how many copies of the field should be stored in the context.

CONTEXT encoding applies the specified encoding method to the uncompressed header, compressing relative to any of the copies of the field stored in its context. It then appends an "index" value to the uncompressed header to indicate to the decompressor which context value should be used for decompression.

Consider the following example using the TCP Window field:

    encode Window as CONTEXT(TCP-WINDOW,4)
    encode Window.Index as VALUE(2,0) 89%
                        or VALUE(2,1) 10%
                        or VALUE(2,2) 0.5%
                        or VALUE(2,3) 0.5%

At most 4 copies of the Window field can be stored in the context. The Window field can be compressed relative to any of these values: the value chosen by the compressor is transmitted to the decompressor using the "index".

6.9. LIST

BNF notation: LIST(,,,{,})

The LIST encoding method compresses a list of items that do not necessarily occur in the same order for every uncompressed header. Example applications for the LIST encoding method include TCP options and TCP SACK blocks.

The maximum size of the list is determined by a control field in exactly the same manner as for UNCOMPRESSED encoding. The first four integer parameters are defined as in UNCOMPRESSED. These parameters are followed by a set of encoding methods that can be used to compress individual items in the list.

In general LIST tries each encoding method in turn until it finds one that successfully compresses the next item in the list. If an item is encountered that cannot be compressed by any of the encoding methods provided, and if the maximum list size has not yet been reached, then LIST encoding fails. Once the maximum list size is reached, LIST encoding appends the order in which the encoding methods were applied to the uncompressed data.

For example:

    encode TCP_Options as LIST(4,1,32,0, OPTIONAL(TCP-SACK),
                                         OPTIONAL(TCP-TIMESTAMP),
                                         OPTIONAL(TCP-END),
                                         OPTIONAL(TCP-GENERIC))
    encode TCP_Options.Order as STATIC 50% C
                             or IRREGULAR(5) 50% D

Note that the number of "order" bits appended to the uncompressed data is equal to

    ceiling(log2(k * (k - 1) * (k - 2) * ... * 1))

where k is the number of encoding methods provided. With the four encoding methods in the example above this gives ceiling(log2(24)) = 5 bits, which is why the order field is encoded as IRREGULAR(5). This information carries to the decompressor the order in which to reconstruct the fields.

6.9.1. LIST-NEXT

BNF notation: LIST-NEXT(,{,}{,})

LIST-NEXT encoding is similar to basic LIST encoding, except that the next list item to compress is known a-priori from a control field. IP extension headers can be compressed using LIST-NEXT.

The first parameter specifies the number of bits to extract from the control field heap before each list item is compressed.
This is followed by the set of encoding methods available to LIST-NEXT and a set of 0 or more integers. The nth encoding method can only be called when the nth integer value is obtained from the control field heap.

For example:

    encode Header_Chain as LIST-NEXT(8, AH-ENCODING,
                                        ESP-ENCODING,
                                        GRE-ENCODING,
                                        GENERIC-ENCODING,
                                        51,50,47)
    encode Header_Chain.Order as STATIC 50% C
                              or IRREGULAR(5) 50% D

The IP extension header chain has a number of specific encoding methods designed for one type of extension header (AH, ESP or GRE) as well as a "generic" encoding method that can cope with arbitrary extension headers.

Just as with basic LIST encoding, LIST-NEXT also appends the order in which the encoding methods are applied to the uncompressed header, so that the decompressor can reconstruct the list in the correct order.

6.10. Flag encoding methods

The flag encoding methods are used to modify the behavior of another encoding method. Each flag encoding has a single parameter, which is the name of another encoding method. The flag encoding calls this encoding method, but additionally modifies the input or output in some manner.

Note that flag encoding methods do not require the original encoding method to be rewritten (as they only modify its input or output).

6.10.1. C flag

BNF notation: C

The C flag is used to make encoding methods available to a CO packet only. In the IR and IR-DYN packets, any encoding method marked with a C flag is ignored.

An example of the C flag in action is given below:

    encode Hop_Limit as STATIC 99% C
                     or IRREGULAR(8) 1%

In the CO packets the Hop Limit field has a 99% probability of remaining static and a 1% probability of changing. However, in the IR and IR-DYN packets the Hop Limit field is treated as IRREGULAR and always transmitted in full.

6.10.2. D flag

BNF notation: D

The D flag is used to make encoding methods available to an IR(-DYN) packet only.
In the CO packets, any encoding method marked with a D flag is ignored.

6.10.3. N flag

BNF notation: N

The N flag runs the encoding method specified by its parameter, with the exception that it does not update the context. This is useful when a field takes an unexpected value for one header and then reverts back to its original behavior in subsequent headers.

An example of the N flag in use is given below:

    encode Window as STATIC 99% C
                  or LSB(11,2048) 0.9% N C
                  or IRREGULAR(16) 0.1%

In the above example the N flag is applied to the TCP Window field. The field is compressed by transmitting only the last few LSBs, which are always interpreted at the decompressor as a decrease in the field value. However, because the context is not updated the field reverts back to its original value following the decrease.

This reflects the behavior of the TCP Window, which usually takes a constant value (the maximum window size) but occasionally takes a value slightly lower than this when packets are queued at the receiver.

6.11. FORMAT

BNF notation: FORMAT(, {} )

The FORMAT encoding method is used to create more than one set of compressed header formats.

Recall that each set of compressed header formats uses up all of the indicator bit patterns available at the start of the compressed header. Thus a profile can have several sets of compressed header formats, but only one set can be in use at a given time.

FORMAT encoding is followed by a list of k encoding methods. Each encoding method is given its own set of compressed header formats in the CO packets. Note however that all encoding methods are present in the IR(-DYN) packets, so an IR(-DYN) packet may be sent to change to a new set of compressed header formats.
An index flag is appended to the uncompressed header to indicate which set of formats is currently in use, as illustrated by the following example:

    encode IP_ID as FORMAT(SEQUENTIAL-IP-ID, RANDOM-IP-ID)
    encode IP_ID.Index as IRREGULAR(1) 100% D

Two sets of compressed header formats are provided: one for an IP ID that increases sequentially, and one for a randomly behaving IP ID. Note that the Index flag is only sent in the IR(-DYN) packets.

6.12. CRC

BNF notation: CRC()

The CRC encoding method generates a CRC checksum calculated across the entire uncompressed header. At the decompressor this CRC is used to validate that correct decompression has occurred.

Note that it is possible for different header formats to have different amounts of CRC protection, so extra CRC bits can be allocated to protect important context-updating information. This is illustrated in the example below:

    encode Checksum_Coverage as CRC(3) 99%
                             or CRC(7) 1%

The uncompressed header is recorded in the crc_static and crc_dynamic variables. Note that the fields encoded as STATIC-KNOWN or STATIC-UNKNOWN are placed in crc_static, and the remaining fields in crc_dynamic. The CRC is calculated over crc_static + crc_dynamic, with the static fields placed first to speed up computation.

In general an EPIC-LITE profile can use any CRC length for which a CRC polynomial has been explicitly defined. The following CRC lengths are currently supported:

    3-bit:  C(x) = 1 + x + x^3
    6-bit:  C(x) = 1 + x + x^3 + x^4 + x^6
    7-bit:  C(x) = 1 + x + x^2 + x^3 + x^6 + x^7
    8-bit:  C(x) = 1 + x + x^2 + x^8
    10-bit: C(x) = 1 + x + x^4 + x^5 + x^9 + x^10
    12-bit: C(x) = 1 + x + x^2 + x^3 + x^11 + x^12
    16-bit: C(x) = 1 + x^2 + x^15 + x^16

7. Creating a new ROHC profile

This chapter describes how to generate new [ROHC] profiles using EPIC-LITE.
It is important that the profiles are specified in an unambiguous manner so that any compressor and decompressor using the profiles will be able to interoperate.

The following eight variables are required by EPIC-LITE to create a new [ROHC] profile:

    profile_identifier
    max_formats
    max_sets
    bit_alignment
    npatterns
    CO_packet
    IR-DYN_packet
    IR_packet

Once a value has been assigned to each variable the profile is well-defined. A compressor and decompressor using the same values for each variable should be able to successfully interoperate.

Each of the variables is described in more detail below:

7.1. Profile identifier

The profile_identifier is a 16-bit integer that is used when negotiating a common set of profiles between the compressor and decompressor. Official profile identifiers are assigned by IANA to ensure that two distinct profiles do not receive the same profile identifier.

Note that the 8 MSBs of the profile identifier are used to specify the version of the profile (so that old profiles can be obsoleted by new profiles).

7.2. Maximum number of header formats

The max_formats parameter controls the number of compressed header formats to be stored at the compressor and decompressor. If more compressed header formats are generated than can be stored then EPIC-LITE discards all but the max_formats most probable formats to be used.

Note that the max_formats parameter affects the EPIC-LITE compressed header formats, and so for interoperability it MUST be specified as part of the profile.

In a similar manner the max_sets parameter controls the total number of sets of compressed header formats to be stored. Recall that a profile can have several sets of compressed header formats, but only one set may be in use at a given time.
It is important to note that the maximum size specified by max_formats applies to each individual set of header formats, so the total overall number of formats that need to be stored is equal to max_formats * (max_sets + 2), including the 2 sets of formats for the IR and IR-DYN packets.

7.3. Control of header alignment

The alignment of the compressed headers is controlled using the bit_alignment parameter. All of the compressed headers produced by EPIC-LITE are guaranteed to be an integer multiple of bit_alignment bits long.

Additionally, the parameter npatterns can be used to reserve bit patterns in the compressed header. The parameter specifies the number of bit patterns in the first word (i.e. the first bit_alignment bits) of the compressed header that are available for use by EPIC-LITE. Consequently npatterns takes a value between 1 and 2^bit_alignment.

For compatibility with [ROHC], it is important for EPIC-LITE not to use the bit patterns 111XXXXX in the first octet of each compressed header because they are reserved by the ROHC framework. So to produce a set of header formats compatible with [ROHC] the bit_alignment parameter MUST be set to 8 and npatterns MUST be set to 224.

7.4. Compressed header formats

The profile parameter CO_packet specifies an encoding method that is used to generate the EPIC-LITE CO packets. This encoding method may be described using the input language provided in Chapter 4 (or in fact can be described in any manner provided that it is unambiguous).

The distinction between the eight variables required to define a new [ROHC] profile and the input language defined in Chapter 4 is important. The only requirement for compatibility with the profile is that the correct compressed header formats are used: the fact that they are specified in the input language is not important, and they can be implemented in any manner.
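The parameter arithmetic of Sections 7.2 and 7.3 can be checked with a few lines of Python. The helper names are hypothetical and the sketch is only a sanity check; the values 500 and 6 are those given for the TCP/IP profile in Chapter 9.

```python
def available_patterns(bit_alignment, reserved_prefix):
    """Count first-word bit patterns that do not begin with reserved_prefix.

    reserved_prefix is a string of bits such as "111".
    """
    total = 1 << bit_alignment
    reserved = 1 << (bit_alignment - len(reserved_prefix))
    return total - reserved


def total_formats(max_formats, max_sets):
    """Formats stored overall, including the IR and IR-DYN sets."""
    return max_formats * (max_sets + 2)


npatterns = available_patterns(8, "111")   # 256 - 32 = 224
stored = total_formats(500, 6)             # 500 * 8 = 4000
```

The first result reproduces the required npatterns value of 224 for an 8-bit alignment once the 32 patterns of the form 111XXXXX are excluded.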
The profile parameters IR_packet and IR-DYN_packet specify an encoding method which is used to generate the EPIC-LITE IR and IR-DYN packets respectively.

Note that the IR-DYN_packet parameter is optional. If it is not given then EPIC-LITE generates the IR-DYN packet using the same encoding method as specified by the CO_packet parameter.

The IR_packet parameter is also optional. If it is not given then EPIC-LITE generates the IR packet using the same encoding method as specified by the IR-DYN_packet parameter (or CO_packet if IR-DYN_packet is also not given).

8. EPIC-LITE state machine

EPIC-LITE currently runs in a unidirectional mode of operation, with packets sent from the compressor to the decompressor only. This means that EPIC-LITE is usable over links where a return path from the decompressor to compressor is unavailable or undesirable.

Transitions between compressor states are performed only on account of periodic timeouts and irregularities in the uncompressed packet stream.

8.1. Compression and decompression states

Header compression with EPIC-LITE can be characterized as an interaction between two state machines (one compressor machine and one decompressor machine) each instantiated once per context.

The compressor and decompressor each have three states of operation. Both machines start in the lowest compression state and transit gradually to higher states.

Transitions need not be synchronized between the two machines. In normal operation it is only the compressor that temporarily transits back to lower states. The decompressor will not transit back unless context damage is detected.

Subsequent sections present an overview of the state machines and their corresponding states, respectively, starting with the compressor.

8.2. Compressor states

For EPIC-LITE compression the three compressor states are the Initialization and Refresh (IR), Dynamic Initialization and Refresh (IR-DYN) and Compressed (CO) states. The compressor starts in the lowest compression state (IR) and transits gradually to higher compression states.

The compressor will always operate in the highest compression state if possible, under the constraint that the compressor is sufficiently confident that the decompressor has the information necessary to decompress a header compressed according to that state.

    +----------+            +--------------+            +----------+
    | IR State | <--------> | IR-DYN State | <--------> | CO State |
    +----------+            +--------------+            +----------+

Note that the three compressor states correspond precisely to the three types of packet available to EPIC-LITE. The state machine dictates which packet format can be used to compress a header (CO packets in the CO state, IR-DYN packets in the IR-DYN state and IR packets in the IR state).

Decisions about transitions between the compression states are taken by the compressor on the basis of:

    - variations in packet headers
    - periodic timeouts

How transitions are performed is explained in detail in Appendix A.1.2.

8.2.1. Initialization and Refresh (IR) State

The purpose of the IR state is to initialize the static parts of the context at the decompressor or to recover after failure. In this state the compressor sends IR packets only. This packet type includes all static and changing fields in uncompressed form plus any additional information needed. The compressor stays in the IR state until it is confident that the decompressor has received the static information correctly.

8.2.2. Dynamic Initialization and Refresh (IR-DYN) State

The purpose of the IR-DYN state is to initialize or refresh any part of the context except for the fields which remain constant throughout the lifetime of the context.
In this state the compressor sends IR-DYN packets only. This packet type includes all changing fields in uncompressed form plus any additional information needed. The compressor stays in the IR-DYN state until it is confident that the decompressor has received the dynamic information correctly.

8.2.3. Compressed (CO) State

The purpose of the CO state is to efficiently communicate irregularities in the packet stream. When operating in this state, the compressor rarely sends information about all fields, and the information sent is usually compressed at least partially. The compressor enters this state from the IR or IR-DYN state and stays until the header no longer conforms to the uniform pattern and cannot be independently compressed on the basis of previous context information.

Some or all packets sent in the CO state carry context updating information. It is very important to detect corruption of such packets to avoid erroneous updates and context inconsistencies. This is accomplished at the decompressor as explained in Section 8.3.

8.3. Decompressor states

The decompressor starts in its lowest state, "No Context", and transits to higher states. The decompressor state machine normally never leaves the "Full Context" state once it has entered this state.

   +--------------+       +----------------+       +--------------+
   |  No Context  | <---> | Static Context | <---> | Full Context |
   +--------------+       +----------------+       +--------------+

Initially, while working in the "No Context" state, the decompressor has not yet successfully decompressed a packet. Once a packet has been decompressed correctly (for example, upon reception of an IR packet) the decompressor can transit all the way to the "Full Context" state, and only upon repeated failures will it transit back to lower states. However, when that happens it first transits back to the "Static Context" state.
There, reception of any packet sent with enough CRC protection is normally sufficient to enable transition to the "Full Context" state again. Only when decompression of several packets fails in the "Static Context" state will the decompressor go all the way back to the "No Context" state.

How transitions are performed is explained in detail in Appendix A.2.2.

9. ROHC Profile for compression of TCP/IP

This chapter describes a ROHC profile for the compression of TCP/IP.

Note that the probabilities listed for each encoding method are initial estimates only. These need to be refined with more accurate values from genuine TCP/IP streams.

The profile for TCP/IP compression is given below:

   profile_identifier      0xFFFF
   max_formats             500
   max_sets                6
   bit_alignment           8
   npatterns               224
   CO_packet               TCP/IP-ENCODING

   # Comments preceded with a "#" are ignored in the input language.

   # The profile identifier is a placeholder.

   # The IR-DYN_packet and IR_packet parameters are not specified. This
   # means that the IR-DYN and IR packets are generated using the same
   # encoding method TCP/IP-ENCODING as for the CO packets.

   # The encoding methods used by the TCP/IP profile are given below:

   method TCP/IP-ENCODING

      encode Base_IP_Header        as  FORMAT(IPv6-ENCODING, IPv4-ENCODING)

      encode Base_IP_Header.Index  as  IRREGULAR(1)   100% D

   # The profile constructs a separate set of compressed header formats
   # for IPv6 and for IPv4. Recall that the "index" field determines
   # which set of header formats is currently in use. This field can be
   # updated by sending an IR or IR-DYN packet.
encode Header_Chain as LIST-NEXT(8, OPTIONAL(AH-ENCODING), OPTIONAL(ESP-ENCODING), OPTIONAL(GRE-ENCODING), OPTIONAL(IPv6-ENCODING), OPTIONAL(IPv4-ENCODING), OPTIONAL(AH-ENCODING), OPTIONAL(ESP-ENCODING), OPTIONAL(GRE-ENCODING), TCP-ENCODING, OPTIONAL(GENERIC-ENCODING), OPTIONAL(GENERIC-ENCODING), 51,50,47,41,4,51,50,47,6) # This is a lot simpler than it first appears. Honest! Price et al. [PAGE 31] INTERNET-DRAFT TCP/IP Compression for ROHC 9 July, 2001 # The header chain following the base IP header can contain IPv4, # IPv6, AH, ESP, GRE and TCP headers, as well as generic IP extension # headers for which a dedicated encoding method is not provided. All # headers are optional except for TCP. The integer values following # the available encoding methods are used to pick which encoding is # called next: when the "Next Header" control field takes the nth # integer value then the nth encoding method is called. encode Extension_Hdrs.Order as STATIC 100% C or IRREGULAR(26) 100% D encode Checksum_Coverage as CRC(3) 99.9% C or CRC(7) 0.1% C # A 3-bit CRC and a 7-bit CRC are available in the CO packets # depending on how much robustness is required. Note that an 8-bit # CRC for the IR and IR-DYN packet is provided by the ROHC framework. encode MSN as LSB(4,0) 99% C or LSB(7,112) 0.9% N C or IRREGULAR(16) 0.1% end_method method IPv6-ENCODING encode Version as STATIC-KNOWN(4,6) encode Traffic_Class as STATIC 99.9% C or IRREGULAR(6) 0.1% encode ECT_Flag as STATIC 99.9% C or IRREGULAR(1) 0.1% encode CE_Flag as VALUE(1,0) 99% C or VALUE(1,1) 1% # The profile is designed to efficiently compress TCP stacks using # ECN. However, legacy stacks where ECN is not used are also covered. encode Flow_Label as STATIC-UNKNOWN(20) encode Payload_Length as INFERRED-SIZE(16,288) encode Next_Header as INFERRED(8) encode Hop_Limit as STATIC 99% C or IRREGULAR(8) 1% encode Source_Address as STATIC-UNKNOWN(128) encode Destination_Address as STATIC-UNKNOWN(128) end_method Price et al. 
[PAGE 32] INTERNET-DRAFT TCP/IP Compression for ROHC 9 July, 2001 method IPv4-ENCODING encode Checksum as INFERRED-IP-CHECKSUM # The checksum is compressed first as explained in Section 6.6.5. encode Version as STATIC-KNOWN(4,4) encode Header_Length as STATIC-KNOWN(4,5) encode TOS as STATIC 99.9% C or IRREGULAR(6) 0.1% encode ECT_Flag as STATIC 99.9% C or IRREGULAR(1) 0.1% encode CE_Flag as VALUE(1,0) 99% C or VALUE(1,1) 1% encode Total_Length as INFERRED-SIZE(16,-16) encode IP_ID as FORMAT(SEQUENTIAL-IP-ID, RANDOM-IP-ID) encode IP_ID.Index as IRREGULAR(1) 100% D # The profile contains a separate set of compressed header formats # for a sequential IP ID and for a random IP ID. encode Reserved_Flag as STATIC-KNOWN(1,0) encode DF_Flag as STATIC 99.9% C or IRREGULAR(1) 0.1% encode MF_Flag as STATIC-UNKNOWN(1) encode Fragment_Offset as STATIC-KNOWN(13,0) encode TTL as STATIC 99% C or IRREGULAR(8) 1% encode Protocol as INFERRED(8) encode Source_Address as STATIC-UNKNOWN(32) encode Destination_Address as STATIC-UNKNOWN(32) end_method # The Header Length field is not static when IP options are used. The # profile could be extended to handle this possibility if needed. # # The profile could also be extended to handle packet fragmentation # if required. Price et al. [PAGE 33] INTERNET-DRAFT TCP/IP Compression for ROHC 9 July, 2001 method SEQUENTIAL-IP-ID encode IP_ID as INFERRED-SCALED(16) encode IP_ID.Scale as VALUE(16,1) 100% # Note that the scale field always takes the value 1; however it # should not be classified as STATIC-KNOWN because it is made up by # the compressor and cannot be used for packet classification. 
encode IP_ID.NBO as STATIC 99.9% C or IRREGULAR(1) 0.1% encode IP_ID.Offset as STATIC 99% C or LSB(5,-1) 0.7% C or LSB(8,-33) 0.2% C or IRREGULAR(16) 0.1% end_method method RANDOM-IP-ID encode IP-ID as IRREGULAR(16) 100% end_method method TCP-ENCODING encode Present as VALUE(1,1) 100% # In the header chain the TCP header is mandatory, and so the # "present" control field always takes the value 1. encode TCP_Header as CONTEXT(TCP-HEADER,4) encode TCP_Header.Index as VALUE(2,0) 75% or VALUE(2,1) 20% or VALUE(2,2) 4% or VALUE(2,3) 1% # A total of 4 copies of the TCP header are stored in the context. # This is so that multiple TCP flows are able to share the same IP # context, which improves the compression efficiency particularly in # the case of short-lived TCP flows. end_method method TCP-HEADER encode Source_Port as STATIC-UNKNOWN(16) encode Destination_Port as STATIC-UNKNOWN(16) encode Seq_Number as INFERRED-SCALED(32) Price et al. [PAGE 34] INTERNET-DRAFT TCP/IP Compression for ROHC 9 July, 2001 encode Seq_Number.Scale as LSB-PADDED(32,0) 99% C or LSB-PADDED(32,16) 0.9% C or LSB-PADDED(32,32) 0.1% encode Seq_Number.NBO as VALUE(1,0) 100% encode Seq_Number.Offset as STATIC 77% C or LSB(11,-1) 22% C or LSB(16,-2049) 0.9% C or IRREGULAR(32) 0.1% encode Ack_Number as INFERRED-SCALED(32) encode Ack_Number.Scale as LSB-PADDED(32,0) 99% C or LSB-PADDED(32,16) 0.9% C or LSB-PADDED(32,32) 0.1% encode Ack_Number.NBO as VALUE(1,0) 100% encode Ack_Number.Offset as STATIC 77% C or LSB(11,-1) 22% C or LSB(16,-2049) 0.9% C or IRREGULAR(32) 0.1% encode Data_Offset as INFERRED(4) encode Reserved as STATIC-KNOWN(4,0) encode CWR_Flag as VALUE(1,0) 99.9% C or VALUE(1,1) 0.1% C or IRREGULAR(1) 100% D encode ECN_E_Flag as STATIC 99% C or IRREGULAR(1) 1% encode URG_Flag as VALUE(1,0) 99.9% C or VALUE(1,1) 0.1% C or IRREGULAR(1) 100% D encode ACK_Flag as VALUE(1,1) 99.9% C or VALUE(1,0) 0.1% C or IRREGULAR(1) 100% D encode PSH_Flag as STATIC 90% C or IRREGULAR(1) 10% encode RST_Flag as 
VALUE(1,0) 99.9% C or VALUE(1,1) 0.1% C or IRREGULAR(1) 100% D encode SYN_Flag as VALUE(1,0) 99.9% C or VALUE(1,1) 0.1% C or IRREGULAR(1) 100% D Price et al. [PAGE 35] INTERNET-DRAFT TCP/IP Compression for ROHC 9 July, 2001 encode FIN_Flag as VALUE(1,0) 95% C or VALUE(1,1) 5% C or IRREGULAR(1) 100% D # Most flags are sent explicitly in the IR(-DYN) packets to reduce # the number of compressed header formats required. encode Window as CONTEXT(TCP-WINDOW,4) encode Window.Index as VALUE(2,0) 89% or VALUE(2,1) 10% or VALUE(2,2) 0.5% or VALUE(2,3) 0.5% encode Checksum as IRREGULAR(16) 100% encode Urgent_Pointer as STATIC 99.9% C or IRREGULAR(16) 0.1% encode TCP_Options as LIST(4,1,32,0, OPTIONAL(TCP-SACK), OPTIONAL(TCP-TIMESTAMP), OPTIONAL(TCP-END), OPTIONAL(TCP-GENERIC)) encode TCP_Options.Order as STATIC 50% C or IRREGULAR(5) 50% D end_method method TCP-WINDOW encode Window as STATIC 99% C or LSB(11,2048) 0.9% N C or IRREGULAR(16) 0.1% end_method method TCP-SACK encode Kind as STATIC-KNOWN(8,5) encode SACK_Length as INFERRED(8) encode Edge as LIST(8,1,8,0, BLOCK,BLOCK,BLOCK,BLOCK) # Each SACK option contains at most 4 SACK blocks. These blocks are # compressed using LIST encoding with the overall size of the list # specified by the SACK Length field. encode Edge.Order as VALUE(5,0) 100% end_method Price et al. [PAGE 36] INTERNET-DRAFT TCP/IP Compression for ROHC 9 July, 2001 method BLOCK encode SACK_Block as VALUE(1,0) 50% or BLOCK-PRESENT 50% # It is assumed that the number of SACK blocks changes on a per- # header basis, and so the Present field that indicates whether a # block is present or absent is sent explicitly in every case # (instead of using OPTIONAL encoding). 
end_method method BLOCK-PRESENT encode Present as VALUE(1,1) 100% encode Left_Edge as INFERRED(32) encode Right_Edge as INFERRED-OFFSET(32,1) encode Base as LSB-PADDED(32,8) 80% C or LSB-PADDED(32,20) 19.9% C or LSB-PADDED(32,32) 0.1% encode Right_Edge.Offset as LSB-PADDED(32,8) 90% C or LSB-PADDED(32,20) 9.9% C or LSB-PADDED(32,32) 0.1% # The Edge fields in each SACK block are compressed using offset # encoding. The Left Edge field is taken as the base of the offset # and is transmitted using LSB encoding. Furthermore, the LSBs of # (Right Edge - Left Edge) are also transmitted so that the Right # Edge field can be inferred. end_method method TCP-TIMESTAMP encode Kind as STATIC-KNOWN(8,8) encode TS_Length as STATIC-KNOWN(8,10) encode TS_Value as LSB(8,-1) 90% C or LSB(16,-1) 9% C or LSB(24,-1) 0.9% C or IRREGULAR(32) 0.1% encode TS_Echo_Reply as LSB(8,-1) 90% C or LSB(16,-1) 9% C or LSB(24,-1) 0.9% C or IRREGULAR(32) 0.1% end_method Price et al. [PAGE 37] INTERNET-DRAFT TCP/IP Compression for ROHC 9 July, 2001 method TCP-END encode Kind as STATIC-KNOWN(8,0) end_method method TCP-GENERIC encode Option_Body as STATIC 80% C or NEW-TCP-OPTION 20% end_method method NEW-TCP-OPTION encode Kind as IRREGULAR(8) 100% encode Length as INFERRED(8) encode Data as UNCOMPRESSED(8,1,8,64) encode Data.Length as IRREGULAR(8) 100% end_method # Encoding methods have now been provided for TCP/IP compression # including TCP options. IP extension headers follow. method AH-ENCODING encode Next_Header as INFERRED(8) encode AH_Length as INFERRED(8) encode Reserved as STATIC-KNOWN(16,0) encode SPI as STATIC-UNKNOWN(32) encode Sequence_Number as SEQ-NUM encode Auth_Data as UNCOMPRESSED(8,32,1,64) encode Auth_Data.Length as STATIC 99% or LSB-PADDED(8,3) 0.9% or IRREGULAR(8) 0.1% # The amount of authentication data varies but is usually 256 octets # or less. Thus it is often more efficient to transmit the length # field as LSBs padded with zeroes rather than sending it in full. 
end_method Price et al. [PAGE 38] INTERNET-DRAFT TCP/IP Compression for ROHC 9 July, 2001 method ESP-ENCODING encode SPI as STATIC-UNKNOWN(32) encode Sequence_Number as SEQ-NUM encode ESP_Trailer as INFERRED-ESP-PADDING(TRAILER) end_method method TRAILER encode Pad_Length as STATIC 90% or IRREGULAR(8) 10% encode Next_Header as INFERRED(8) encode Auth_Data as IRREGULAR(96) 100% end_method method GRE-ENCODING encode C_R_Flags as INFERRED(2) encode Key_Flag as INFERRED(1,2) encode SN_Flag as INFERRED(1,3) encode Strict_Source_Route as STATIC 99.9% C or IRREGULAR(1) 0.1% encode Recursion_Ctrl as STATIC 99.9% C or IRREGULAR(3) 0.1% encode Flags as STATIC-KNOWN(5,0) encode Version as STATIC-KNOWN(3,0) encode Protocol as INFERRED-TRANSLATE(16,4, 41,0x86DD,4,0x0800) encode Checksum as UNCOMPRESSED(2,3,4,16) encode Offset as OFFSET-STATIC 99.8% C or VALUE(2,0) 0.1% D or OFFSET-PRESENT 0.1% D encode Key as OPTIONAL(KEY-ENCODING) encode Seq_Num as OPTIONAL(SEQ-NUM) encode SSR_Entry as LIST(0,0,0,0, NULL-SRE, Price et al. [PAGE 39] INTERNET-DRAFT TCP/IP Compression for ROHC 9 July, 2001 OPTIONAL(NORMAL-SRE), OPTIONAL(NORMAL-SRE)) end_method method OFFSET-STATIC encode Checksum.Length as STATIC 100% encode Offset as STATIC 99.9% C or IRREGULAR(16) 0.1% end_method method OFFSET-PRESENT encode Checksum.Length as IRREGULAR(2) 100% encode Offset as IRREGULAR(16) 100% end_method method KEY-ENCODING encode Key as STATIC 99.9% C or IRREGULAR(32) 0.1% end_method method NULL-SRE encode Address_Family as STATIC-KNOWN(16,0) encode Data_Offset as STATIC 99.9% C or IRREGULAR(8) 0.1% encode Len as STATIC-KNOWN(8,0) encode Present as VALUE(1,1) 100% end_method method NORMAL-SRE encode Address_Family as STATIC 99.9% C or IRREGULAR(16) 0.1% encode Data_Offset as STATIC 99.9% C or IRREGULAR(8) 0.1% encode Len as INFERRED(8) encode Rt_Info as UNCOMPRESSED(8,1,1,8) Price et al. 
[PAGE 40] INTERNET-DRAFT TCP/IP Compression for ROHC 9 July, 2001 encode Rt_Info.Length as STATIC 99% C or IRREGULAR(8) 1% end_method method GENERIC-ENCODING encode Next_Header as INFERRED(8) encode Header as STATIC 80% C or NEW-ITEM 20% end_method method NEW-ITEM encode Item_Length as INFERRED(8) encode Data as UNCOMPRESSED(8,1,8,64) encode Data.Length as IRREGULAR(8) 100% end_method method SEQ-NUM encode Seq_Number as INFERRED-OFFSET(32) encode Seq_Number.Offset as STATIC 99% C or LSB(8,-1) 0.7% or LSB(16,-33) 0.2% C or IRREGULAR(32) 0.1% end_method 10. Security considerations EPIC-LITE generates compressed header formats for direct use in ROHC profiles. Consequently the security considerations for EPIC-LITE match those of [ROHC]. 11. Acknowledgements Header compression schemes from [ROHC] have been important sources of ideas and knowledge. Basic Huffman encoding [HUFF] was enhanced for the specific tasks of robust, efficient header compression. Thanks to Carsten Bormann (cabo@tzi.org) Christian Schmidt (christian.schmidt@icn.siemens.de) Max Riegel (maximilian.riegel@icn.siemens.de) for valuable input and review. Price et al. [PAGE 41] INTERNET-DRAFT TCP/IP Compression for ROHC 9 July, 2001 12. References [ROHC] "RObust Header Compression (ROHC)", Carsten Bormann et al, RFC3095, Internet Engineering Task Force, July 2001 [EPIC] "Enhanced TCP/IP Compression for ROHC", Richard Price et al, , Internet Engineering Task Force, July 9, 2001 [HUFF] "The Data Compression Book", Mark Nelson and Jean-Loup Gailly, M&T Books, 1995 [RFC-1144] "Compressing TCP/IP Headers for Low-Speed Serial Links", V. Jacobson, Internet Engineering Task Force, February 1990 [RFC-1951] "DEFLATE Compressed Data Format Specification version 1.3", P. 
Deutsch, Internet Engineering Task Force, May 1996

[RFC-2026] "The Internet Standards Process - Revision 3", Scott Bradner, Internet Engineering Task Force, October 1996

[RFC-2119] "Key words for use in RFCs to Indicate Requirement Levels", Scott Bradner, Internet Engineering Task Force, March 1997

13. Authors' addresses

   Richard Price     Tel: +44 1794 833681   Email: richard.price@roke.co.uk
   Robert Hancock    Tel: +44 1794 833601   Email: robert.hancock@roke.co.uk
   Stephen McCann    Tel: +44 1794 833341   Email: stephen.mccann@roke.co.uk
   Mark A West       Tel: +44 1794 833311   Email: mark.a.west@roke.co.uk
   Abigail Surtees   Tel: +44 1794 833131   Email: abigail.surtees@roke.co.uk
   Paul Ollis        Tel: +44 1794 833168   Email: paul.ollis@roke.co.uk

   Roke Manor Research Ltd
   Romsey, Hants, SO51 0ZN
   United Kingdom

Appendix A. EPIC-LITE compressor and decompressor

This appendix gives a complete pseudocode description of the EPIC-LITE compressor and decompressor.
The appendix contains the following sections: Compressor operation (Section A.1) Decompressor operation (Section A.2) Offline processing (Section A.3) Library of encoding methods (Section A.4) BNF description of input language (Section A.5) Recall that each EPIC-LITE profile for [ROHC] is described by the following eight variables: profile_identifier 16-bit integer uniquely identifying the ROHC profile generated by EPIC-LITE max_formats Maximum number of header formats per set max_sets Total number of sets of header formats bit_alignment Number of bits for alignment (all compressed headers will be multiples of bit_alignment bits) npatterns Number of bit patterns available for EPIC-LITE in the first word of the compressed header (set to 224 for compatibility with [ROHC]) CO_packet Name of the encoding method that generates the CO packet formats IR-DYN_packet Name of the encoding method that generates the IR-DYN packet formats IR_packet Name of the encoding method that generates the IR packet formats Additionally, one or more encoding methods may be provided using the special input language defined in Chapter 4. The input code for all of the encoding methods (concatenated in any order) should be parsed and the following relevant information should be extracted: encoding_name[i] The name of the ith encoding method referenced by the "encode" commands in the input code encoding_name[i].forwards A pointer to the first encoding method for the field after encoding_name[i] (or 0 if "end_method" is reached) Price et al. 
[PAGE 43] INTERNET-DRAFT TCP/IP Compression for ROHC 9 July, 2001 encoding_name[i].backwards A pointer to the first encoding method for the previous field (or 0 if "method" is reached) encoding_name[i].no_more Set to true if no more encoding methods are available to compress the field after encoding_name[i] is tried encoding_name[i].prob Contains the probability value specified in the encoding method name (or 100% if none is specified) The following two functions are also required: locate-first(encoding_name) For any encoding_name defined using the "method" command, returns a pointer to the first encoding method following the "method" command locate-last(encoding_name) For any encoding_name defined using the "method" command, returns a pointer to the encoding method immediately preceding "end_method" Note that for each field, the order in which the available encoding methods are tested is implementation-specific. Authors of input code SHOULD arrange the encoding methods in the same order as they would recommend them to be tested; authors of a compressor can follow this recommendation or not as they see fit. In addition, the following variables are required by the compression and decompression procedures. Note that all of these variables are global between every procedure at the compressor, and between every procedure at the decompressor. uncompressed_data A bit string containing the entire uncompressed packet including the payload data compressed_data A bit string containing the compressed value of each field control_data A bit string containing the uncompressed control fields current_field A counter keeping track of the current field (begins at 0 and increases by 1 for every new field encountered) hdr[current_field] A bit string for the current field in the uncompressed header context[current_field, i] The ith copy of the current field stored in the context at the compressor Price et al. 
   context[current_field]
      A bit string for the current field stored in the context at the decompressor

   chosen_encoding[current_field]
      The encoding method chosen to compress the current field

   compressor_state
      The current state at the compressor (can be set to "IR", "IR-DYN" or "CO")

   current_set
      The set of compressed header formats currently in use

   crc_static
      The static part of the header

   crc_dynamic
      The dynamic part of the header

   MSN
      The Master Sequence Number

A.1. Compressor

This section describes the EPIC-LITE header compressor.

A.1.1. Step 1: Packet classification

The input to the EPIC-LITE compressor is simply an uncompressed packet. The compressor does not know whether the packet contains an RTP header, TCP header or other type of header, and hence the first step is to determine which (if any) ROHC context can be used to compress the packet.

With any profile generated by EPIC-LITE the packet classification is performed automatically, since the profile will reject any packet that it cannot successfully compress relative to the chosen context.

Note however that additional packet classification MAY be performed before the packet is passed to the EPIC-LITE compressor. For example the compressor MAY wish to check that the STATIC-KNOWN and the STATIC-UNKNOWN fields take the values specified in the prospective context before compression is attempted, since compression will not succeed if they do not.

A.1.2. Step 2: State transition logic at the compressor

The state machine for the compressor is illustrated below. Details of the transitions between states are given after the figure. Note that if compression fails in the chosen state, the compressor can return to Step 2 at any time and choose a different state in which to attempt compression.
                        Optimistic approach
     +------>------>------>------>------>------>------>------+
     |                                                       |
     |    Optimistic approach      Optimistic approach       |
     |   +------>------>------+   +------>------>------+     |
     |   |                    |   |                    |     |
     |   |                    v   |                    v     v
   +----------+         +--------------+            +----------+
   | IR State |         | IR-DYN State |            | CO State |
   +----------+         +--------------+            +----------+
     ^   ^                    |   ^                    |     |
     |   |      Timeout       |   |  Timeout / Update  |     |
     |   +------<------<------+   +------<------<------+     |
     |                                                       |
     |                        Timeout                        |
     +------<------<------<------<------<------<------<------+

   Figure 6: State transition logic at the compressor

The compressor initially begins in the IR state. The transition logic for compression states is based on three principles: the optimistic approach principle, timeouts, and the need for updates.

A.1.2.1. Condition for upwards transition: Optimistic approach

Transition to a higher compression state is carried out according to the optimistic approach principle. This means that the compressor transits to a higher state when it is fairly confident that the decompressor has received enough information to correctly decompress packets sent according to the state.

In EPIC-LITE this is instantiated by choosing the number of field values to store in the context at the compressor. Compression can always be attempted in any state; however it will fail if the context contains insufficient information to transmit the packet in the selected state.

In practice the compressor always remains in CO state except for the following two cases:

A.1.2.2. Condition for downwards transition: Timeouts

When the optimistic approach is taken as described above, there will always be a possibility of failure since the contexts at the compressor and decompressor may have become out of sync. Therefore, the compressor MUST periodically erase the values stored in its context.
This forces the compressor to transit to a lower state (IR or IR-DYN) in order to successfully compress subsequent packets. The interval between such erasures is implementation-specific.

Note that different fields can have different lifetimes in the context: for example STATIC-UNKNOWN fields do not change value for the duration of the context, and thus are unlikely to become out of sync between the compressor and the decompressor.

A.1.2.3. Condition for downwards transition: Need for updates

In addition to the downward state transitions carried out due to periodic timeouts, the compressor must also immediately transit back to the IR or IR-DYN state when the header cannot be compressed in the CO state (for example if it behaves in an unexpected manner that cannot be accommodated by any of the CO header formats).

A.1.3. Step 3: Compressing the header

The next step is to choose the compressed header format that will be used to transmit the header from the compressor to the decompressor.

Given the selected profile the compressor has exactly max_sets + 2 possible sets of header formats available: a total of max_sets different sets of CO packets, as well as a set of IR-DYN packets and a set of IR packets. The choice of which header formats to use depends on the current state of the state machine.

The compressor calls the procedure COMPRESS to compress the header. The procedure has a single input: the name of the encoding method that is currently being used. Note that it is not necessary to provide EPIC-LITE with a description of where the fields occur in the uncompressed header, as this information is provided automatically as part of each encoding method (written in the input language).

The procedure has a single output: a Boolean value indicating whether or not compression has successfully occurred.
Additionally, if compression succeeds then the bit string compressed_data will contain the compressed value of every field in the uncompressed header.

The procedure also modifies the value of the chosen_encoding global variable. If compression is successful then for every field in the uncompressed header, chosen_encoding will contain an integer indicating which encoding method has been selected for the field. This is mapped onto a set of indicator flags in Section A.1.4.

Initially, the COMPRESS procedure is called for the encoding method specified in CO_packet (or IR-DYN_packet or IR_packet depending on the compressor state). The procedure may call itself recursively with a different input (for example the TCP/IP encoding method may recursively call COMPRESS for the two encoding methods IPv6-ENCODING and TCP-ENCODING).

Note however that the only local variable is a pointer to the current encoding method, so the procedure can be implemented in a non-recursive manner using a heap if desired.

If the encoding method is specified as part of the EPIC-LITE library, the pseudocode for the COMPRESS procedure is specified separately (in Appendix A.4).

   procedure COMPRESS (encoding_name)

      i = locate-first(encoding_name)

      while (i <> 0) do
         temp_field = current_field
         j = 0
         do
            can_compress = COMPRESS (encoding_name[i + j])
            if (encoding_name[i + j].no_more = true and
                can_compress = false) then
               return false
            end if
            j = j + 1
         loop until can_compress = true

         if (current_field > temp_field) then
            chosen_encoding[temp_field] = j
         end if

         i = encoding_name[i + j].forwards
      end while

      return true

   end COMPRESS

Note that once the header has been compressed, the variable hdr contains the uncompressed value of every field. This information is then used to overwrite one of the r copies of the context stored at the compressor.

A.1.4. Step 4: Determining the indicator flags

The next step is to determine the correct indicator flags for the chosen compressed header format. The compressor must have available the following procedure:

   INDICATOR-FLAGS (chosen_encoding)
      Prepends a bit string containing the indicator flags for chosen_encoding to compressed_data

In general this procedure is implemented by calculating the indicator flags offline and storing them in a list for fast access by the compressor. Therefore the procedure is described in Section A.3.

Once the indicator flags have been added, the header is padded to a multiple of bit_alignment bits by increasing the number of bits of Master Sequence Number (MSN) provided in the compressed header.

Additionally, any remaining uncompressed_data not compressed by EPIC-LITE is appended to the end of compressed_data to form the entire EPIC-LITE compressed packet. Note that this includes the payload data for the packet.

A.1.5. Step 5: Encapsulating in ROHC packet

The last step is to encapsulate the EPIC-LITE compressed packet within a ROHC packet. The encapsulation for each packet type (CO, IR-DYN and IR) is described in Chapter 5. Note that this includes adding the CID and any other ROHC framework headers (segmentation, padding etc.) as described in [ROHC].

The ROHC packet is then ready to be transmitted.

A.2. Decompressor

This section describes the EPIC-LITE header decompressor.

A.2.1. Step 1: Decapsulating from ROHC packet

The input to the EPIC-LITE decompressor is a compressed ROHC packet. The first step is to read the CID of the packet and to extract the EPIC-LITE packet for parsing by the appropriate profile.

If the ROHC packet is identified as containing an EPIC-LITE compressed packet then the decompression process continues as indicated below.

A.2.2. Step 2: State transition logic at the decompressor

The state machine for the decompressor is illustrated below. Details of the transitions between states are given after the figure.

                                 Success
                +-->------>------>------>------>------>--+
                |                                        |
   Not IR packet|     Insufficient Checksum   Success    |  Success
     +-->--+    |             +-->--+      +--->----->---+  +-->--+
     |     |    |             |     |      |             |  |     |
     |     v    |             |     v      |             v  |     v
   +--------------+         +----------------+         +--------------+
   |  No Context  |         | Static Context |         | Full Context |
   +--------------+         +----------------+         +--------------+
      ^                         |        ^                         |
      | k_2 out of n_2 failures |        | k_1 out of n_1 failures |
      +-----<------<------<-----+        +-----<------<------<-----+

   Figure 7: State transition logic at the decompressor

The decompressor initially begins in the No Context state. The transition logic for decompression states is based on two principles: successful decompression (as verified by the checksum) and repeated failures.

A.2.2.1. Condition for upwards transition: Successful decompression

Successful decompression will always move the decompressor to the Full Context state. The definition of successful decompression is that the reconstructed header is verified by the checksum covering the entire uncompressed header as per Appendix A.2.5.

A.2.2.2. Condition for downwards transition: Repeated failures

Repeated failed decompression (as detected by the checksum on the uncompressed header) will force the decompressor to transit downwards to a lower state.

In Full Context state, decompression may be attempted regardless of what kind of packet is received. However for the other states decompression is not always allowed. In the No Context state only IR packets may be decompressed, because these are the only packets to carry the STATIC-UNKNOWN field values in full. Further, when in the Static Context state only packets carrying at least 7 bits of header checksum can be decompressed.
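The admission rules of A.2.2.2 and the "k out of n failures" fallback of Figure 7 can be sketched in Python as follows. This is an illustrative sketch only, not part of the EPIC-LITE specification: the names State, may_decompress and Decompressor are hypothetical, the default values of k_1, n_1, k_2 and n_2 are placeholders (they are not fixed here), and reading "k out of n" as a sliding window over the most recent n attempts in the current state is an assumption.

```python
from collections import deque
from enum import Enum


class State(Enum):
    NO_CONTEXT = 0
    STATIC_CONTEXT = 1
    FULL_CONTEXT = 2


def may_decompress(state, packet_type, crc_bits):
    """Return True if decompression may be attempted (A.2.2.2)."""
    if state is State.FULL_CONTEXT:
        return True                    # any packet type may be tried
    if state is State.NO_CONTEXT:
        return packet_type == "IR"     # only IR carries STATIC-UNKNOWN fields in full
    return crc_bits >= 7               # Static Context: at least 7 bits of checksum


class Decompressor:
    """Hypothetical tracker for the transitions drawn in Figure 7."""

    def __init__(self, k1=3, n1=5, k2=3, n2=5):   # placeholder thresholds
        self._limits = {State.FULL_CONTEXT: (k1, n1),
                        State.STATIC_CONTEXT: (k2, n2)}
        self._enter(State.NO_CONTEXT)

    def _enter(self, state):
        # Assumption: the failure window restarts on every state change.
        self.state = state
        n = self._limits.get(state, (0, 1))[1]
        self._window = deque(maxlen=n)

    def record(self, success):
        """Update the state after one decompression attempt."""
        if success:
            if self.state is State.FULL_CONTEXT:
                self._window.append(True)
            else:
                self._enter(State.FULL_CONTEXT)   # success always reaches Full Context
            return
        if self.state is State.NO_CONTEXT:
            return                                # already in the lowest state
        self._window.append(False)
        k, _ = self._limits[self.state]
        if self._window.count(False) >= k:
            self._enter(State(self.state.value - 1))   # step down one state
```

A caller would first test may_decompress(d.state, packet_type, crc_bits) and, only if it returns True, attempt decompression and pass the checksum result to d.record().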
If decompression may not be performed the packet is discarded.

A.2.3.  Step 3: Reading the indicator flags

The input to Step 3 is an EPIC-LITE compressed packet. Note that the
overall length of the packet is known from the link layer, but the
length of the compressed header itself is NOT known.

The first step is to determine the compressed header format. This is
accomplished by reading the indicator flags as per the following
procedure:

   READ-FLAGS (packet)

      Outputs the value of chosen_encoding for the EPIC-LITE packet

Since this algorithm identifies the format of the compressed header,
it implicitly discovers the compressed header length at the same
time. This allows the compressed_data to be separated from the
uncompressed_data appended to the end of the packet.

The algorithm itself is described in Section A.3.

A.2.4.  Step 4: Decompressing the fields

Now that the format of the compressed header has been determined, the
next step is to decompress each field in turn.

The decompressor calls the procedure DECOMPRESS to calculate the
uncompressed value of the fields. The only input to the procedure is
the name of an encoding method. Unlike the COMPRESS procedure there
are no outputs since decompression always succeeds (although if the
packet is corrupted, the correct answer may not be obtained).

Initially, the DECOMPRESS procedure is called for the encoding method
specified in CO_packet (or IR-DYN_packet or IR_packet depending on the
ROHC packet type received). Note that as for COMPRESS the procedure
may call itself recursively with different inputs.

   procedure DECOMPRESS (encoding_name)

      i = locate-last(encoding_name)

      while (i <> 0) do
         i = i + chosen_encoding[current_field]
         DECOMPRESS (encoding_name[i])
         i = encoding_name[i].backwards
      end while

   end DECOMPRESS

Observe that the DECOMPRESS procedure reads the input code in the
opposite order to the COMPRESS procedure.
This is because decompression is the exact mirror-image of
compression: if fields are parsed in reverse order then it will never
be the case that a field can only be decompressed relative to a field
that has not yet been reached.

A.2.5.  Step 5: Verifying correct decompression

By this stage the decompressor has calculated the value
uncompressed_data that contains the entire uncompressed header as
well as the payload. The final step is to verify that successful
decompression has occurred by applying the checksum to the
uncompressed header.

The CRC encoding method makes available the variables checksum_value
(containing the checksum from the compressed header) and
crc_static + crc_dynamic (containing all of the fields in the
uncompressed header). The CRC should be evaluated over
crc_static + crc_dynamic and compared with the CRC stored in
checksum_value.

If the uncompressed header fails the checksum then it should be
discarded. If it passes then it can be forwarded by the decompressor.
Furthermore, if decompression is successful and sufficient checksum is
provided then the values contained within context can be replaced by
the values contained within hdr.

A.3.  Offline processing

This section describes how the profile is converted into one or more
sets of compressed header formats. Note that the following algorithms
are run once offline and the results stored at the compressor and
decompressor.

A.3.1.  Step 1: Building the header formats

The first step is to build up a list of the max_formats different
compressed header formats that occur with the highest probability
(based on the probability values given in the input code).

To generate the max_sets + 2 different sets of compressed header
formats, the BUILD procedure is called max_sets times with the global
variable compressor_state set to "CO" and current_set taking values
from 0 to max_sets - 1 inclusive.
Additionally it is called once with compressor_state = "IR" and once
with compressor_state = "IR-DYN".

The output in each case is a list describing the top max_formats
different compressed header formats. The list has the following
attributes:

   list.size       Number of items in list
   list[j].P       Overall percentage probability that the header
                   format j will be used
   list[j].N       Total size of header format j in bits, excluding
                   indicator flags
   list[j].id      A list of integers uniquely identifying the
                   header format j

Note that all percentages are stored to exactly 2 decimal places (or
by scaling they can be stored as a 2-octet integer from 0 to 10000
inclusive). When two percentages are multiplied, the result MUST be
calculated exactly (i.e. to 6 decimal places, or equivalently a
4-octet integer) and then rounded off to 2 decimal places.

The following procedure is required by the BUILD procedure:

   DISCARD (list)

      Discards all but max_formats entries in list, keeping only
      those entries j with the highest list[j].P values

For interoperability the top max_formats entries MUST NOT be reordered
when the discarding process is carried out. In the event of a tie, the
list entries with the lowest indices are kept.

Note that the BUILD procedure has a single input, which is the name of
an encoding method. Initially it is called for the encoding method
specified in CO_packet (or IR-DYN_packet or IR_packet depending on
compressor_state). Moreover the procedure may call itself recursively
with a different input.

   procedure BUILD (encoding_name)

      i = locate-first(encoding_name)

      list.size = 1
      list[0].P = 100%
      list[0].N = [1]
      list[0].id = empty

      while (i <> 0) do
         temp_list = empty

         do
            list_output = BUILD (encoding_name[i])
            for j = 0 to list_output.size - 1 do
               list_output[j].id[list_output[j].id.size] = i
               list_output[j].id.size = list_output[j].id.size + 1
               temp_list[temp_list.size + j].P = list_output[j].P
               temp_list[temp_list.size + j].N = list_output[j].N
               temp_list[temp_list.size + j].id = list_output[j].id
            end loop
            temp_list.size = temp_list.size + list_output.size
            DISCARD (list)
         loop until encoding_name[i].no_more = true

         new_list = empty

         for j = 0 to temp_list.size - 1 do
            for k = 0 to list.size - 1 do
               n = new_list.size
               new_list[n + k].P = temp_list[j].P * list[k].P
               new_list[n + k].N = temp_list[j].N + list[k].N
               for m = 0 to temp_list[j].id.size - 1 do
                  new_list[n + k].id[list[k].id.size + m] =
                     temp_list[j].id[m]
               end loop
            end loop
            DISCARD (new_list)
         end loop

         list = new_list
         i = encoding_name[i].forwards
      end while

      for j = 0 to list.size - 1 do
         list[j].P = list[j].P * encoding_name[i].prob
      end loop

      return list

   end BUILD

The final output of BUILD is a list variable describing max_formats
different compressed header formats.

A.3.2.  Step 2: Generating the indicator flags

The final step of generating a new set of compressed header formats is
to convert the list of probabilities into a set of indicator flags.
Each header format begins with a unique pattern of indicator flags
that serve to distinguish it from all other header formats in the set.

EPIC-LITE generates the indicator flags using ordinary Huffman
compression. For each of the cases in Section A.3.1 where the BUILD
algorithm is run the following algorithm should be applied to the
output of BUILD:

   procedure BUILD-FLAGS (list)

      Sort list into ascending order of list[j].P values (preserving
      the original order where ties are encountered).
      if compressor_state <> "CO" then npatterns = 2^bit_alignment

      u = 0
      v = max_formats

      for w = max_formats to 2 * max_formats - 2 do
         if w = max_formats - 4 and bit_alignment = 8
            and npatterns = 224 then RESERVE

         if (((w = v) or (list[u + 1].P <= list[v].P))
             and (u < (max_formats - 1))) then
            list[w].P = list[u].P + list[u + 1].P
            list[u].pointer = w
            list[u + 1].pointer = w
            u = u + 2
         else if (((u > (max_formats - 1)) or
                   (list[v + 1].P <= list[u].P))
                  and (v + 1 < w)) then
            list[w].P = list[v].P + list[v + 1].P
            list[v].pointer = w
            list[v + 1].pointer = w
            v = v + 2
         else
            list[w].P = list[u].P + list[v].P
            list[u].pointer = w
            list[v].pointer = w
            u = u + 1
            v = v + 1
         end if
      end loop

      for w = 0 to max_formats - 1 do
         z = w
         list[w].flaglength = 0
         do
            z = list[z].pointer
            list[w].flaglength = 1 + list[w].flaglength
            if (z = taken[0]) then
               list[w].flaglength = 1 + list[w].flaglength
            if (extra = 1 and z = taken[1]) then
               list[w].flaglength = 1 + list[w].flaglength
         loop until (list[z].pointer = 0)
      end loop

      value = 0
      w = max_formats - 1

      while (w >= 0) do
         list[w].flags = str(list[w].flaglength, value)
         value = (value + 1) *
                 2^(list[w].flaglength - list[w + 1].flaglength)
         w = w - 1
      end while

   end BUILD-FLAGS

The procedure RESERVE is called at most once by BUILD-FLAGS to reserve
the bit pattern "111" in the first octet of each compressed header
(for compatibility with the ROHC framework). Note that all variables
are shared between the two procedures.
   procedure RESERVE

      temp_u = u
      temp_v = v

      for w = 0 to 3 do
         if (((w + max_formats - 4 = temp_v) or
              (list[temp_u].P <= list[temp_v].P))
             and (temp_u < (max_formats - 1))) then
            probs[w] = list[temp_u].P
            taken[w] = temp_u
            temp_u = temp_u + 1
         else
            probs[w] = list[temp_v].P
            taken[w] = temp_v
            temp_v = temp_v + 1
         end if
      end loop

      if (probs[0] + 2 * probs[1] < probs[3]) then
         extra = 1
      else
         extra = 0

   end RESERVE

The output of BUILD-FLAGS is a list of bit strings list[j].flags
containing the flags used to indicate each compressed header format.

Note that the compressor assigns bit patterns to the indicator flags
using the following rules:

   1.) The most probable header format has all "0" indicator flags

   2.) The indicator flags for the next header format are calculated
       by adding 1 to the previous flags (treated as an integer) and
       padding with enough 0s to reach the correct length

As an example, the indicator flags for a set of compressed header
formats are given below:

   Compressed header format   No. of flags   Bit pattern of flags

              1                    2               00
              2                    2               01
              3                    3               100
              4                    4               1010
              5                    4               1011
              6                    4               1100
              7                    5               11010
              8                    6               110110
              9                    6               110111

Note that the most probable compressed header format will have all "0"
indicator flags, whilst the least probable header format will have all
"1" indicator flags (except for the bit pattern "111" if this is
reserved for the [ROHC] framework).

The task of the compressor is to calculate the indicator flags for the
selected compressed header format. The simplest method is just to
store the list containing the indicator flags for each compressed
header, and to choose list[j].flags such that chosen_encoding matches
list[j].id. The decompressor simply stores the reverse of this
mapping.

A.4.  Library of encoding methods

This section gives pseudocode for each of the encoding methods in the
library.
Note that for each encoding method three pieces of pseudocode are
given, corresponding to the procedures COMPRESS, DECOMPRESS and BUILD
described previously.

Note that all of the variables required for these procedures are
defined at the beginning of Appendix A. It is assumed that as soon as
the 'return' command is encountered, the procedure stops.

Moreover, for each field in the context EPIC-LITE needs to know two
non-negative integers: the length of the field and its value. The
following functions are used to manipulate fields:

   bits.len           The integer length of a bit string (e.g. with
                      bits = hdr[current_field])
   bits.val           The integer value of a bit string
   str(n, v)          A bit string of length n and value v (so that
                      bits = str(bits.len, bits.val))
   left(bits, n)      The left n bits of a bit string
   right(bits, n)     The right n bits of a bit string
   bits_a + bits_b    The concatenation of two bit strings

Additionally, the following general procedures are used in the
pseudocode descriptions:

   procedure REMOVE (data_string, n, variable)

      # This procedure extracts the first n bits from data_string

      variable = left(data_string, n)
      data_string = right(data_string, data_string.len - n)

   end REMOVE

   procedure APPEND (data_string, bits)

      # This procedure appends a string of bits to the front of
      # data_string

      data_string = bits + data_string

   end APPEND

   procedure NEXT (data_string, n)

      # This procedure returns the value of the next n bits of
      # data_string

      return left(data_string, n).val

   end NEXT

   procedure CONTEXT (variable, i)

      # This procedure extracts the current field value from the
      # context

      variable = context[current_field, i]

   end CONTEXT

Note that at the decompressor the parameter i is omitted (as there is
only one value for each field in the context).

   procedure NEW-CONTEXT (bits)

      # This procedure places a bit string into the new context.
      hdr[current_field] = bits
      current_field = current_field + 1

   end NEW-CONTEXT

A.4.1.  STATIC P%

   COMPRESS:
      CONTEXT (context, 1)
      n = context.len
      for i = 1 to r do
         CONTEXT (context, i)
         if (context.val <> NEXT (uncompressed_data, n)) then
            return false
      end loop
      REMOVE (uncompressed_data, n, hdr)
      NEW-CONTEXT (hdr)
      return true

   DECOMPRESS:
      CONTEXT (hdr)
      APPEND (uncompressed_data, hdr)
      NEW-CONTEXT (hdr)

   BUILD:
      format.size = 1
      format[0].P = P
      format[0].N = 0
      format[0].id = empty
      return format

A.4.1.1.  STATIC-KNOWN(n,v)

   COMPRESS:
      if (NEXT (uncompressed_data, n) <> v) then return false
      REMOVE (uncompressed_data, n, hdr)
      return true

   DECOMPRESS:
      hdr = str(n, v)
      APPEND (uncompressed_data, hdr)

   BUILD:
      format.size = 1
      format[0].P = 100%
      format[0].N = 0
      format[0].id = empty
      return format

A.4.1.2.  STATIC-UNKNOWN(n)

   COMPRESS:
      if compressor_state = "IR" then return true
      for i = 1 to r do
         CONTEXT (context, i)
         if (context.val <> NEXT (uncompressed_data, n)) then
            return false
      end loop
      REMOVE (uncompressed_data, n, hdr)
      if (compressor_state = "IR") then
         APPEND (compressed_data, hdr)
      end if
      NEW-CONTEXT (hdr)
      return true

   DECOMPRESS:
      if compressor_state = "IR" then
         REMOVE (compressed_data, n, hdr)
      else
         CONTEXT (hdr)
      end if
      APPEND (uncompressed_data, hdr)
      NEW-CONTEXT (hdr)

   BUILD:
      format.size = 1
      format[0].P = 100%
      if (compressor_state = "IR") then
         format[0].N = n
      else
         format[0].N = 0
      end if
      format[0].id = empty
      return format

A.4.2.  IRREGULAR(n) P%

   COMPRESS:
      if (uncompressed_data.len < n) then return false
      REMOVE (uncompressed_data, n, hdr)
      APPEND (compressed_data, hdr)
      NEW-CONTEXT (hdr)
      return true

   DECOMPRESS:
      REMOVE (compressed_data, n, hdr)
      APPEND (uncompressed_data, hdr)
      NEW-CONTEXT (hdr)

   BUILD:
      format.size = 1
      format[0].P = P
      format[0].N = n
      format[0].id = empty
      return format

A.4.3.
VALUE(n,v) P%

   COMPRESS:
      if (NEXT (uncompressed_data, n) <> v) then return false
      REMOVE (uncompressed_data, n, hdr)
      NEW-CONTEXT (hdr)
      return true

   DECOMPRESS:
      hdr = str(n, v)
      APPEND (uncompressed_data, hdr)
      NEW-CONTEXT (hdr)

   BUILD:
      format.size = 1
      format[0].P = P
      format[0].N = 0
      format[0].id = empty
      return format

A.4.4.  LSB(k,p) P%

   COMPRESS:
      CONTEXT (context, 1)
      n = context.len
      new_value = NEXT (uncompressed_data, n)
      for i = 1 to r do
         CONTEXT (context, i)
         old_value = context.val
         temp = (new_value - old_value + p) mod 2^n
         if temp < 0 or temp >= 2^k then return false
      end loop
      REMOVE (uncompressed_data, n, hdr)
      APPEND (compressed_data, right(hdr, k))
      NEW-CONTEXT (hdr)
      return true

   DECOMPRESS:
      CONTEXT (context)
      n = context.len
      REMOVE (compressed_data, k, temp)
      old_value = context.val
      new_value = (left(context, n - k) + temp).val
      new_value = ((new_value - old_value + p) mod 2^k
                   + old_value - p) mod 2^n
      hdr = str(n, new_value)
      APPEND (uncompressed_data, hdr)
      NEW-CONTEXT (hdr)

   BUILD:
      format.size = 1
      format[0].P = P
      format[0].N = k
      format[0].id = empty
      return format

A.4.4.1.  LSB-PADDED(n,k) P%

   COMPRESS:
      if (uncompressed_data.len < n or
          NEXT (uncompressed_data, n - k) <> 0) then return false
      REMOVE (uncompressed_data, n, hdr)
      APPEND (compressed_data, right(hdr, k))
      NEW-CONTEXT (hdr)
      return true

   DECOMPRESS:
      REMOVE (compressed_data, k, temp)
      hdr = str(n - k, 0) + temp
      APPEND (uncompressed_data, hdr)
      NEW-CONTEXT (hdr)

   BUILD:
      format.size = 1
      format[0].P = P
      format[0].N = k
      format[0].id = empty
      return format

A.4.5.
UNCOMPRESSED(n,d,m,p)

   COMPRESS:
      scale_len = floor(NEXT (control_data, n) / d) * m + p
      if (uncompressed_data.len < scale_len) then return false
      REMOVE (uncompressed_data, scale_len, hdr)
      uncompressed_data = uncompressed_data + hdr
      REMOVE (control_data, n, length)
      APPEND (uncompressed_data, length)
      return true

   DECOMPRESS:
      REMOVE (uncompressed_data, n, length)
      APPEND (control_data, length)
      scale_len = floor(length.val / d) * m + p
      hdr = right(uncompressed_data, scale_len)
      uncompressed_data = left(uncompressed_data,
                               uncompressed_data.len - scale_len)
      APPEND (uncompressed_data, hdr)

   BUILD:
      return empty

A.4.6.  INFERRED(n)

   COMPRESS:
      REMOVE (uncompressed_data, n, hdr)
      APPEND (control_data, hdr)
      return true

   DECOMPRESS:
      REMOVE (control_data, n, hdr)
      APPEND (uncompressed_data, hdr)

   BUILD:
      return empty

A.4.6.2.  INFERRED-SIZE(n,p)

   COMPRESS:
      if (8 * NEXT (uncompressed_data, n) + p <>
          uncompressed_data.len) then return false
      REMOVE (uncompressed_data, n, hdr)
      return true

   DECOMPRESS:
      hdr = str(n, (uncompressed_data.len + n - p) / 8)
      APPEND (uncompressed_data, hdr)

   BUILD:
      return empty

A.4.6.3.  INFERRED-OFFSET(n,flag)

   COMPRESS:
      REMOVE (uncompressed_data, n, hdr)
      if (flag = 1) then
         REMOVE (control_data, n, base)
         APPEND (uncompressed_data, base)
      else
         if MSN is empty then choose MSN to be any n-bit value
         base = MSN
      end if
      offset = str(n, (hdr.val - base.val) mod 2^n)
      APPEND (uncompressed_data, offset)
      return true

   DECOMPRESS:
      REMOVE (uncompressed_data, n, offset)
      if (flag = 1) then
         REMOVE (uncompressed_data, n, base)
         APPEND (control_data, base)
      else
         base = MSN
      end if
      hdr = str(n, (offset.val + base.val) mod 2^n)
      APPEND (uncompressed_data, hdr)

   BUILD:
      return empty

A.4.6.4.
INFERRED-SCALED(n)

   COMPRESS:
      REMOVE (uncompressed_data, n, hdr)
      if MSN is empty then choose MSN to be any n-bit value
      base = MSN
      choose scale to be any n-bit value
      choose nbo to be 0 or 1
      offset = str(n, (hdr.val - scale * base.val) mod 2^n)
      APPEND (uncompressed_data, scale)
      APPEND (uncompressed_data, nbo)
      APPEND (uncompressed_data, offset)
      return true

   DECOMPRESS:
      REMOVE (uncompressed_data, offset)
      REMOVE (uncompressed_data, nbo)
      REMOVE (uncompressed_data, scale)
      hdr = str(n, (offset.val + scale * base.val) mod 2^n)
      APPEND (uncompressed_data, hdr)

   BUILD:
      return empty

A.4.6.5.  INFERRED-IP-CHECKSUM

   COMPRESS:
      if (right(left(uncompressed_data, 96), 16) <>
          chksm(left(uncompressed_data, 160))) then return false

      # The "chksm" function calculates a 16-bit one's
      # complement checksum (with the 11th and 12th octets
      # set to zero).

      REMOVE (uncompressed_data, 160, hdr)
      APPEND (uncompressed_data, left(hdr, 80) + right(hdr, 64))
      return true

   DECOMPRESS:
      REMOVE (uncompressed_data, 144, temp)
      temp = left(temp, 80) + str(16, 0) + right(temp, 64)
      hdr = left(temp, 80) + chksm(temp) + right(temp, 64)
      APPEND (uncompressed_data, hdr)

   BUILD:
      return empty

A.4.7.
OPTIONAL(new_encoding_name)

   COMPRESS:
      REMOVE (control_data, 1, present)
      if (compressor_state = "CO") then
         for i = 1 to r do
            CONTEXT (context, i)
            if (context <> present) then return false
         end loop
      end if
      if (present = 1) then
         return COMPRESS (new_encoding_name)
      else
         format = BUILD (new_encoding_name)
         APPEND (compressed_data, str(format[0].N, 0))
         return true
      end if
      if (compressor_state <> "CO") then
         APPEND (compressed_data, present)
      end if
      NEW-CONTEXT (present)

   DECOMPRESS:
      if (compressor_state <> "CO") then
         REMOVE (compressed_data, 1, present)
      else
         CONTEXT (present)
      end if
      if (present = 1) then
         DECOMPRESS (new_encoding_name)
      else
         format = BUILD (new_encoding_name)
         REMOVE (compressed_data, format[0].N, temp)
      end if
      NEW-CONTEXT (present)

   BUILD:
      if (compressor_state = "CO") then
         return BUILD (new_encoding_name)
      else
         format = BUILD (new_encoding_name)
         for j = 0 to format.size - 1 do
            format[j].N = format[j].N + 1
         end loop
         return format
      end if

A.4.8.  CONTEXT(new_encoding_name,k)

   COMPRESS:
      n = ceiling(log2(k))
      choose (j < k)
      COMPRESS (new_encoding_name)
      APPEND (uncompressed_data, str(n, j))

   DECOMPRESS:
      temp = hdr[field_name + ".Index"]
      DECOMPRESS (new_encoding_name, (field_name + temp))

   BUILD:
      return BUILD (new_encoding_name)

A.4.9.1.
LIST-NEXT(n, encoding_name[0],...,encoding_name[k - 1],
          v[0],...,v[j - 1])

   COMPRESS:
      for i = 0 to k - 1 do
         present[i] = 0
      end loop
      i = 0
      while (i <= k) do
         REMOVE (control_data, n, next)
         v = next.val
         while ((v[i] <> v or present[i] = 1) and i <= j)
            i = i + 1
         end while
         APPEND (control_data, str(1, 1))
         can_compress = COMPRESS (encoding_name[i])
         if (can_compress = false) then return false
      end while
      for i = 0 to k - 1 do
         if (present[i] = 0) then
            can_compress = COMPRESS (encoding_name[i])
            if (can_compress = false) then return false
         end if
      end loop
      return true

   DECOMPRESS:
      while last_value[list_name] > 0 do
         last_value[list_name]
         i = last_value[list_name]
         DECOMPRESS (encoding_name[i], field_name + app(i))
      end while

   BUILD:
      format.size = 1
      format[0].P = 100%
      format[0].N = [1]
      format[0].id = empty
      for i = 0 to k - 1 do
         temp_format = BUILD (encoding_name[i])
      end loop
      return format

A.4.10.1.  new_encoding_name C

   COMPRESS:
      if (compressor_state <> "CO") then return false
      return COMPRESS (new_encoding_name)

   DECOMPRESS:
      DECOMPRESS (new_encoding_name)

   BUILD:
      if (compressor_state <> "CO") then
         return empty
      else
         return BUILD (new_encoding_name)

A.4.10.2.  new_encoding_name D

   COMPRESS:
      if (compressor_state = "CO") then return false
      return COMPRESS (new_encoding_name)

   DECOMPRESS:
      DECOMPRESS (new_encoding_name)

   BUILD:
      if (compressor_state = "CO") then
         return empty
      else
         return BUILD (new_encoding_name)

A.4.10.3.  new_encoding_name N

   COMPRESS:
      can_compress = COMPRESS (new_encoding_name)
      current_field = current_field - 1
      return can_compress

   DECOMPRESS:
      DECOMPRESS (new_encoding_name)
      current_field = current_field - 1

   BUILD:
      return BUILD (new_encoding_name)

A.4.11.
FORMAT(encoding_name[0],...,encoding_name[k - 1])

   COMPRESS:
      n = ceiling(log2(k))
      if compressor_state <> "CO" then
         choose (j < k)
         can_compress = COMPRESS (encoding_name[j])
         index = str(n, j)
      else
         CONTEXT (index, 1)
         for i = 2 to r do
            CONTEXT (context, i)
            if (index <> context) then return false
         end loop
         j = index.val
      end if
      can_compress = COMPRESS (encoding_name[j])
      if can_compress = false then return false
      current_set = current_set + max_sets * j
      max_sets = max_sets * k
      APPEND (uncompressed_data, index)
      NEW-CONTEXT (index)
      return true

   DECOMPRESS:
      n = ceiling(log2(k))
      if compressor_state <> "CO" then
         REMOVE (uncompressed_data, n, index)
      else
         CONTEXT (index)
      end if
      j = index.val
      current_set = current_set * k + j
      NEW-CONTEXT (index)
      DECOMPRESS (encoding_name[j])

   BUILD:
      if compressor_state = "CO" then
         j = current_set mod k
         current_set = floor(current_set / k)
         return BUILD (encoding_name[j])
      else
         format = empty
         for i = 0 to k - 1 do
            temp_fmt = BUILD (encoding_name[i])
            for j = 0 to temp_fmt.size - 1 do
               temp_fmt[j].id[temp_fmt[j].id.size] = i
               temp_fmt[j].id.size = temp_fmt[j].id.size + 1
               format[format.size + j].P = temp_fmt[j].P
               format[format.size + j].N = temp_fmt[j].N
               format[format.size + j].id = temp_fmt[j].id
            end loop
            format.size = format.size + temp_fmt.size
         end loop
         DISCARD (format)
         return format
      end if

A.4.12.  CRC(n) P%

   COMPRESS:
      APPEND (compressed_data, crc(n, crc_static + crc_dynamic))

      # The "crc" function calculates an n-bit CRC over the
      # specified field.

      return true

   DECOMPRESS:
      REMOVE (compressed_data, n, temp)
      checksum_value = temp

   BUILD:
      format.size = 1
      format[0].P = P
      format[0].N = 0
      format[0].id = empty
      return format

A.5.
BNF description of the input language

The following is a BNF description of a [ROHC] profile generated using
EPIC-LITE:

   ::= [] []
   ::= "profile_identifier"
   ::= "max_formats"
   ::= "max_sets"
   ::= "bit_alignment"
   ::= "npatterns"
   ::= "CO packet"
   ::= "IR-DYN packet"
   ::= "IR packet"
   ::= {}
   ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
   ::= "0x" {}
   ::= | "a" | "b" | "c" | "d" | "e" | "f" |
         "A" | "B" | "C" | "D" | "E" | "F"

The following is a BNF description of a new encoding method written
using the input language (note that the previous BNF definitions still
apply). Moreover, any line beginning with a "#" symbol is ignored in
the input language.

   ::= "method" {} "end_method"
   ::= "encode" "as" {"or" }
   ::=
   ::= |
   ::= ["(" {"," } ")"] []
   ::=
   ::= { | | "_" | "-" | "/" | "."}
   ::= "A" | "B" | "C" | ... | "X" | "Y" | "Z" | "a" | ... | "z"
   ::= [] [] ["." []] "%"
   ::= | | |
   ::= | |
   ::= "0b" {}
   ::= "0" | "1"
   ::=
   ::= ["-"]

Appendix B.  Extensibility

This appendix considers the extensibility of the EPIC-LITE scheme,
including possible future enhancements to EPIC-LITE.

B.1.  Other protocol stacks

A number of additional protocol stacks have been identified by the
ROHC Working Group as potential candidates for compression. For
example, the Stream Control Transmission Protocol (SCTP) is gaining
popularity as an efficient transmission protocol for signaling
messages and other types of data.

Using EPIC-LITE the additional effort required to generate a new
[ROHC] profile is very low. For most protocol stacks it is sufficient
to write a new encoding method using the input language for any
protocol not already handled. The encoding methods can be combined
with those already written for TCP/IP to create a [ROHC] profile for
the efficient compression of the new protocol stack.

B.2.
New library encoding methods

The encoding methods currently offered by the EPIC-LITE library are
sufficient to compress TCP/IP and many other protocol stacks. However,
in some cases it may become necessary to add new encoding methods to
the library.

In general, a new library encoding method is just a mapping between a
string of uncompressed data and its compressed equivalent. The only
requirement is that the mapping is 1-1 (in other words that no two
field values map onto the same compressed value).

It is acceptable for the mapping to change depending on the context,
although care must be taken in this case to ensure that it is robust.
For example, a DELTA encoding method that stores the increase in the
field value relative to the context is very efficient for compressing
incrementing fields. However it is not robust to lost packets (since
it fails if the decompressor context is incorrect).

B.3.  Learning version of EPIC-LITE

An interesting question is the effectiveness of EPIC-LITE when
compressing a protocol stack for which it is not optimally programmed.

If the correct encoding methods have been assigned to each field but
the probabilities that they will be used are slightly inaccurate then
the efficiency lost is negligible. But suppose that the protocol stack
behaves in a significantly different manner than expected: for example
if an IP header uses IP options.

Static compressors with fixed compressed header formats suffer a
permanent drop in performance if the protocol stack behaves in an
unusual manner. However, since EPIC-LITE can dynamically generate new
packet formats based on the input code, it is possible for a ROHC
profile generated by EPIC-LITE to adapt its own header formats to the
incoming packet stream.

A basic version of "Learning" EPIC-LITE is straightforward to
implement.
The encoding methods assigned to each field are not changed; instead
the probabilities that each encoding method will be used are refined
by counting the number of times they have been used in recent packets.
In this manner, if a particular compressed header format is used more
often than expected then the indicator flags will be refined to encode
the format using fewer bits.

Fast algorithms exist for running the Huffman algorithm when only a
small number of input probabilities have changed. Care must be taken
to ensure that the compressed header formats at the compressor and
decompressor are updated in step; this can be accomplished by
periodically transmitting the list of probabilities between the
compressor and the decompressor within a special "profiling" message.
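The counting step described above might look as follows. This is a
minimal sketch under stated assumptions rather than a defined part of
EPIC-LITE: the class name, the sliding-window length and the
round-to-nearest rule are hypothetical, and the "profiling" message
that keeps both ends in step is not modelled. Percentages are held as
integers from 0 to 10000, following the 2-decimal-place convention of
Section A.3.1.

```python
from collections import Counter, deque


class LearnedProbabilities:
    """Sliding-window estimate of how often each encoding method is used
    for one field (hypothetical helper, not defined by the draft)."""

    def __init__(self, priors, window=1000):
        # priors maps an encoding-method name to its percentage from the
        # input code, scaled by 100 (so 9900 means 99.00%).
        self.priors = dict(priors)
        self.recent = deque(maxlen=window)
        self.counts = Counter()

    def observe(self, method):
        """Record the encoding method chosen for the latest packet."""
        if len(self.recent) == self.recent.maxlen:
            # The oldest observation is about to fall out of the window.
            self.counts[self.recent[0]] -= 1
        self.recent.append(method)
        self.counts[method] += 1

    def percentages(self):
        """Return refined percentages, or the priors if nothing was seen.

        Each value is rounded independently, so the values are not
        guaranteed to sum to exactly 10000.
        """
        total = len(self.recent)
        if total == 0:
            return dict(self.priors)
        return {m: (10000 * self.counts[m] + total // 2) // total
                for m in self.priors}
```

The refined percentages would then be fed back into the offline
processing of Section A.3 to regenerate the indicator flags, with the
same update applied at the compressor and the decompressor.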