Network Working Group Zhigang Liu INTERNET-DRAFT Khiem Le Date: 18 July 2001 Ka-Cheong Leung Expires: 18 January 2002 Christopher Clanton Nokia Research Center Scalable, Robust, Efficient Dictionary-Based Compression (SCRIBE) Status of This Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC 2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This document is a submission to the IETF ROHC WG. Comments should be directed to its mailing list, rohc@cdt.luth.se. Abstract This document defines an efficient, robust, and scalable scheme for message compression when the transmission path between the compressor and decompressor is unreliable, i.e. prone to errors, losses, and misordering. The scheme is applicable in particular to compression of SIP, RTSP, SDP, and HTTP messages. A major case of application is the compression of SIP protocol messages over bandwidth limited links, such as cellular links. Liu, Le, Leung, and Clanton [Page i] INTERNET-DRAFT SCRIBE 18 July 2001 Table of Contents Status of This Memo . . . . . . . . . . . . . . . . . . . . . . . i Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . i 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 1 2. Concepts and Framework . . . . . . . . . . . . . . . . . . . . 2 2.1. Algorithm Requirements and Assumptions . . . . . . . . . 2 2.2. Robust Signalling Compression Framework . . . . . . . . . 3 2.3. CD Data Organization . . . . . . . . . . . . . . . . . . 5 2.3.1. Static Data . . . . . . . . . . . . . . . . . . . . . . 5 2.3.2. Profile-Specific Data . . . . . . . . . . . . . . . . . 5 2.3.2.1. Profile-Specific Dictionary . . . . . . . . . . . . . 6 2.3.2.2. Template Table . . . . . . . . . . . . . . . . . . . 6 2.3.3. Dynamic Data . . . . . . . . . . . . . . . . . . . . . 6 2.3.4. Temporary Data Stores and CD Data Construction . . . . 7 2.4. CD Algorithm . . . . . . . . . . . . . . . . . . . . . . 8 2.4.1. Dictionary-Based Algorithm . . . . . . . . . . . . . . 9 2.4.2. Template-Based Algorithm . . . . . . . . . . . . . . . 10 2.5. CD Data Management Signalling . . . . . . . . . . . . . . 11 2.5.1. State Verification and Data Downloading . . . . . . . . 12 2.5.2. Sliding Handshake . . . . . . . . . . . . . . . . . . . 12 2.5.3. Total Order . . . . . . . . . . . . . . . . . . . . . . 13 2.5.4. Checkpoint . . . . . . . . . . . . . . . . . . . . . . 13 2.5.5. Rollback . . . . . . . . . . . . . . . . . . . . . . . 14 2.6. CD Data Update Algorithm . . . . . . . . . . . . . . . . 14 2.7. S-Profile . . . . . . . . . . . . . . . . . . . . . . . . 14 2.8. SCRIBE Message . . . . . . . . . . . . . . . . . . . . . 16 3. Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.1. State Verification and Data Downloading . . . . . . . . . 17 3.2. Dynamic CD Data Maintenance . . . . . . . . . . . . . . . 20 3.2.1. Sliding Handshake . . . . . . . . . . . . . . . . . . . 21 3.2.2. Specification of Ordering Constraints . . . . . . . . . 24 3.2.3. Dynamic CD Data Update Mechanism . . . . . . . . . . . 27 3.3. Checkpoint Establishment Procedure . . . . . . . . . . . 29 3.4. Rollback Procedure . . . . . . . . . . . . . . . . . . . 31 3.5. Template Related Procedures . . . . . . . . . . . . . . . 33 3.5.1. Compression and Decompression Procedures . . . . . . . 33 3.5.2. Template Management Procedures . . . . . . . . . . . . 34 3.6. Implicit CD Data Content Deletion . . . . . . . . . . . . 34 3.7. State Diagrams . . . . . . . . . . . . . . . . . . . . . 36 3.7.1. Checkpoint Establishment Procedure . . . . . . . . . . 36 3.7.2. Rollback Procedure . . . . . . . . . . . . . . . . . . 37 3.7.3. State Verification and Data Downloading Procedure . . . 40 Liu, Le, Leung, and Clanton [Page ii] INTERNET-DRAFT SCRIBE 18 July 2001 4. SCRIBE Message Formats . . . . . . . . . . . . . . . . . . . . 43 4.1. Normal Message Formats . . . . . . . . . . . . . . . . . 43 4.1.1. Short Sequence Number . . . . . . . . . . . . . . . . . 44 4.1.2. Long Sequence Number . . . . . . . . . . . . . . . . . 46 4.1.3. No Sequence Number . . . . . . . . . . . . . . . . . . 46 4.1.4. Acknowledgement (ACK_INFO) . . . . . . . . . . . . . . 47 4.1.5. CONTROL_INFO . . . . . . . . . . . . . . . . . . . . . 50 4.1.5.1. Checkpoint CONTROL_INFO . . . . . . . . . . . . . . . 50 4.1.5.2. Rollback CONTROL_INFO . . . . . . . . . . . . . . . . 51 4.1.5.3. Update Rejection CONTROL_INFO . . . . . . . . . . . . 51 4.1.5.4. CD Data Update Using Additional Content . . . . . . . 52 4.1.6. Compressed Message Format Using Template . . . . . . . 53 4.2. Special CD Data Management Message Formats . . . . . . . 54 4.2.1. General Formats . . . . . . . . . . . . . . . . . . . . 54 4.2.2. Template and Template Table Formats . . . . . . . . . . 59 5. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 62 6. Security Considerations . . . . . . . . . . . . . . . . . . . 63 7. Intellectual Property Right Considerations . . . . . . . . . . 63 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 64 9. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 66 Appendix A. Performance Results . . . . . . . . . . . . . . . . 67 A.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 67 A.2. Result Summary . . . . . . . . . . . . . . . . . . . . . 68 A.3. Detailed Results . . . . . . . . . . . . . . . . . . . . 69 Appendix B. Examples on CD Data Management Signalling . . . . . 71 B.1. Consistent Dynamic CD Data Maintenance . . . . . . . . . 71 B.2. Dynamic CD Maintenance Under Unreliable Path Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 Appendix C. Application to SIP Signalling for Cellular Radio . . 82 C.1. Why SIP Compression for Cellular? . . . . . . . . . . . 82 C.2. Cellular Environment Assumptions . . . . . . . . . . . . 83 C.3. Cellular Environment Limitations . . . . . . . . . . . . 84 C.4. Possible Application of SCRIBE for Cellular . . . . . . 85 Liu, Le, Leung, and Clanton [Page iii] INTERNET-DRAFT SCRIBE 18 July 2001 1. Introduction This document defines an efficient, robust, and scalable scheme for message compression when the transmission path between the compressor and decompressor is unreliable, i.e. prone to errors, losses, and misordering. Message compression algorithms are applicable in particular to compression of messages from SIP (Session Initiation Protocol) [SIP], RTSP (Real Time Streaming Protocol) [RFC2326], SDP (Session Description Protocol) [RFC2327], and HTTP (Hypertext Transfer Protocol) [RFC2616]. A major case of application is the compression of SIP protocol messages over bandwidth limited links, such as cellular links. SCRIBE includes a number of mechanisms that improve compression efficiency. Many are aimed at the problem of compressing the first few messages involved in a session. In classical dictionary-based compression schemes, these messages would normally experience a lower compression ratio due to the availability of very limited dictionary information (e.g. protocol specific data or "static dictionary") in early stages of the session. SCRIBE alleviates this problem by maintaining CD (Compression/Decompression) data across sessions so that it can be used at the very beginning of the subsequent session to "jump start" the compression of initial messages. The CD data includes as well user-specific profile information. The scheme also defines a mechanism by which compression is achieved using pre- defined message templates. When there is a loss of synchronization between the compressor and decompressor CD data due to e.g. link conditions, the scheme recovers by "rolling back" to a "checkpoint" that is more populated than the static part of the CD data. This means that the scheme is not only robust to such synchronization loss, but it also has the capability to resume good compression efficiency characteristics immediately after such events. SCRIBE employs mechanisms that enable it to tolerate message misordering events between the compressor and decompressor as well. SCRIBE is scalable in the sense that it provides mechanisms that function regardless of memory size and/or operating environment. Memory size scalability can be critical, since the compression/decompression function could be employed by devices which may or may not have limited capability to store large amounts of CD data. SCRIBE can also be adjusted in order to accommodate different levels of message loss and misordering that may be experienced on the communication path between the compression/decompression functions. Liu, Le, Leung, and Clanton [Page 1] INTERNET-DRAFT SCRIBE 18 July 2001 Finally, SCRIBE's flexibility comes from the fact that it is a framework -- different mechanisms defined as a part of the SCRIBE framework can be used (or not used), depending on e.g. the needs of the application or the processing power available for use by the compressor/decompressor. Flexibility is aided through the use of SCRIBE profiles and feature/parameter negotiation. A SCRIBE profile includes a tuple of values that compressor and decompressor agree to use. The tuple of values defines the CD algorithm (used to compress/decompress a signalling message), CD data management signalling mechanisms (used to populate and update the CD data at the compressor/decompressor), and CD data update algorithm (used to update the CD data). The remainder of this document is structured as follows. Section 2 gives a conceptual framework of the proposed scheme. Section 3 provides a definition of SCRIBE protocols/procedures. Section 4 introduces a set of SCRIBE message formats used by the proposed scheme. And Section 5 concludes the main portion of the document. Performance results, illustrative examples, and a description of how the scheme might be applied to cellular radio systems, are provided in appendices at the end of the document. 2. Concepts and Framework 2.1. Algorithm Requirements and Assumptions SCRIBE fulfills all of the coexistence, efficiency, robustness, and performance requirements defined in [REQUIRE]. In this draft, a "CD entity" refers to an entity where the compression and/or decompression function resides. It consists of a compressor, or a decompressor, or both. When a compressor and a decompressor coexist in the same CD entity, they can be in split mode or shared mode. SCRIBE is grounded on the following assumptions: - The entities to be compressed and decompressed are the protocol messages (e.g. SIP messages). The algorithm at a CD entity takes as input CD data (e.g. a dictionary context) and a message to be compressed. It produces as output a compressed version of the message. The reverse operation is performed by the algorithm at another CD entity. The CD data at the compressor and decompressor must be identical. This is referred to as CD data synchronization or CD data coherency. Liu, Le, Leung, and Clanton [Page 2] INTERNET-DRAFT SCRIBE 18 July 2001 - A larger CD data likely leads to higher compression efficiency. That is, if CD Data A is a superset of CD Data B, the usage of CD Data A will generally result in a higher compression efficiency than the usage of CD Data B. - No assumption is made about the size of a message to be compressed. - Messages exchanged between the compressor and decompressor can be the compressed protocol messages, or signalling messages at the compressor/decompressor level, or a combination of both. - The transport between the compressor and decompressor is such that the compressed messages and signalling information are subject to losses, corruptions, and misordering. - There are two different ways to perform compression/decompression using CD data: split mode and shared mode. In split mode, a decompressor at one CD entity can only acknowledge received data from a compressor at another CD entity. A compressor and a decompressor at the same CD entity will use different CD data for compression and decompression, respectively. However, in shared mode, a compressor and decompressor at a single CD entity reside together such that the shared CD data can be used for both the compression and decompression processing. - Knowledge about the nature of the protocol messages to be compressed can improve the performance. 2.2. Robust Signalling Compression Framework The SCRIBE robust signalling compression framework consists of four main parts: - CD Algorithm (and associated Data Structure): This is the algorithm used to compress/decompress a signalling message. The algorithm takes as input the Compression/Decompression data (CD data), which could be e.g. data stored in a dictionary. For correct decompression, CD data at the compressor and decompressor have to be coherent. An example of the compression/decompression algorithm is Lempel-Ziv, which is based on string substitution. The CD data structure associated with the algorithm is also defined in this part. - CD Data Management Signalling: This is a collection of signalling mechanisms used to populate and update the CD data at the Liu, Le, Leung, and Clanton [Page 3] INTERNET-DRAFT SCRIBE 18 July 2001 compressor and decompressor. A major functionality of the signalling mechanisms is to provide robustness, i.e. maintain CD data coherency even under message losses, corruptions, and misordering. CD Data Management Signalling can be further subdivided into: * CD Data Population: This mechanism is invoked some time before the compressor and decompressor start compression/decompression. Signalling may take place between the compressor and decompressor, and/or involve a D-server (CD data server), where the S-profile and dynamic CD data is downloaded from. See Sections 2.7 and 2.3.3 for details about the S-profile and dynamic CD data, respectively. Although in this draft, we consider the case where the S-profile is downloaded from the D-server, other means to populate the S- profile at a CD entity are possible, including by preconfiguration. * CD Data Update Signalling: This signalling mechanism takes place between the compressor and decompressor during a compression/decompression session. The CD data can be updated from various sources, including the contents of the protocol messages themselves. The signalling ensures that the CD Data Update algorithm instances (see below) at the compressor and decompressor take as input the same updating data. CD Data Update Signalling also includes mechanisms needed in order to establish checkpoints and to roll back to those checkpoints. Checkpoint and rollback concepts are described in later sections. Schemes conforming to this framework may have CD Data Population only, CD Data Update only, or both. - CD Data Update algorithm: This is the algorithm used to update the CD data, taking as input the contents of the protocol message, and the CD data to be updated, and producing as output the updated CD data. - S-Profile Establishment: The CD algorithm, CD Data Management Signalling mechanisms, and CD Data Update algorithm, are attributes of a SCRIBE profile (S-profile) that the compressor and decompressor agree to use. S-profile establishment is the mechanism used to establish a common S-profile at the compressor and decompressor. The S-profile can be established by various means, including negotiation and/or preconfiguration. See Section 2.7 for the S-profile definition. Liu, Le, Leung, and Clanton [Page 4] INTERNET-DRAFT SCRIBE 18 July 2001 In this version of the draft, we focus on CD Data Management Signalling and specify its generic signalling mechanisms. We also specify a Template substitution CD algorithm. We define four S- profiles: 0, 1, 2, and 3. S-Profile 0 is a mandatory default scheme that MUST be implemented by all CD entities. 2.3. CD Data Organization Generally, the CD data consists of a static part, a profile-specific part, and a dynamic part. The CD data can be kept across sessions (also referred to as the "cross-session" CD data). The compressor and decompressor use some temporary memory locations during processing, as well. 2.3.1. Static Data The static part of the CD data is the part which can be populated simply from knowing what protocol is to be compressed. For example, if the protocol is SIP, the static part can include co-strings and/or literals associated with the protocol, such as "From:", "To:", etc. The static data part represents the "least common denominator" that can be used when compressing messages. It does not change over time, is relevant for all users of the protocol, and can be applied to all sessions. In general, when no other CD data is known by both CD entities involved in the compression and decompression processes, we are ensured that at least this part of the CD data can be applied. 2.3.2. Profile-Specific Part The profile-specific part of the CD data is the part which can be populated using information specific to a user. It is relevant for all sessions involving a particular user, and may change over time (but probably on an infrequent basis). The profile-specific part consists of a profile-specific dictionary and a template table, each of which may be empty. The profile- specific dictionary is a string of bytes with no internal structure. It can be used by string substitution compression algorithms. The template table can also be modelled as a dictionary (therefore it is usable by string substitution algorithms), but in addition, a template table has an internal structure that can be used by template substitution algorithms to further boost the compression efficiency. The user specific information may be stored in the profile-specific dictionary or in the template table. Liu, Le, Leung, and Clanton [Page 5] INTERNET-DRAFT SCRIBE 18 July 2001 2.3.2.1. Profile-Specific Dictionary The concept of profile-specific dictionary is based on the following observation that for protocols such as SIP, a given user/device combination will produce some messages containing fields that are always populated with the same data. For example, if we consider SIP, capabilities of the SIP endpoints are communicated during session initiation, and do not tend to change unless the device's capabilities change. Similarly, user-specific information such as a user's URL, a user's name, and a user's e-mail address likely won't change on frequent basis, and will appear regularly in SIP signalling exchanges involving a specific user. The profile-specific dictionary can be derived from a user profile before a session starts -- it might be obtained by downloading from some server, i.e. D-server. Following the downloading, the profile- specific dictionary can be used immediately to compress/decompress the protocol messages. The profile-specific dictionary can be used to improve the compression efficiency (particularly the first few messages in a session) compared to cases where e.g. only using the static CD data. See Appendix A for performance when the static and profile-specific CD data are used. 2.3.2.2. Template Table A template table is a list of templates. Logically, a template consists of an ordered list of fields. Each of them could be a static field (i.e. a fixed-length string) or a blank field to be filled, under the constraint that no adjacent fields can be of the same type. Each template is assigned a unique TID. The logical format of a template should be the uncompressed template format as defined in Section 4.2.2. How a template is stored physically at each CD entity is an implementation issue, as long as the logical content of a template is agreed by the two CD entities. 2.3.3. Dynamic Data The dynamic part of the CD data is neither static nor profile- specific. It is populated by making use of messages sent from/to a user. Liu, Le, Leung, and Clanton [Page 6] INTERNET-DRAFT SCRIBE 18 July 2001 The idea is to take advantage of the fact that messages and/or portions of messages associated with protocols such as SIP tend to have similar characteristics within a session, and from session to session. By keeping information across sessions, we can improve the compression ratio achievable in subsequent sessions. The concept is similar to that of the profile-specific dictionary, only the information kept need not to be user or device specific. E.g., dynamic CD data might be associated with a frequently contacted user. Its contents are determined in large part by a CD data update algorithm. Compressor and decompressor must ensure that their dynamic CD data are the same prior to compression/decompression. A mechanism for doing this (state verification) is described in the draft. In some cases, it may be necessary for a CD entity to download the needed dynamic CD data prior to compression/decompression (data downloading). 2.3.4. Temporary Data Stores and CD Data Construction When the dynamic part is updated with some information, to prevent incoherent CD data, care must be taken to ensure that both CD entities have the information, and the updates are performed in the same order at both CD entities. To assist in this process, SCRIBE utilizes three temporary data storage areas which aid the normal operation of the scheme. Temporary Sender Storage (TSS): It is a temporary storage for the update requests sent to a receiver. An update request specifies an operation to be performed, along with the data used for the update. Note that the operation may be implicit. For example, if both CD entities have agreed (by implicit or explicit negotiation) that the operation is to "append the data", it is sufficient to send only the data to be appended. The update requests are kept in the TSS until an indication comes from the receiver indicating a correct reception of the update request. When that indication comes, the update request is eligible to be applied to update the dynamic CD data at the sender. Temporary Receiver Storage (TRS): It is a temporary storage for the update requests received. An update request is kept in the TRS until an indication comes from the sender indicating a correct reception of the indication to the reception of the captioned update request. When that indication arrives, the update request is eligible to be applied to the dynamic CD data at the receiver. Liu, Le, Leung, and Clanton [Page 7] INTERNET-DRAFT SCRIBE 18 July 2001 Temporary Receiver Storage Table (TRST): In order to keep track of which indications correspond to which compressed messages at the receiver, a mapping of indication (e.g. acknowledgement) corresponding to a given compressed message received is maintained in this table. Use of these storage areas will become clearer in later sections which directly illustrate their functions. For the purpose of compression, the CD data can be viewed as consisting of the following parts, in the order shown: * the static part of the CD data; * the profile-specific part of the CD data; and * the dynamic part of the CD data, updated with all the update requests that are eligible in the temporary sender storage, and all update requests held in the temporary receiver storage. The updates are applied in the order specified by their total order relationship (see Section 2.5.3). The updates are temporary in the sense that the content of the CD data before updates is not lost. For the purpose of decompression, the CD data can be viewed as consisting of the following parts, in the order shown: * the static part of the CD data; * the profile-specific part of the CD data; and * the dynamic part of the CD data, updated with some update requests that are eligible in the temporary sender storage or the temporary receiver storage. The set of required updates can be inferred directly from the acknowledgements piggybacked with the compressed message. These updates are ordered according to their total order relationship. The updates are temporary in the sense that the content of the CD data before updates is not lost. There is an exception that an empty dynamic part of the CD data is used instead when a CD entity has already acknowledged the reception of a ROLL_INIT (see Section 3.4) but an acknowledgement to such an acknowledgement has not yet received. 2.4. CD Algorithm The following text describes the idea by using SIP as a concrete example. However, it should be noted that the model is applicable to Liu, Le, Leung, and Clanton [Page 8] INTERNET-DRAFT SCRIBE 18 July 2001 compressing any protocol messages. SIP is a text-oriented application layer signalling protocol to set up, maintain, and terminate a call. Readers can refer to [SIP] for details. 2.4.1. Dictionary-Based Algorithm A dictionary-based compression algorithm can be used to reduce the network bandwidth consumption of signalling messages, such as SIP messages. Figure 1 exhibits the message compression framework. There are two CD entities on the SIP message travelling path, designated as an E-ms and an E-net. Each CD entity consists of a compressor or a decompressor or both. When a SIP message, say, going from left to right is received by the compressor in an E-ms, it will be compressed using the CD data located in the E-ms according to Lempel-Ziv or other similar compression algorithms. These algorithms work by scanning through a file containing the message from left to right and replacing repeated strings by pointers, each of which is in the form of an offset and a length of match, to the last previous occurrence in the file. Provided that the CD data stored in a compressor at the E-ms and a decompressor at the E-net are coherent, the E-net can reconstruct an original message from the compressed one by applying the same CD data as the one E-ms used for compressing that particular message. An approach to describe how to construct a compressor/decompressor CD data is discussed in Section 2.3.4. Liu, Le, Leung, and Clanton [Page 9] INTERNET-DRAFT SCRIBE 18 July 2001 +------------------+ +--------------------+ | +--------------+ | | +----------------+ | | | CD Data | | | | CD Data | | | +--------------+ | | +----------------+ | | | | | | | | | original | | | | compressed | | | | message | +----------+ | | message | +------------+ | | --------------+>|Compressor|-) - + - - - - - -+>|Decompressor|-)---+----> | +----------+ | | | +------------+ | | | | | | | | decompressed | | | compressed | | | message | +--------------+ | message | +----------------+ | <-------------+-| Decompressor |<+- - - - - - +-| Compressor |<+----- | +--------------+ | | +----------------+ | +------------------+ +--------------------+ E-ms E-net Figure 1. Message Compression Framework. 2.4.2. Template-Based Algorithm Although the dictionary-based compression can yield a good compression efficiency, the efficiency can be further boosted by using "template-based" compression. The idea is that the compressor and decompressor maintain a set of templates, each consisting of static fields interspersed with blank fields. Each template is assigned a template identifier (TID). A string can be compressed using the best matched template to yield the highest compression efficiency. How it is done is an implementation issue. The result will be the TID of the template followed by fields carrying the data used to populate the blank fields in the template. The fields could be sub-strings of the original string that correspond to the blank fields in the template, or some compressed forms of the sub-strings. Figure 2 exhibits an example of the template-based encoding. The template has four static fields (ss0, ss2, ss4, and ss6) and three blank fields (b1, b3, and b5). It is assumed that all blank fields have variable lengths. For simplicity, the filling sub-strings (ss1, ss3, and ss5) are encoded in the form of . Again for simplicity, it is assumed that the value is the uncompressed form of the original sub-string. Liu, Le, Leung, and Clanton [Page 10] INTERNET-DRAFT SCRIBE 18 July 2001 A template +----------------------------------------------------+ | TID | ss0 | b1 | ss2 | b3 | ss4 | b5 | ss6 | +----------------------------------------------------+ A string to be compressed +-------------------------------------------------------+ | ss0 | ss1 | ss2 | ss3 | ss4 | ss5 | ss6 | +-------------------------------------------------------+ Compressed string using the template +--------------------------------------------+ | TID | L1 | ss1 | L3 | ss3 | L5 | ss5 | +--------------------------------------------+ TID: Template ID ss: Sub-string b: Blank field L: Length Figure 2. An Example of Template-Based Encoding. When compared with dictionary-based compression, template-based compression can achieve a higher compression efficiency, as the position and length information of the static parts have already been stored in a template, and do not need to be transferred. The template scheme is applied at the message level in this draft. One example application could be the INVITE messages emitted from a SIP terminal from session to session. Although the set of templates can be treated as the dynamic part of the CD data, a static set of templates is proposed in this draft for simplicity. The set of templates is a part of a user profile, and can be downloaded as a part of the profile-specific CD data before a session commences. 2.5. CD Data Management Signalling SCRIBE CD data management signalling is devised based on the following core concepts: state verification and data downloading, sliding handshake, total ordering of updates, checkpoint, and rollback. These concepts are discussed in the following subsections. Liu, Le, Leung, and Clanton [Page 11] INTERNET-DRAFT SCRIBE 18 July 2001 2.5.1. State Verification and Data Downloading When the CD data are maintained across sessions, it is necessary to have a mechanism for a pair of CD entities, i.e. E-ms and E-net, to verify that they have the same S-profile and CD data before or when a session begins. Such a mechanism is known as the state verification. In addition, another mechanism, known as the data downloading, is needed to enable E-net to download the S-profile and CD data from a D-server if it does not have one for E-ms. Of course, as the last resort, E-ms and E-net can always re- synchronize their S-profile and CD data via the bandwidth limited link connecting them. All this is done preferably before a call is set up, but this is not a mandatory requirement. 2.5.2. Sliding Handshake Each message (protocol or signalling) carries a sequence number. A message can be acknowledged. Acknowledgements are cumulative and selective. An acknowledgement is cumulative in the sense that multiple messages, e.g. Messages 5, 6, 7, and 8, can be acknowledged at the same time. It is selective because one can specify Messages 5, 6, and 8 are acknowledged, but not Message 7. An acknowledgement only needs to carry the sequence number of Message m to acknowledge a set of (m - n + 1) messages consecutively numbered from Message n to Message m, where Message (n - 1) is the last message whose acknowledgement is known to be received. A bit mask is sent when the messages to be acknowledged are not consecutively numbered, and indicates which messages have or have not been received. Most of the time, it is expected that messages are consecutively numbered, since non-consecutive numbering happens when messages are lost or misordered. Whenever an update request, which includes a message to update the dynamic CD data (DD), is sent from a CD entity, a copy of the update request is stored in a place called the temporary sender storage (TSS). When another CD entity receives the update request, it decompresses the message and the update request will be stored in a repository called the temporary receiver storage (TRS). An acknowledgement with respect to the arrival of the update request will be sent to the sender, and a mapping between the sequence number of the acknowledgement and that of the update request will be maintained in the temporary receiver storage table (TRST). When the sender receives the acknowledgement, the update request will become eligible and it will be applied to update the dynamic CD data based on the total order relationship of all known update requests (see Liu, Le, Leung, and Clanton [Page 12] INTERNET-DRAFT SCRIBE 18 July 2001 Section 2.5.3). An acknowledgement towards the reception of the acknowledgement of the update request is then sent to the receiver. Once that acknowledgement is received, the corresponding update request will become eligible and it will be applied to update the dynamic CD data based on the total order relationship of all known update requests. The sliding handshake procedure for the captioned message is therefore completed. Message misordering can be inferred from the fact that there is a gap in message sequence numbers is found. For example, the arrival of Message 1 followed by that of Message 3 to a CD entity indicates that Message 2 is either lost in transit or being misordered. Whenever such a misordered message arrival is suspected (i.e. the arrival of Message 3), updates to the CD data are deferred till the gap is filled (i.e. Message 2 is either received by the CD entity or considered lost). Furthermore, if a compressed message piggybacked with an acknowledgement to an update request and the update request has already been discarded, the compressed message must not be decompressed but discarded. 2.5.3. Total Order The total ordering of events in a distributed system was first investigated in [ORDER]. The paper defines the "happened before" relation, which gives a partial ordering of system events. Then, a system of logical clocks and the associated distributed algorithm are introduced to place a total ordering on the set of all system events. The problem of coherent CD data update described in this draft falls in the same area. However, it does not address the peculiarities of the problem. For example, messages transferred on the channel between two CD entities can be lost and/or misordered. This has to be solved by a combination of total ordering and the sliding three- way handshake mechanisms. Furthermore, logical clocks as described in [ORDER] are not used in this draft. Instead, a slightly different algorithm, which is tailored to this specific problem, is used to achieve an optimal solution. See Section 3.2.2 for details. 2.5.4. Checkpoint The checkpoint establishment procedure aims to verify that the CD entities have a coherent dynamic CD data. If this is the case, the dynamic CD data is stored for future reference, and the stored value is referred to as a checkpoint. Verification is done by using a strong CRC (Cyclic Redundant Code) calculation over the dynamic CD data. A version number is used to differentiate checkpoints established at different points in time. Liu, Le, Leung, and Clanton [Page 13] INTERNET-DRAFT SCRIBE 18 July 2001 2.5.5. Rollback At any point in time, a CD entity can initiate a rollback procedure, where both CD entities rollback in a synchronized fashion to a checkpoint specified with its version number. Rollback can be triggered when, for example, a CD data error or desynchronization is detected. 2.6. CD Data Update Algorithm In order to achieve a good compression efficiency, the dynamic CD data of the two CD entities may need to be updated in a consistent way. This is done through message exchanges. Various mechanisms are used in this document to meet the typical challenges in a distributed system, such as message losses, corruptions, misordering, concurrency, etc. The above message exchange mechanisms can support arbitrary update semantics. However, both CD entities must agree on the update protocol in advance. In general, an update may trigger an addition of content to the CD data, a deletion of content from the CD data, or any combinations of the two. The update semantics obviously depend on and are restricted by the structure of the CD data. To make a protocol out of infinite choices, in this version of the draft, we first assume the dynamic CD data to be a string of octets. Two update operations are used: addition, i.e. appending a message to the end of the dynamic CD data; and implicit deletion, deleting the oldest part of the dynamic CD data if the resulting dynamic CD data exceeds a maximum size. Besides these two "normal" update operations, a rollback procedure is also defined to handle the abnormal case where the current dynamic CD data is corrupted. The above update operation can be enhanced in many ways, with the cost of additional complexity. For example, a buffer of uncompressed strings is not the most efficient CD data structure. This drawback can be compensated for by including an improved addition operation, i.e. only appending strings that cannot be compressed using the current dynamic CD data. Another possible improvement is explicit deletion, which means one side explicitly signals the other side to delete certain portions of the dynamic CD data. 2.7. S-Profile An S-profile specifies the algorithms/mechanisms supported by a CD entity. An S-profile also includes the contents of those parts of Liu, Le, Leung, and Clanton [Page 14] INTERNET-DRAFT SCRIBE 18 July 2001 the CD data that can be derived from the profile-specific part of the CD data. An S-profile is encoded as a tuple of values, followed by the contents of the profile-specific CD data. The tuple of values defines the following attributes: - CD Algorithm and Associated Data Structures (CDA); - CD Data Management Signalling (CDDMS); and - CD Data Update Algorithm (CDDUA) (applicable only for some values of CDDMS). There is a default S-profile (S-Profile 0, defined below) that all CD entities MUST implement. CD Algorithm and Associated Data Structures ------------------------------------------- Possible values are 1: String substitution, a la Lempel-Ziv 2: Template substitution 3: Combination of 1 and 2 Other values are possible in the future. CD Data Management Signalling ----------------------------- Possible values are 1: CD data population by downloading from a D-server, no CD data update 2: CD data population by preconfiguration, no CD data update 3: CD data update only, no CD data population 4: CD data population by downloading from a D-server and CD data update 5: CD data population by preconfiguration and CD data update Liu, Le, Leung, and Clanton [Page 15] INTERNET-DRAFT SCRIBE 18 July 2001 CD Data Update Algorithm ------------------------ This attribute is applicable only when the CDDMS attribute has values 3, 4, or 5. Possible values are 1: Message is appended to the dynamic CD data. Oldest items are deleted to make room for the new message (implicit delete). An associated parameter is the implicit maximum size of the dynamic CD data. Other values are possible in the future. In this version of the draft, we consider Profiles 0, 1, 2, and 3, which have the following attributes. Profile Number CDA CDDMS CDDUA ----------------------------------------- 0 1 2 N/A 1 3 2 N/A 2 3 4 1 3 3 1 N/A S-Profile 0 is the default profile. CD data is preconfigured to be the static part of the CD data. All entities MUST implement S- Profile 0. 2.8. SCRIBE Message A SCRIBE message has a control part, and may or may not carry a compressed message. A compressed message is a signalling protocol message in its compressed form (note that in the "compressed form", the message may in reality be uncompressed). The control part carries signalling information between the SCRIBE compressor and decompressor, e.g. for the purpose of CD data management. When a compressed message is present, the control part may also carry the information needed to decompress, e.g. compression method used. Liu, Le, Leung, and Clanton [Page 16] INTERNET-DRAFT SCRIBE 18 July 2001 3. Procedures This section describes the protocol procedures, which include the state verification and data downloading procedure, dynamic CD data maintenance, checkpoint establishment procedure, rollback procedure, template related procedures, and implicit CD data content deletion. A set of state diagrams are included to illustrate the interactions of some of these procedures. 3.1. State Verification and Data Downloading In the following description of the state verification and data downloading procedure, the first step, i.e. Step K, can be piggybacked on a SIP registration message. However, the same procedure can be applied to the compression of other signalling protocol messages. Note that the procedure is robust to message losses and misordering. Figure 7 shows the concepts and message flows involved during the procedure. It is assumed that a D-server maintains the S-profile, along with the dynamic part of the CD data, if any, associated with the E-ms. The State Verification Procedure proceeds as follows: K) State Advertisement Procedure: The E-ms sends a state advertisement, called STATE_ADV, to the current E-net to advertise its S-profile and CD data. The message contains the version number of a checkpoint proposed by the E-ms, along with strong CRCs calculated over relevant data contents. After sending STATE_ADV, the E-ms starts a timer (TO-3) associated with the message. After timeout, the E-ms can repeat Step K up to a certain limit. The timer will be stopped when the E-ms receives STATE_ADV_ACK from E-net (see Step N below). L) State Advertisement Acknowledgement Procedure: Upon receiving a STATE_ADV, the E-net checks if it has the checkpoint with the version number advertised by the E-ms, and if the CRCs pass verification. If so, it goes to Step M. If not, the E-net can attempt to obtain the data identified by the version number by some means, e.g. by using the Data Downloading Procedure. If the procedure fails, it can always fall back to a default profile. In any event, it goes to Step M. M) The E-net sends a STATE_ADV_ACK back to the E-ms, with a version number. The version number is either the one originally proposed by the E-ms (if the E-net has the version of the checkpoint or was successful in obtaining it), or the one of the Liu, Le, Leung, and Clanton [Page 17] INTERNET-DRAFT SCRIBE 18 July 2001 default profile. The E-net also starts a timer (TO-5) after sending a STATE_ADV_ACK. If timeout occurs, the E-net can repeat the message up to a certain number of retransmissions. The timer will be stopped when it receives the state advertisement completion, called STATE_ADV_DONE (see Step N). N) State Advertisement Done Procedure: Upon receiving a STATE_ADV_ACK from the E-net, the E-ms will stop the TO-3. It verifies that the version number returned is compatible with its capabilities. If so, that version number will determine the S- profile along with the dynamic part of the CD data, that will be used between the E-ms and E-net. A STATE_ADV_DONE is then sent to the E-net to finish the three-way handshake. The Data Downloading Procedure proceeds as follows: O) Downloading Request Procedure: The E-net sends a downloading request, called DOWNLOAD_REQ, to a D-server to request for the checkpoint with the version number advertised in the STATE_ADV message. The corresponding CRCs (originally stored in the STATE_ADV) are also included in the DOWNLOAD_REQ. A globally unique user identifier is also included in the message to tell the D-server whose information is being requested. It is assumed that the E-net can derive the user identifier itself, such as based on the mapping from the device identifier of the E-ms. Otherwise, the E-ms can explicitly send its user identifier in the STATE_ADV. After sending the DOWNLOAD_REQ, the E-net starts a timer (TO-4). When the timer expires, it can repeat this step to resend the DOWNLOAD_REQ and restart the timer. The timer will be stopped when the E-net receives a downloading response, called DOWNLOAD_REP, from the D-server (see Step P below). The E-net exits the procedure with a failure indication when a number of DOWNLOAD_REQs have been sent. P) Downloading Response Procedure: After receiving a DOWNLOAD_REQ, the D-server searches the S-profile and CD data associated with the user identifier. It will check the CRCs in the DOWNLOAD_REQ message against the ones maintained locally. If the CRC check succeeds, it returns the information in a downloading response, called DOWNLOAD_REP, to the E-net. Q) Downloading Done Procedure: On the receiver side, the E-net will stop the TO-4 and store each item received in a DOWNLOAD_REP and exit the procedure with a success indication. Liu, Le, Leung, and Clanton [Page 18] INTERNET-DRAFT SCRIBE 18 July 2001 At the cost of some additional memory in the E-net, a cache could be used to maintain any information that is obtained from the D-server. Cached information might be maintained by the E- net for some finite amount of time and removed when e.g. - some timer expires; - some memory limitation is reached; or - some other criteria are met; e.g. some algorithm could be used to determine which data to insert or keep in the cache. Caching would minimize the number of E-net/D-server transactions and would result in - avoiding the downloading of the S-profile and/or checkpoint from the D-server in some cases; the state verification and data downloading procedure would be faster, since the downloading might not need to take place; and - faster transactions between the E-net and D-server when they do need to take place, since there should be fewer of these transactions. Note: To handle possible residual errors on the channel between the E-net and D-server, the E-net can optionally calculate the CRC for each data item received and check it against the corresponding CRC received from the E-ms. If the CRC check of the S-profile fails, all items will be discarded. Otherwise, if the CRC check of the checkpoint fails, the checkpoint will be discarded. Liu, Le, Leung, and Clanton [Page 19] INTERNET-DRAFT SCRIBE 18 July 2001 | | | (K) | STATE_ADV | | TO-3 starts |----------------------->| | | (L) | | | | | | (O) | DOWNLOAD_REQ | | TO-4 starts |----------------------->| | | | | | DOWNLOAD_REP | (P) | TO-4 stops |<-----------------------| | | (Q) | | | | | STATE_ADV_ACK | (M) | TO-3 stops |<-----------------------| TO-5 starts | | | | (N) | STATE_ADV_DONE | | |----------------------->| TO-5 stops | | | | E-ms E-net D-server Figure 7. State Verification and Data Downloading Procedure. 3.2. Dynamic CD Data Maintenance As noted before, the same view in the CD data on both CD entities is needed to allow a decompressor to recover a compressed message correctly. A CD data update request can be sent whenever a CD entity sends a protocol message, and decides that it wants to update the shared dynamic part of the CD data with that message. Here "update the dynamic part of the CD data with a message" means that the CD entity applies an algorithm taking as input the dynamic part of the CD data to be updated and the message. The output of the algorithm is the updated dynamic part of the CD data. For example, an update may entail appending a message to the CD data. But our proposal allows for other update procedures. Furthermore, a CD entity has an option to reject a CD data update which was initiated either by its peer or by itself (see Section 3.2.3). We assume that a CD data update cannot be undone. This non- revocation assumption is realistic since there is by no means to identify which update request an item in the CD data corresponds to unless some substantial overheads are needed to store tracking data. Our proposed CD data maintenance mechanism consists of keeping a set of temporary storages maintained at both CD entities and locally maintaining a total order relationship that are used to ensure the CD Liu, Le, Leung, and Clanton [Page 20] INTERNET-DRAFT SCRIBE 18 July 2001 data updates are applied in the same order at both CD entities, in addition to an enhanced three-way sliding handshake to deal with message losses or misordering. This guarantees CD data coherency. The maintenance of the total order relationship is essential to resolve potential conflicts in CD data updates whenever concurrent CD data updates occur. The concurrent CD data update condition is detected whenever a CD entity receives a CD data update request from its peer CD entity while it has an outstanding, unacknowledged update request. This means that it can be detected by a CD entity by the reception of a message that does not acknowledge all messages sent by the CD entity. Without the loss of generality, we define a CD entity, from which update requests are ordered before those initiated by its peer CD entity in the total order relationship for a sequence of CD data updates when concurrent updates are detected as a master CD entity, and its peer as a slave CD entity. The role (master/slave) of a CD entity can be chosen by a variety of ways. Some of them are listed as follows: * by a "static" assignment; * by defining at the commencement of a communication session; or * by re-defining during a session, e.g. at the rollback procedure. The rollback procedure is discussed in Section 3.4. The remaining parts of this sub-section are organized as follows. An enhanced three-way sliding handshake procedure is described. Then, a distributed approach to construct a total order relationship is investigated, and how to use the total order relationship to ensure the CD data updates are applied in the same order at each CD entity is discussed. 3.2.1. Sliding Handshake The function of the sliding handshake mechanism is to convey a signal that the shared dynamic part of the CD data is to be updated with an included message. It can then be used to assist in coherent CD data updates for messages transmitted over an unreliable reordering channel between a pair of CD entities. In this proposal, an enhanced three-way handshake mechanism is used. The mechanism is exhibited in Figure 3. Liu, Le, Leung, and Clanton [Page 21] INTERNET-DRAFT SCRIBE 18 July 2001 | | (A) | UPDATE_INIT | |------------------------------->| | | | UPDATE_ACK | (B) |<-------------------------------| | | (C) | UPDATE_ACK_ACK | |------------------------------->| | | (D) | | CD Entity X CD Entity Y Figure 3. Sliding Handshake for CD Data Maintenance. Below, an UPDATE_INIT is a single-bit information element that indicates the sender of a message wants to use the message to update the dynamic CD data. The sequence number of an UPDATE_INIT is that of the SCRIBE message containing that UPDATE_INIT. An UPDATE_INIT can also carry the information about the exact update algorithm. In this draft, we assume the default update algorithm (agreed upon by both CD entities) is used. Therefore, an UPDATE_INIT maps to the U- bit (see Section 4.1). This makes the protocol simple. However, the concepts and associated mechanisms described in this section can be easily extended to support more functionality, at a cost of additional complexity. Suppose a message (called M), which is used to assist future compression/decompression, is sent from CD Entity X to CD Entity Y. CD Entity X will perform the following steps: A1) Compress M using the compression CD data, as described in Section 2.3.4. The compressed message is denoted as CM, which is then transmitted to CD Entity Y. A2) If CD Entity X wants to update the CD data with M, it sends an update request, called UPDATE_INIT, piggybacked to the header of CM. A3) A copy of M is stored in the temporary sender storage (TSS). A4) Whenever allowed, piggyback acknowledgement, if any. When CD Entity Y receives CM, it will do the following: Liu, Le, Leung, and Clanton [Page 22] INTERNET-DRAFT SCRIBE 18 July 2001 B1) Decompress CM using the decompression CD data, as described in Section 2.3.4. The decompressed message is denoted as M'. Note that M' should be the same as M for a successful decompression. B2) Perform a CRC check with M'. If it fails the CRC check, discard the message and stop. Otherwise, continue to do Step B3. B3) If M' will be employed for subsequent compression/decompression (as indicated by CD Entity X), a copy of the UPDATE_INIT with M' is stored in the temporary receiver storage (TRS). B4) If an UPDATE_INIT is piggybacked with CM, and CD Entity Y determines it also wants to update the CD data, CD Entity Y sends a update acknowledgement (possibly delayed), called UPDATE_ACK. B5) A mapping between the sequence number of the UPDATE_ACK and that of CM is stored in the temporary receiver storage table (TRST). Note that an UPDATE_ACK may be piggybacked to the header of a message sent from CD Entity Y, and an acknowledgement can be cumulative. When CD Entity X receives an UPDATE_ACK, it will do the following steps: C1) Perform a CRC check with the UPDATE_ACK. If the CRC check is failed, discard the message and stop. Otherwise, continue to do Step C2. C2) The request corresponding to the UPDATE_ACK (in this case, the UPDATE_INIT of M) becomes eligible. CD Entity X uses M to update the dynamic part of the CD data. C3) Send an acknowledgement (possibly delayed), called UPDATE_ACK_ACK, to CD Entity Y about the reception of the UPDATE_ACK. Note that UPDATE_ACK_ACK may be piggybacked to the header of a message sent from CD Entity X, and an acknowledgement can be cumulative. When CD Entity Y receives an UPDATE_ACK_ACK, it will do the following steps: D1) Perform a CRC check with the UPDATE_ACK_ACK. If the CRC check is failed, discard the message and stop. Otherwise, continue to do Step D2. D2) Retrieve a mapping between the acknowledged sequence number and a list of sequence numbers that correspond to some requests Liu, Le, Leung, and Clanton [Page 23] INTERNET-DRAFT SCRIBE 18 July 2001 stored in the temporary receiver storage. Associated requests and messages (including M' in this example), of which the sequence numbers match with those in the list(s), become eligible. Corresponding updates to the dynamic part of the CD data are then applied. Note that, in the actual protocol implementations, both UPDATE_ACK and UPDATE_ACK_ACK MAY be piggybacked with a compressed message and they MUST share a single codepoint (i.e. ack_seq field) in the SCRIBE message header to achieve a higher encoding efficiency. In addition, whenever a CD entity knows that its peer CD entity receives a particular acknowledgement (by receiving an acknowledgement to that acknowledgement from the peer), it SHOULD stop sending that acknowledgement because of the non-revoking nature of acknowledgements. 3.2.2. Specification of Ordering Constraints An inconsistent view in the CD data at both CD entities can happen when shared CD data update requests, i.e. UPDATE_INITs, are sent concurrently by both CD entities. An approach to deal with this problem is to specify ordering constraints of all UPDATE_INITs sent by both CD entities so that the order is total and the same at both CD entities, at least for the eligible UPDATE_INITs. If the CD data updates are actually carried out, they are performed in the order specified by the ordering of the corresponding UPDATE_INITs. A distributed approach to construct such a total order relationship of a sequence of all known UPDATE_INITs is discussed as follows: * Local rule: When a CD data update request R' is sent before R" at the same CD entity, R' is scheduled to perform an update before R" at both CD entities. A sender CD entity schedules its own update requests according to their sequence numbers, whereas a receiver CD entity schedules these requests according to the request sequence numbers, which can be inferred from the sequence numbers of those SCRIBE messages containing these requests. * Causality rule: When a SCRIBE message containing a CD data update request R' piggybacks with acknowledgements, i.e. UPDATE_ACKs, of a set of update requests initiated by another CD entity, R' is scheduled to perform an update after all the updates corresponding to the acknowledged update requests. * Concurrent rule: This rule is applied by a CD entity when concurrent CD data updates are detected. Let SN(R) be the sequence number of Request R; HSN(R) be the highest sequence number of an update request, i.e. UPDATE_INIT, received by the Liu, Le, Leung, and Clanton [Page 24] INTERNET-DRAFT SCRIBE 18 July 2001 sender CD entity of Request R. Suppose R' and R" are update requests sent by CD Entities X and Y respectively. Upon the arrival of R", CD Entity X detects concurrency between R' and R" when SN(R') is greater than HSN(R") and SN(R") is greater than HSN(R'). The update schedule is defined as follows: - if CD Entity X is a master CD entity, R' is scheduled to perform an update before R"; and - if CD Entity X is a slave CD entity, R" is scheduled to perform an update before R'. The remaining of this sub-section describes an algorithm on how to implement these constraints. The idea is that a master CD entity determines the total order list (TOL), and a slave CD entity replicates the TOL by analyzing the signalling from the master CD entity. The master CD entity orders the update requests according to the time they are received or sent, with an adjustment for misordering based on the sequence number. The slave CD entity infers the same order from analyzing the sequence number and the HSN of messages received from the master CD entity. It is relatively easy to see that the algorithm can be used as the basis to prove that the order is total, and the same at both CD entities. For simplicity, it is assumed that all messages carry update request, but it is relatively easy to see that the algorithm described here can be extended in a straightforward manner to the case where not all messages carry update requests. Let CHS be the current highest sequence number of update requests received, SN(R) be the sequence number of Request R, REQ(N) be the update request with the sequence number N, HSN(R) be the value of CHS at the sender CD entity when R was sent, Succ(R) be the update request received with the smallest sequence number greater than SN(R), Pred(R) be the update request received with the largest sequence number smaller than SN(R), and Last(TOL) be the last element in the TOL. Initially, the TOLs at both CD entities are empty. The algorithm is described as follows. Liu, Le, Leung, and Clanton [Page 25] INTERNET-DRAFT SCRIBE 18 July 2001 At a master CD entity: ====================== Receive(R'): If TOL is empty or SN(R') > CHS Append R' to the TOL; CHS <- SN(R'); Else If SN(R') < CHS ; case of misordering from a slave ; CD entity to a master CD entity Insert R' in the TOL right before Succ(R'); Else ; error condition Discard R'. Send(R"): Append R'' to the TOL. Liu, Le, Leung, and Clanton [Page 26] INTERNET-DRAFT SCRIBE 18 July 2001 At a slave CD entity: ===================== Receive(R'): If TOL is empty TOL <- [REQ(INIT_SN), REQ(INIT_SN + 1), ..., REQ(HSN(R')), R']; ; INIT_SN is the initial sequence number Else if SN(R') < SN(Last(TOL)) ; case of misordering from a master ; CD entity to a slave CD entity If HSN(R') is HSN(Succ(R')) Insert R' right before Succ(R'); Else if HSN(R') < HSN(Succ(R')) If HSN(R') > HSN(Pred(R')) Order requests in the TOL according to the order as HSN(Pred(R')) + 1, ..., HSN(R'), SN(R'), HSN(R') + 1, ..., HSN(Succ(R')), Succ(R'); Else if HSN(R') is HSN(Pred(R')) Order requests in the TOL according to the order as SN(R'), HSN(R') + 1, ..., HSN(Succ(R')), Succ(R'); Else ; error condition Discard R'; Else ; error condition Discard R'; Else if SN(R') > SN(Last(TOL)) If HSN(R') is HSN(Last(TOL)) Append R' to the TOL; Else if HSN(R') > HSN(Last(TOL)) Order requests in the TOL according to the order as HSN(Last(TOL)) + 1, ..., HSN(R'), SN(R'); Else ; error condition Discard R'; Else ; error condition Discard R'. 3.2.3. Dynamic CD Data Update Mechanism This sub-section describes how the previously described total order and the three-way handshake are applied to update the dynamic part of the CD data in a consistent manner. First, we define the terminology for CD data updates. A known CD data update to a CD entity is an update that is initiated or received by the CD entity, or its existence can be inferred by some other means, say a gap in sequence numbers of a set of arrived update requests. An update request, i.e. UPDATE_INIT, becomes eligible at its sender CD entity whenever the CD Liu, Le, Leung, and Clanton [Page 27] INTERNET-DRAFT SCRIBE 18 July 2001 entity knows that the receiver CD entity has received it and accepted to proceed with the three-way handshake leading to the update, based on the reception of the corresponding acknowledgement, i.e. UPDATE_ACK. An UPDATE_INIT becomes eligible at its receiver CD entity whenever the CD entity has sent an UPDATE_ACK, and knows that the sender CD entity has received that UPDATE_ACK, based on the reception of the corresponding UPDATE_ACK_ACK. If an update request cannot be carried out because some other update requests preceding it (based on the total order) are not ready to be carried out, this request is considered blocked, regardless of whether it is eligible or not. Whenever the request is eligible, and all requests preceding that request are either eligible or considered lost or rejected, it becomes ready, and the corresponding update can be applied. An update request is considered lost at the receiver CD entity whenever its associated receiver timer (R-timer) is expired. One way to manage the R-timer associated with Message N is to start it at the first instance of the CD entity receiving some Message M, where M > N, and message N is not yet received. Optimal usages of timers are possible, e.g. share one timer among multiple missing messages. If the request arrives after the timer expires, an update request receiver rejection, called UPDATE_RECV_REJ, will be sent to the sender CD entity and the request will then be discarded. The timer value can be set to the maximum delay jitter on the path from the sender CD entity to the receiver CD entity. An update request is considered lost at the sender CD entity whenever the sender timer (S-timer) corresponding to the request is expired. One way to manage the S-timer of an update request is to start it when the CD entity receives an acknowledgement to some subsequent update requests sent by the CD entity, but not yet received an acknowledgement to the update request. The timer expires when the CD entity does not receive an UPDATE_ACK of that request within the timeout period. The S-timer value can be set to the sum of the initial R-timer value and the maximum delay jitter on the path from the receiver CD entity to the sender CD entity. Once the timer expires, the sender CD entity can discard the update request. An update request sender rejection, called UPDATE_SEND_REJ, will be sent to the receiver CD entity. When a CD entity receives a request rejection, i.e. UPDATE_SEND_REJ or UPDATE_RECV_REJ, the request corresponding to the rejection will be rejected and discarded. The timer for that request will also be cleared. Liu, Le, Leung, and Clanton [Page 28] INTERNET-DRAFT SCRIBE 18 July 2001 3.3. Checkpoint Establishment Procedure To ensure high compression efficiency maintained across sessions and to deal with a detected CD data incoherency between a pair of CD entities, a two-way handshake mechanism for creating a checkpoint is proposed, which is shown in Figure 4. | | (E) | CHKPT_INIT | TO-1 starts |------------------------------->| | | | CHKPT_ACK | (F) TO-1 stops |<-------------------------------| (G) | | | | CD Entity X CD Entity Y Figure 4. Two-Way Handshake for Creating a Checkpoint. The idea behind this is indeed similar to that described for the sliding handshake. Suppose CD Entity X initiates the creation for a checkpoint. It will perform the Checkpoint Initiation Procedure, which is described as follows: E1) Save a copy of the dynamic part of the CD data, updated with all the update requests that are eligible in the temporary sender storage, and all update requests held in the temporary receiver storage, together with a version number, to a repository called a checkpoint store. The updates are applied in the order specified by their total order relationship. The updates are temporary in the sense that the content of the CD data before updates is not lost. E2) Send a checkpoint initiation, called CHKPT_INIT, which includes a strong CD data CRC and the version number, to CD Entity Y. This control message piggybacks with an acknowledgement to a list of received messages. It can also piggybacks to a compressed message. E3) Start the checkpoint re-initiation timer (TO-1). E4) Delay all acknowledgements to SCRIBE message receptions until the corresponding CHKPT_ACK arrives. Liu, Le, Leung, and Clanton [Page 29] INTERNET-DRAFT SCRIBE 18 July 2001 When CD Entity Y receives CHKPT_INIT, it will do the Checkpoint Acknowledgement Procedure, which is described as follows: F1) Verify the CD data CRC with that of the dynamic CD data. If it fails the CRC check, discard the request, initiate a rollback (to be discussed in the next section). Otherwise, continue to Step F3. F2) Store a copy of the dynamic part of the CD data, updated with some update requests that are eligible in the temporary sender storage or the temporary receiver storage, together with the version number carried in a CHKPT_INIT, to a checkpoint store. The set of required updates can be inferred directly from the acknowledgements piggybacked with the compressed message. These updates are ordered according to their total order relationship. F3) Send a checkpoint acknowledgement, called CHKPT_ACK, to CD Entity X. Note that a CHKPT_ACK can be piggybacked to every outgoing message until an acknowledgement to a CHKPT_ACK is received. When CD Entity X receives CHKPT_ACK, it will perform the Checkpoint Done Procedure, which is described as follows: G1) Stop the checkpoint re-initiation timer (TO-1). G2) Resume acknowledgements (possibly delayed) to CD Entity Y for message reception. When CD Entity X receives ROLL_INIT to signal the rejection of a checkpoint request, it will stop the checkpoint re-initiation timer and perform a rollback as instructed (to be detailed in the Section 3.4). When the checkpoint re-initiation timer (TO-1) expires, the Checkpoint Initiation Procedure is redone again by the initiator, i.e. CD Entity X. The whole process will repeat, where the timer initial value is doubled each time the timer is restarted under this way. When both CD entities initiate a CHKPT_INIT at the same time, the one initiated by a master CD entity will go through. Furthermore, the CD data collected at each checkpoint is stored in an identifiable manner (by means of a checkpoint version number) such that any previously stored CD data can be employed for a CD data rollback whenever a CD data desynchronization has detected. Liu, Le, Leung, and Clanton [Page 30] INTERNET-DRAFT SCRIBE 18 July 2001 3.4. Rollback Procedure Whenever a CD data CRC mismatch is found by a CD entity, a two-way handshake mechanism is employed for a CD data rollback, which is exhibited in Figure 5. | | (H) | ROLL_INIT | TO-2 starts |------------------------------->| | | (I) | ROLL_ACK / ROLL_REJ | TO-2 stops |<-------------------------------| (J) | | | | CD Entity X CD Entity Y Figure 5. Two-Way Handshake for Rollback Procedure. Suppose CD Entity X finds a mismatch in the CD data CRC, it initiates a CD data rollback. It will perform the Rollback Initiation Procedure, which is described as follows: H1) Send a rollback initiation, called ROLL_INIT, to CD Entity Y. This control message carries a checkpoint version number to indicate which checkpoint is going to be used for CD data re- synchronization, and a CRC for that particular checkpoint. Note that a ROLL_INIT can be piggybacked to every outgoing message until an acknowledgement to a ROLL_INIT is received. H2) Clear the dynamic part of the CD data, all temporary storages, tables, and related data structures, such as the list for the total order relationship of known update requests. H3) Start the rollback re-initiation timer (TO-2). When CD Entity Y receives ROLL_INIT, it will perform the following steps: I1) Clear the dynamic part of the CD data, all temporary storages, tables, and related data structures, such as the list for the total order relationship of known update requests. I2) Verify the checkpoint CRC carried in ROLL_INIT with the one computed from the requested checkpoint. If the requested checkpoint is available and there is a match in the checkpoint Liu, Le, Leung, and Clanton [Page 31] INTERNET-DRAFT SCRIBE 18 July 2001 CRC, continue to do Step I3 as the Rollback Acknowledgement Procedure; otherwise, continue to do Step I5 as the Rollback Rejection Procedure. I3) Load the dynamic CD data as the specified checkpoint. I4) Send a rollback acknowledgement, called ROLL_ACK, to CD Entity X, and done. Note that a ROLL_ACK can be piggybacked to every outgoing message until an acknowledgement to a ROLL_ACK is received. CD Entity Y does not perform any CD data updates nor send any acknowledgements to received messages, of which the sequence numbers are smaller than that of the first ROLL_INIT seen. I5) Send a rollback rejection, called ROLL_REJ, to CD Entity X. Note that a ROLL_REJ can be piggybacked to every outgoing message until an acknowledgement to a ROLL_REJ is received. When CD Entity X receives ROLL_ACK, it will perform the Rollback Done Procedure, which is described as follows: J1) Stop the rollback re-initiation timer (TO-2). J2) Load the dynamic CD data as the specified checkpoint. When CD Entity X receives ROLL_REJ, it will perform the Rollback Re- initiation Procedure, which is described as follows: J3) Stop the rollback re-initiation timer (TO-2). J4) Initiate a new rollback to CD Entity Y, with another checkpoint version number. Note that a specific checkpoint version number can be used to indicate a empty checkpoint, which is always available and coherent. When the rollback re-initiation timer (TO-2) expires, ROLL_INIT is re-transmitted again by the initiator, i.e. CD Entity X. The whole process will then repeat. When both CD entities initiate a rollback approximately the same time, the one initiated by a master CD entity will go through. Rollback is also needed whenever a checkpoint initiation is rejected. An example is that a checkpoint CRC validation fails. Suppose CD Entity X sends a checkpoint initiation to CD Entity Y. However, the request is then rejected by CD Entity Y, which is shown in Figure 6. CD Entity Y simply sends a checkpoint initiation to signal the rejected request to CD Entity X. Whenever CD Entity X receives the Liu, Le, Leung, and Clanton [Page 32] INTERNET-DRAFT SCRIBE 18 July 2001 initiation, it does a rollback as requested and a rollback acknowledgement is sent to CD Entity Y to acknowledge the completion of the rollback. | | | CHKPT_INIT | |------------------------------->| | | | ROLL_INIT | |<-------------------------------| | | | ROLL_ACK | |------------------------------->| | | CD Entity X CD Entity Y Figure 6. An Example Showing How the Rollback Procedure is Used to Reject a Checkpoint Initiation. 3.5. Template Related Procedures 3.5.1. Compression and Decompression Procedures This section describes the procedures how to compress and decompress a message using a template. Refer to Section 4.1.6 for encoding formats. Compression Procedure TC1. Find the best template whose static fields match the given message. If no such template is found, stop; TC2. Start from the beginning of the message to the end, extract the string corresponding to each blank field. Prefix the string with an one-byte length field. The result is called a Filling Field (FF). TC3. Repeat Step TC2 until reaching the end of the message. The result will be a list of Filling Fields (FFL). TC4. Prefix the FFL with a C-bit and the 7-bit TID. The C-bit is used to indicate whether the FFL is compressed or not. Set the C-bit to 0 (zero) in this step. Liu, Le, Leung, and Clanton [Page 33] INTERNET-DRAFT SCRIBE 18 July 2001 TC5. Compress the FFL using the profile-specific dictionary and set the C-bit to 1. This step is optional and can be skipped. TC6. Transfer the final result (C-bit, TID, and FFL) as the compressed message in either the Long Sequence Number SCRIBE message (Section 4.1.2) or the No Sequence Number SCRIBE message (Section 4.1.3). Note that the CM field MUST be set to 2. Note: The steps are shown for clarity. During the implementation, the procedure can be optimized by combining Steps TC1, TC2, and TC3. Note: There is no need to transfer the number of FFs to the decompressor as it is part of the definition of a template. Decompression Procedure TD1. Fetch the template identified by TID. If not found, the message is regarded as corrupted. Stop and discard the message. TD2. If C-bit is 0 (zero), go to the next step. Otherwise, decompress the FFL using the profile-specific dictionary. This is a reverse operation of Step TC5. TD3. Construct the original message using the static parts stored in the template and the filling strings in the FFL. TD4. Perform the CRC check. Discard the message if it fails the check. 3.5.2. Template Management Procedures Since a static set of templates (i.e. template table) is used in this document, the main task of template management is to ensure the two CD entities have the same template table before it is used. This is achieved through the procedure as defined in Section 3.1 because template table is managed as a part of the profile-specific CD data. The management message formats are defined in Section 4.2. 3.6. Implicit CD Data Content Deletion To achieve the maximum compression efficiency, a CD entity may want to delete part of the CD data to make room for the new content. This is especially important if the CD entity has limited memory to store the CD data (as in the case of a mobile terminal). Liu, Le, Leung, and Clanton [Page 34] INTERNET-DRAFT SCRIBE 18 July 2001 With the implicit deletion, some part(s) of the CD data are chosen to be deleted according to a well defined algorithm that is known and applied in the same way at both CD entities. As input into the algorithm, one provides the total size to be deleted (e.g. number of bytes), and the algorithm specifies which part(s) are to be deleted. The freed room can then allow new content to be added to the CD data store. Since the same algorithm is applied at both CD entities, there is no need to explicitly signal which part(s) have been deleted. In particular, when the allocated memory has been filled up, each CD entity, when it wants to add items of combined size S, will implicitly delete part(s) of combined size S. Since there is no need for signalling, the scheme is more efficient (lower signalling overhead) and more robust (less prone to errors or losses affecting the signalling information). It is assumed that the two CD entities agree on the maximum size of the CD data store, either by a predefined constant or through negotiation before or at the beginning of a session. Figure 8 illustrates the procedure for the case where FIFO (First In First Out) policy is used for managing the dynamic CD data. In most cases, keeping the most recent content of the dynamic CD data will improve compression efficiency. least | +---------------+ / +---------------+ recent | |Occupied Memory| | | | | | (part to be | | |Occupied Memory| | | deleted) | /--> | (kept during | | +---------------+ \ / | | addition) | | | | | / | | | | |Occupied Memory| | / \ +---------------+ \ | | (part to be | >- | | | | | kept) | | + - - - - - - - + | New message | | | | | Extra | > to be added | +---------------+ / | space | | most | | Unused Memory | | needed | | recent V +---------------+ +---------------+ / Dynamic CD data before Dynamic CD data after adding a new message adding a new message Figure 8. An Example For Implicit CD Data Content Deletion. Liu, Le, Leung, and Clanton [Page 35] INTERNET-DRAFT SCRIBE 18 July 2001 3.7. State Diagrams 3.7.1. Checkpoint Establishment Procedure +-------------+ /---->| Established |<----\ | +-------------+ | | | | | | | | | | | | | (4)| |(1) (6)| |(7) | | | | | | | | | V V | +---------------------+ +---------------+ __| Waiting for |----->| Waiting for | | | CHKPT_INIT Response | (5) | CHKPT_ACK ACK | | +---------------------+ +---------------+ | ^ |_____________| (2,3) Figure 9. State Diagram for Checkpoint Establishment Procedure. Events / Actions: (1) Checkpoint for CD data needed / Execute Checkpoint Initiation Procedure (2) CHKPT_INIT from slave CD entity received / Discard CHKPT_INIT (3) TO-1 expired / Execute Checkpoint Initiation Procedure (4) CHKPT_ACK received / Execute Checkpoint Done Procedure (5) (CHKPT_INIT from master CD entity) and (Event 6) / (Stop TO-1) and (Execute Checkpoint Acknowledgement Procedure) (6) (CHKPT_INIT received) and (CD data CRC matched) / Execute Checkpoint Acknowledgement Procedure Liu, Le, Leung, and Clanton [Page 36] INTERNET-DRAFT SCRIBE 18 July 2001 (7) CHKPT_ACK acknowledged / Note: Event 2 does not happen on the slave CD entity, and Event 5 does not happen on the master CD entity. States: * Established: No rollback and no checkpoint establishment is under way. * Waiting for CHKPT_INIT Response: The CD entity has sent a CHKPT_INIT and is waiting for a CHKPT_ACK from its peer CD entity. Acknowledgements to SCRIBE message receptions are deferred till the corresponding CHKPT_ACK arrives. * Waiting for CHKPT_ACK ACK: The CD entity has sent a CHKPT_ACK and it is waiting for its acknowledgement from its peer CD entity. A CHKPT_ACK is piggybacked to every outgoing SCRIBE message, with the C-bit being set to 1. 3.7.2. Rollback Procedure Liu, Le, Leung, and Clanton [Page 37] INTERNET-DRAFT SCRIBE 18 July 2001 ________________________________________________ | (11) | | | +-------------+<-------------\ +-----------+ | | Established |______ | /--| [A] | | +-------------+ | | | +-----------+ | ^ | | | | | | | | | | | | | | | | | | | | (5)| |(1) |(8) (10)| |(8) |(11) | | | | | | | | | | | | | | | | V | | V | V +------------------+ | +--------------+ | +-----------------+ _| Waiting for | \-->| Waiting for | \->| Waiting for |_ | |ROLL_INIT Response|----->| ROLL_ACK ACK |<-----|ROLL_REJ Response| | | +------------------+ (6) +--------------+ (8) +-----------------+ | | ^ | | ^ ^ ^ | |________| | |____| | |_______| (2,3,4) | (9) | (11) | | |_________________________________________| (7) Figure 10. State Diagram for Rollback Procedure. Events / Actions: (1) CD data believed to be incoherent / Execute Rollback Initiation Procedure (2) (ROLL_INIT from slave CD entity) and (Event 9) / Discard ROLL_INIT (3) ROLL_REJ received / Execute Rollback Re-initiation Procedure (4) TO-2 expired / Resend ROLL_INIT (5) ROLL_ACK received / Execute Rollback Done Procedure (6) (ROLL_INIT from master CD entity) and (Event 8) / (Stop TO-2) and (Execute Rollback Acknowledgement Procedure) Liu, Le, Leung, and Clanton [Page 38] INTERNET-DRAFT SCRIBE 18 July 2001 (7) (ROLL_INIT from master CD entity) and (Event 11) / (Stop TO-2) and (Execute Rollback Rejection Procedure) (8) (ROLL_INIT received) and (Checkpoint available) and (Checkpoint CRC matched) / Execute Rollback Acknowledgement Procedure (9) ROLL_INIT received / Discard ROLL_INIT (10) ROLL_ACK acknowledged / (11) (ROLL_INIT received) and ((Checkpoint unavailable) or (Checkpoint CRC mismatched)) / Execute Rollback Rejection Procedure Note: In Event 1, a CD entity believes the CD data is incoherent when it has encountered M instances of compressed message CRC mismatches and/or N instances of CD data CRC mismatches, where M and N are positive integers. For example, it believes the CD data is incoherent when a checkpoint CRC validation fails upon the reception of a CHKPT_INIT. Besides, Event 2 does not happen on the slave CD entity, and Events 6 and 7 do not happen on the master CD entity. States: * Established: No rollback and no checkpoint establishment is under way, the dynamic CD data update can take place. * Waiting for ROLL_INIT Response: The CD entity has sent a ROLL_INIT and it is waiting for a response, i.e. ROLL_ACK or ROLL_REJ, from its peer CD entity. Each protocol message is compressed with an empty dynamic CD data. A ROLL_INIT is piggybacked with every compressed protocol message. The C-bit and U-bit are set to 1. No CHKPT_INIT is sent and all received CHKPT_INITs are discarded. * Waiting for ROLL_ACK ACK: The CD entity has sent a ROLL_ACK and it is waiting for its acknowledgement from its peer CD entity. For the purpose of decompression, the dynamic CD data is empty. A ROLL_ACK is piggybacked to every outgoing SCRIBE message, with the C-bit being set to 1. No CHKPT_INIT is sent and all received CHKPT_INITs are discarded. Liu, Le, Leung, and Clanton [Page 39] INTERNET-DRAFT SCRIBE 18 July 2001 * Waiting for ROLL_REJ Response: The CD entity has sent a ROLL_REJ and it is waiting for a response, i.e. a new ROLL_INIT, from its peer CD entity. Each protocol message is compressed without employing the dynamic CD data. A ROLL_REJ is piggybacked to every compressed message. The C-bit is being set to 1. No CHKPT_INIT is sent and all received CHKPT_INITs are discarded. * [A]: It represents a set of states including Waiting for CHKPT_INIT Response and Waiting for CHKPT_ACK ACK. 3.7.3. State Verification and Data Downloading Procedure E-ms ==== +---------------+ | Idle | +---------------+ | | | | | | (1)| |(4) | | | | V V +---------------+ +------------------+ __| Waiting for | | Waiting for | | | STATE_ADV ACK | | DOWNLOAD_REP ACK | | +---------------+ +------------------+ | ^ | | |_________| | | (2) | | |(3) (5)| | | | | V V +---------------+ | Established | +---------------+ Liu, Le, Leung, and Clanton [Page 40] INTERNET-DRAFT SCRIBE 18 July 2001 E-net ===== +----------------+ | Idle | +----------------+ | | | | | | (6)| |(9) | | | | V V +-------------------+ +------------------+ __| Waiting for | | Waiting for |__ | | STATE_ADV_ACK ACK | | DOWNLOAD_REQ ACK | | | +-------------------+ +------------------+ | | ^ | | ^ | |___________| | | |___________| (7) | | (10) |(8) (11)| | | | | V V +---------------+ | Established | +---------------+ Figure 11. State Diagram for State Verification and Data Downloading Procedure. Events / Actions: B (1) State advertisement needed / Execute State Advertisement Procedure (2) (TO-3 expired) or ((STATE_ADV_ACK received) and (no matching item)) / Execute State Advertisement Procedure (3) (STATE_ADV_ACK received) and (a matching item existed) / Execute State Advertisement Done Procedure Liu, Le, Leung, and Clanton [Page 41] INTERNET-DRAFT SCRIBE 18 July 2001 (4) DOWNLOAD_REQ received / Perform Downloading Response Procedure (5) DOWNLOAD_REP acknowledged / (6) STATE_ADV received / Perform State Advertisement Acknowledgement Procedure (7) TO-5 expired / Perform State Advertisement Acknowledgement Procedure (8) STATE_ADV_DONE received / Stop TO-5 (9) Data downloading needed / Perform Downloading Request Procedure (10) TO-4 expired / Perform Downloading Request Procedure (11) DOWNLOAD_REP received / Perform Downloading Done Procedure States: * Idle: No compressor/decompressor CD data is established between two CD entities. * Established: No rollback and no checkpoint establishment is under way. * Waiting for STATE_ADV ACK: The CD entity has sent a STATE_ADV and it is waiting for a STATE_ADV_ACK from its peer CD entity. * Waiting for STATE_ADV_ACK ACK: The CD entity has sent a STATE_ADV_ACK and it is waiting for STATE_ADV_DONE from its peer CD entity. * Waiting for DOWNLOAD_REQ ACK: The CD entity has sent a DOWNLOAD_REQ to a D-server (can be the E-ms) and it is waiting for a DOWNLOAD_REP from the D-server. Liu, Le, Leung, and Clanton [Page 42] INTERNET-DRAFT SCRIBE 18 July 2001 * Waiting for DOWNLOAD_REP ACK: The CD entity has sent a DOWNLOAD_REP and it is waiting for its acknowledgement from its peer CD entity. Note: When a CD entity is at the following states: Idle, Waiting for STATE_ADV ACK, Waiting for STATE_ADV_ACK ACK, Waiting for DOWNLOAD_REQ ACK, and DOWNLOAD_REP ACK, each protocol message is compressed using the static CD data. 4. SCRIBE Message Formats The following sub-sections describe two kinds of SCRIBE message formats: the normal message formats that are usually used during sessions; and the special message formats that are used for CD data management signalling, which usually happens before a session begins. Note that the normal SCRIBE messages can also carry CD data management signalling information. Each SCRIBE message starts with a Packet Type (PT) field. The values defined in this draft are: * PT = 0: for normal messages with a short sequence number; * PT = 10: for normal messages with a long sequence number; * PT = 110: for special CD data management signalling messages; * PT = 1110: for normal messages with no sequence number; A SCRIBE message may carry a sub-type field after PT. The convention of ROHC [RFC3095] packet formats will be used: - colons ":" indicate that the part is optional - slashes "/" indicate a variable length field 4.1. Normal Message Formats Each normal SCRIBE message is assigned a sequence number by the sender (see exception below). The sender maintains locally a two- byte counter for the purpose of the assignment. The counter is incremented by 1 after each assignment. Liu, Le, Leung, and Clanton [Page 43] INTERNET-DRAFT SCRIBE 18 July 2001 In the SCRIBE message header, only least significant bits (LSB) of a sequence number are carried. The W-LSB encoding, as described in Section 4.5.2 of [RFC3095], is used here. To tolerate message misordering, p value is set to 2^(k-2) - 1. The encoding rules determine which SCRIBE message formats (with short or long sequence number) should be used by the sender. Furthermore, the short sequence number format can only carry a compressed message using the default compression algorithm, which is the dictionary-based compression unless configured otherwise. One exception to the above is the no sequence number format. It is optimized for systems that are configured to use the static and profile-specific CD data only. 4.1.1. Short Sequence Number Liu, Le, Leung, and Clanton [Page 44] INTERNET-DRAFT SCRIBE 18 July 2001 +---+---+---+---+---+---+---+---+ | 0 | seq | A | C | U | +---+---+---+---+---+---+---+---+ | CRC | +---+---+---+---+---+---+---+---+ : : if A = 1 / ACK_INFO / variable length : : +---+---+---+---+---+---+---+---+ : : if C = 1 / CONTROL_INFO / variable length : : +---+---+---+---+---+---+---+---+ : : / compressed message / variable length : : (could be zero) +---+---+---+---+---+---+---+---+ seq - The 4 LSBs of the sequence number of this message. A - ACK bit, indicating the presence of ACK information element (ACK_info). See Section 4.1.4. if A = 1, ACK_info is present = 0, not present C - Control Information bit, indicating the presence of a CD data control element (CONTROL_INFO). See section 4.1.5. if C = 1, CONTROL_INFO is present = 0, not present U - UPDATE_INIT bit. If U = 1, this message will be used to update the CD data. Otherwise, it will not. CRC - Calculated over concatenation of the SCRIBE message and the original message (if this SCRIBE message contains a compressed message). Note: It is assumed that the total length of a SCRIBE message is provided by the lower layer (be it UDP or TCP or link layer). Since the length of the header part is self-contained, the length of the compressed message can be derived by subtracting the header length from the total SCRIBE message length. The compressed message length could be zero (i.e. a stand alone SCRIBE message for CD data management signalling only). Liu, Le, Leung, and Clanton [Page 45] INTERNET-DRAFT SCRIBE 18 July 2001 4.1.2. Long Sequence Number +---+---+---+---+---+---+---+---+ | 1 | 0 | CM | A | C | U | +---+---+---+---+---+---+---+---+ | seq | +---+---+---+---+---+---+---+---+ | CRC | +---+---+---+---+---+---+---+---+ : : if A = 1 / ACK_INFO / variable length : : +---+---+---+---+---+---+---+---+ : : if C = 1 / CONTROL_INFO / variable length : : +---+---+---+---+---+---+---+---+ : : / compressed message / variable length : : (could be zero) +---+---+---+---+---+---+---+---+ CM - Compression Method, indicating which compression method was used to compress the original message. CM = 0, if NULL compression was used (i.e. the message was uncompressed) 1, if dictionary-based compression was used 2, if template-based compression was used other values are reserved for future extension Other fields are the same as in the previous section. Note: The NULL compression is intended to be used as a fallback under abnormal conditions. Note: Refer to Section 4.1.6 for the compressed message format when a template is used. 4.1.3. No Sequence Number Liu, Le, Leung, and Clanton [Page 46] INTERNET-DRAFT SCRIBE 18 July 2001 +---+---+---+---+---+---+---+---+ | 1 | 1 | 1 | 0 | CM | +---+---+---+---+---+---+---+---+ | CRC | +---+---+---+---+---+---+---+---+ : : / compressed message / variable length : : (could be zero) +---+---+---+---+---+---+---+---+ CM and CRC are the same as in previous section. 4.1.4. Acknowledgement (ACK_INFO) The first 2 bits indicate the type of ACK_INFO. The sender SHOULD use the smallest message format that can sufficiently convey the information. 1) Accumulative acknowledgement (ACCU_ACK) +---+---+---+---+---+---+---+---+ | 0 | 0 | ack_seq | +---+---+---+---+---+---+---+---+ ack_seq - The 6 LSBs of the sequence number of the SCRIBE message to be acknowledged. This format is used only if the receiver has received all the SCRIBE messages with sequence numbers equal to or less than ack_seq, AND it accepts all the UPDATE_INITs piggybacked on those messages. 2) ACK with negative information (ACK_NEG) +---+---+---+---+---+---+---+---+ | 0 | 1 | ack_seq | +---+---+---+---+---+---+---+---+ | | / NEGATIVE_INFO / | | +---+---+---+---+---+---+---+---+ ack_seq - Same as above. Liu, Le, Leung, and Clanton [Page 47] INTERNET-DRAFT SCRIBE 18 July 2001 This format can be used when some SCRIBE messages with sequence numbers smaller than ack_seq have not been received. The NEGATIVE_INFO contains a list of elements, each of which can be either a NEG_BM or NEG_SEQ: NEG_BM +---+---+---+---+---+---+---+---+ | E |B=1| missing_bit_mask | +---+---+---+---+---+---+---+---+ NEG_SEQ +---+---+---+---+---+---+---+---+ | E |B=0| missing_seq | +---+---+---+---+---+---+---+---+ The NEGATIVE_INFO MUST be parsed by the receiver from the beginning to the end as follows: a) Initialize: ref_seq = decompress(ack_seq) and bm_offset = 0; b) If the element is a NEG_SEQ, decompress missing_seq. The result is the sequence number of a SCRIBE message that has not been received. Reset ref_seq = decompress(missing_seq) and bm_offset = 0; c) If the element is a NEG_BM, read the bit pattern from the most significant bit to the least significant bit (i.e. from left to right). When a bit is read, first increment bm_offset by 1. Then check whether the bit value is 1. If it is so, a SCRIBE message with the sequence number (ref_seq - bm_offset) is missing; and d) If the E-bit of the current element equals to 1, this is the last element of the NEGATIVE_INFO. Stop. Otherwise, go to Step b and continue the parsing process. Note that the above format gives the flexibility to handle different message loss patterns. Liu, Le, Leung, and Clanton [Page 48] INTERNET-DRAFT SCRIBE 18 July 2001 3) ACK with long sequence number (ACK_LONG) +---+---+---+---+---+---+---+---+ | 1 | 0 |RES| neg_elem_num | +---+---+---+---+---+---+---+---+ | ack_seq | +---+---+---+---+---+---+---+---+ : : / NEG_ELEM_LIST / length = 2 * neg_elem_num : : (can be zero) +---+---+---+---+---+---+---+---+ This format is used when the sender needs to send 8 LSBs of a sequence number (either ACKed or NACKed). RES - Reserved bit. MUST be set to zero. neg_elem_num - The number of NEG_ELEM in the NEG_ELEM_LIST, can be zero. ack_seq - Same as above. NEG_ELEM_LIST - A list of NEG_ELEMs (see below). Each NEG_ELEM is two-byte long and has the following format: +---+---+---+---+---+---+---+---+ | missing_seq | +---+---+---+---+---+---+---+---+ | missing_bit_mask | +---+---+---+---+---+---+---+---+ The interpretation of missing_bit_mask is the same as that in NEG_BM mentioned above, using decompress(missing_seq) in the same NEG_ELEM as the reference. Essentially, each NEG_ELEM represents a cluster of missing messages. Note: missing_bit_mask may contain only zeros, in which case the field is wasted. This is a tradeoff between coding efficiency and simplicity. Note: The maximum value of neg_elem_num is 29 = ceiling(256 / (1 + 8)). This means that the maximum length of NEG_ELEM_LIST = 2 * 29 = 58 bytes. This is also the reason why neg_elem_num field is 5-bit long. Liu, Le, Leung, and Clanton [Page 49] INTERNET-DRAFT SCRIBE 18 July 2001 4.1.5. CONTROL_INFO The element is used to carry information related to a) checkpoint establishment procedure; b) rollback procedure; and c) update rejection (either UPDATE_RECV_REJ or UPDATE_SEND_REJ). The first 3 bits of the field indicate what information is carried inside. Note that the bit pattern "111" is unused in this draft and reserved for future use. 4.1.5.1. Checkpoint CONTROL_INFO Each checkpoint is assigned with a five-bit version number, which gives a range from 0 to 31. In addition, version 0 is reserved to indicate the empty checkpoint. The version number of the new checkpoint is always the current version number plus one, except that Version 0 is skipped. 1) CHKPT_INIT +---+---+---+---+---+---+---+---+ | 0 | 0 | 0 | dict_ver | +---+---+---+---+---+---+---+---+ | | + CRC + 2 octets | | +---+---+---+---+---+---+---+---+ dict_ver - The version number assigned to this checkpoint. CRC - Calculated over the checkpoint. 2) CHKPT_ACK +---+---+---+---+---+---+---+---+ | 0 | 0 | 1 | dict_ver | +---+---+---+---+---+---+---+---+ dict_ver - Copied from the CHKPT_INT information element. Liu, Le, Leung, and Clanton [Page 50] INTERNET-DRAFT SCRIBE 18 July 2001 4.1.5.2. Rollback CONTROL_INFO 1) ROLL_INIT +---+---+---+---+---+---+---+---+ | 0 | 1 | 0 | dict_ver | +---+---+---+---+---+---+---+---+ | | + CRC + 2 octets | | +---+---+---+---+---+---+---+---+ dict_ver - The version number of the checkpoint that the sender wants to roll back to. CRC - Calculated over the target checkpoint. 2) ROLL_ACK +---+---+---+---+---+---+---+---+ | 0 | 1 | 1 | dict_ver | +---+---+---+---+---+---+---+---+ dict_ver - Copied from the ROLL_INIT information element. 3) ROLL_REJ +---+---+---+---+---+---+---+---+ | 1 | 0 | 0 | dict_ver | +---+---+---+---+---+---+---+---+ dict_ver - Copied from the ROLL_INIT information element. 4.1.5.3. Update Rejection CONTROL_INFO This CONTROL_INFO is used to reject update requests which are either received by the sender of this CONTROL_INFO (i.e. UPDATE_RECV_REJ) or initiated by it (i.e. UPDATE_SEND_REJ). Each update request is identified by the sequence number of the SCRIBE message to which it was piggybacked. Liu, Le, Leung, and Clanton [Page 51] INTERNET-DRAFT SCRIBE 18 July 2001 +---+---+---+---+---+---+---+---+ | 1 | 0 | 1 | R | S | +---+---+---+---+---+---+---+---+ : recv_rej_num : if R = 7 +---+---+---+---+---+---+---+---+ : send_rej_num : if S = 3 +---+---+---+---+---+---+---+---+ : recv_rej_seq-0 : +---+---+---+---+---+---+---+---+ ~ ~ ~ ~ +---+---+---+---+---+---+---+---+ : recv_rej_seq-last : +---+---+---+---+---+---+---+---+ : send_rej_seq-0 : +---+---+---+---+---+---+---+---+ ~ ~ ~ ~ +---+---+---+---+---+---+---+---+ : send_rej_seq-last : +---+---+---+---+---+---+---+---+ R - If value < 7, it indicates the number of recv_rej_seq; otherwise, it indicates the presence of recv_rej_num. S - If value < 3, it indicates the number of send_rej_seq; otherwise, it indicates the presence of send_rej_num. recv_rej_num - If present, indicates the number of recv_rej_seq. send_rej_num - If present, indicates the number of send_rej_seq. 4.1.5.4. CD Data Update Using Additional Content +---+---+---+---+---+---+---+---+ | 1 | 1 | 0 | length | +---+---+---+---+---+---+---+---+ | length | +---+---+---+---+---+---+---+---+ | | / content / variable length | | +---+---+---+---+---+---+---+---+ length - Indicates the length of content that is used to update the dynamic part of the CD data. Liu, Le, Leung, and Clanton [Page 52] INTERNET-DRAFT SCRIBE 18 July 2001 4.1.6. Compressed Message Format Using Template When template-based compression is used, the format of the compressed message can be either TBCM-1 or TBCM-2. TBCM-1, Filling Field List (FFL) is uncompressed: +---+---+---+---+---+---+---+---+ |C=0| TID | +---+---+---+---+---+---+---+---+ | | / Filling-Field-0 / variable length | | +---+---+---+---+---+---+---+---+ ~ ~ ~ ~ +---+---+---+---+---+---+---+---+ : : / Filling-Field-i / variable length : : +---+---+---+---+---+---+---+---+ ~ ~ ~ ~ +---+---+---+---+---+---+---+---+ : : / Filling-Field-n / variable length : : +---+---+---+---+---+---+---+---+ Format of a Filling Field (FF): +---+---+---+---+---+---+---+---+ | Length-i | +---+---+---+---+---+---+---+---+ | | / Filling-String-i / length = Length-i | | +---+---+---+---+---+---+---+---+ Liu, Le, Leung, and Clanton [Page 53] INTERNET-DRAFT SCRIBE 18 July 2001 TBCM-2, FFL is compressed: +---+---+---+---+---+---+---+---+ |C=1| TID | +---+---+---+---+---+---+---+---+ | | / compressed FFL / variable length | | +---+---+---+---+---+---+---+---+ 4.2. Special CD Data Management Message Formats 4.2.1. General Formats Liu, Le, Leung, and Clanton [Page 54] INTERNET-DRAFT SCRIBE 18 July 2001 1) State Advertisement (STATE_ADV) +---+---+---+---+---+---+---+---+ | 1 | 1 | 0 | sub-type = 0 | +---+---+---+---+---+---+---+---+ | RES | P | dict_ver | +---+---+---+---+---+---+---+---+ | | + CRC-packet + 2 octets | | +---+---+---+---+---+---+---+---+ : : + CRC-profile + if P = 1 : : (2 octets) +---+---+---+---+---+---+---+---+ : : + CRC-dynamic + if dict_ver != 0 : : (2 octets) +---+---+---+---+---+---+---+---+ RES - Reserved bits. MUST be set to 0 (zero). P - When set to 1, indicates the presence of CRC-profile. dict_ver - Indicates the version number of the checkpoint. CRC-packet - Calculated over the SCRIBE message itself to detect transmission errors. CRC-profile - Calculated over the S-profile (see below). CRC-dynamic - Calculated over the checkpoint. For the purpose of CRC calculation, the S-profile MUST be modeled as follows: +---+---+---+---+---+---+---+---+ 1 byte (CD Algorithm attribute, | CDA value | refer to Section 2.6) +---+---+---+---+---+---+---+---+ 1 byte (CD Data Management Signalling | CDDMS value | attribute, refer to Section 2.6) +---+---+---+---+---+---+---+---+ 1 byte (CD Data Update attribute, | CDDUA value | refer to Section 2.6), if present +---+---+---+---+---+---+---+---+ : : / Profile-specific CD data / variable length (at least 1 byte) : : +---+---+---+---+---+---+---+---+ The profile-specific CD data MUST be modeled as follows: Liu, Le, Leung, and Clanton [Page 55] INTERNET-DRAFT SCRIBE 18 July 2001 +---+---+---+---+---+---+---+---+ : : / Template table / variable length (at least 1 byte) : : +---+---+---+---+---+---+---+---+ : : / Profile-specific dictionary / variable length (could be zero) : : +---+---+---+---+---+---+---+---+ Note: For the purpose of CRC calculation, each template in the template table MUST be in the uncompressed (logical) format. See Section 4.2.2. 2) State Advertisement Acknowledgement (STATE_ADV_ACK) +---+---+---+---+---+---+---+---+ | 1 | 1 | 0 | sub-type = 1 | +---+---+---+---+---+---+---+---+ | RES | P | dict_ver | +---+---+---+---+---+---+---+---+ | | + CRC-packet + 2 octets | | +---+---+---+---+---+---+---+---+ RES - Reserved bits. MUST be set to 0 (zero). P-bit - Set to 1 if the sender has the S-profile which has passed the CRC test; otherwise, set to 0 (zero). dict-ver - Set to the version of the checkpoint which the sender has and which has passed the CRC test; otherwise, set to 0 (zero). CRC-packet - Calculated over the SCRIBE message itself Liu, Le, Leung, and Clanton [Page 56] INTERNET-DRAFT SCRIBE 18 July 2001 3) State Advertisement Done (STATE_ADV_DONE) +---+---+---+---+---+---+---+---+ | 1 | 1 | 0 | sub-type = 2 | +---+---+---+---+---+---+---+---+ | RES | P | dict_ver | +---+---+---+---+---+---+---+---+ | | + CRC-packet + 2 octets | | +---+---+---+---+---+---+---+---+ RES - Reserved bits. MUST be set to 0 (zero). CRC-packet - Calculated over the SCRIBE message itself. P-bit and dict_ver must be copied from the corresponding STATE_ADV_ACK message. Liu, Le, Leung, and Clanton [Page 57] INTERNET-DRAFT SCRIBE 18 July 2001 4) DOWNLOAD_REQ +---+---+---+---+---+---+---+---+ | 1 | 1 | 0 | sub-type = 3 | +---+---+---+---+---+---+---+---+ |RES| P | D | dict_ver | +---+---+---+---+---+---+---+---+ | | + CRC-packet + 2 octets | | +---+---+---+---+---+---+---+---+ / user-ID / variable length +---+---+---+---+---+---+---+---+ : : + CRC-profile + if P = 1 : : (2 octets) +---+---+---+---+---+---+---+---+ : : + CRC-dynamic + if dict_ver != 0 : : (2 octets) +---+---+---+---+---+---+---+---+ RES - Reserved bit. MUST be set to 0 (zero). P - When set to 1, indicates the sender wants to download the S-profile for the user. D - When set to 1, indicates the sender wants to download the checkpoint (see notes below). user-ID - Used to identify the user, encoded according to Section 4.5.6 in [RFC3095]. Note: If D = 1 and dict_ver = 0 (zero), this means the sender wants to download the latest checkpoint regardless its version number. Note: If D = 0 (zero), the dict_ver MUST be set to 0 (zero) by the sender. A SCRIBE message with D = 0 and dict_ver != 0 MUST be discarded by the receiver. Liu, Le, Leung, and Clanton [Page 58] INTERNET-DRAFT SCRIBE 18 July 2001 5) DOWNLOAD_REP +---+---+---+---+---+---+---+---+ | 1 | 1 | 0 | sub-type = 4 | +---+---+---+---+---+---+---+---+ |RES| P | D | dict_ver | +---+---+---+---+---+---+---+---+ | | + CRC-packet + 2 octets | | +---+---+---+---+---+---+---+---+ / user-ID / variable length +---+---+---+---+---+---+---+---+ : : / length-P / if P = 1 : : (variable length) +---+---+---+---+---+---+---+---+ : : / length-D / if dict_ver != 0 : : (variable length) +---+---+---+---+---+---+---+---+ : : / profile-specific CD data / if P = 1 : : (variable length) +---+---+---+---+---+---+---+---+ : : / checkpoint / if D = 1 : : (variable length) +---+---+---+---+---+---+---+---+ RES - Reserved bit. MUST be set to 0 (zero). CRC-packet - Calculated over the SCRIBE message itself. user-ID - MUST be copied from the corresponding DOWNLOAD_REQ message. length-P - Indicates the length of the profile-specific CD data, encoded according to Section 4.5.6 in [RFC3095]. length-D - Indicates the checkpoint length, encoded according to Section 4.5.6 in [RFC3095]. 4.2.2. Template and Template Table Formats This sub-section specifies the template table formats that are used for template data downloading/synchronization. Liu, Le, Leung, and Clanton [Page 59] INTERNET-DRAFT SCRIBE 18 July 2001 Format of a Static Field (SF): +---+---+---+---+---+---+---+---+ | Length-i | +---+---+---+---+---+---+---+---+ | | / Static-String-i / length = Length-i | | +---+---+---+---+---+---+---+---+ Format of an uncompressed template: (Also referred to as the logical format of a template) +---+---+---+---+---+---+---+---+ |C=0| TID | +---+---+---+---+---+---+---+---+ | t1| num_of_fields | +---+---+---+---+---+---+---+---+ | | / SF-0 / variable length | | +---+---+---+---+---+---+---+---+ ~ ~ ~ ~ +---+---+---+---+---+---+---+---+ : : / SF-i / variable length : : +---+---+---+---+---+---+---+---+ ~ ~ ~ ~ +---+---+---+---+---+---+---+---+ : : / SF-last / variable length : : +---+---+---+---+---+---+---+---+ C - Set to 0 (zero) to indicate the Static Field List (SFL) after num_of_fields field is not compressed. t1 - Indicates the type of the first field in this template. 0 (zero) means a blank field. 1 means a static field. num_of_fields - Total number of fields in the template, including both static and blank fields. Liu, Le, Leung, and Clanton [Page 60] INTERNET-DRAFT SCRIBE 18 July 2001 Note: By combining the values of t1 and the num_of_fields field, one can derive the overall structure of the template because a blank field must follow a static field by definition, and vice versa. Format of a compressed template: +---+---+---+---+---+---+---+---+ |C=1| TID | +---+---+---+---+---+---+---+---+ | num_of_fields | +---+---+---+---+---+---+---+---+ | | / compressed SFL / variable length | | +---+---+---+---+---+---+---+---+ C - Set to 1 to indicate the Static Field List (SFL) is compressed using the profile-specific dictionary. Note: The receiver of a compressed template will first decompress the compressed SFL. The result will be the template in the uncompressed format. Liu, Le, Leung, and Clanton [Page 61] INTERNET-DRAFT SCRIBE 18 July 2001 Format of a template table: +---+---+---+---+---+---+---+---+ | num_of_templates | +---+---+---+---+---+---+---+---+ | | / template-0 / variable length | | +---+---+---+---+---+---+---+---+ ~ ~ ~ ~ +---+---+---+---+---+---+---+---+ : : / template-i / variable length : : +---+---+---+---+---+---+---+---+ ~ ~ ~ ~ +---+---+---+---+---+---+---+---+ : : / template-last / variable length : : +---+---+---+---+---+---+---+---+ 5. Conclusions The draft defines a scheme that is generically applicable to compress signalling protocol messages when the transport between compressor and decompressor is unreliable. With the key concept of cross- section CD data, the scheme achieves high compression efficiency even for the initial messages in a session. Good compression of initial messages, when they are large (which is especially true for protocols like SIP), yields proportionately higher overall compression ratio. Other concepts and mechanisms, including the checkpoint, profile- specific CD data, and sliding handshake provide additional robustness and efficiency. Delete mechanisms allow to accommodate limited memory devices, which makes it particularly suitable for mobile terminals. Liu, Le, Leung, and Clanton [Page 62] INTERNET-DRAFT SCRIBE 18 July 2001 6. Security Considerations Because encryption eliminates the redundancy that message compression schemes try to exploit, there is some inducement to enforce compression of signalling messages at a protocol layer above where ciphering takes place. This means that a message may be compressed before it is encrypted, and the compressed message may be decompressed after it is decrypted. A secured communication channel between a pair of CD entities, i.e. E-ms and E-net, may be needed to provide integrity protection and ciphering of SIP signalling. When the CD data are maintained across sessions, it may be necessary to download a copy of the CD data from a D-server to an E-net. The signalling between the E-net and D- server may also be integrity protected and ciphered. A malfunctioning or malicious message compressor could cause the message decompressor to reconstitute SCRIBE messages that do not match the original messages but still have valid IP, UDP, and RTP headers and possibly also valid UDP checksums. Such corruption may be detected with end-to-end authentication and integrity mechanisms which will not be affected by the compression. Moreover, this signalling message compression scheme uses an internal checksum for verification of a reconstructed signalling message. This reduces the probability of producing decompressed signalling messages not matching the original ones without this being noticed. 7. Intellectual Property Right Considerations Nokia Corporation and/or its affiliates hereby declare that they are in conformity with Section 10 of RFC 2026. Nokia's contributions may contain one or more patents or patent applications. To the extent Nokia's contribution is adopted to the specification, Nokia undertakes to license patents technically necessary to implement the specification on fair, reasonable, and nondiscriminatory terms based on reciprocity. Liu, Le, Leung, and Clanton [Page 63] INTERNET-DRAFT SCRIBE 18 July 2001 8. References [COMPRES] J. Storer and T. Szymanski, "Data Compression via Textual Substitution", Journal of the Association for Computing Machinery, Vol. 29, No. 4, October 1982, pp. 928-951. [FLOWS] "Signalling Flows for IP Multimedia Call Control Based on SIP and SDP (Release 5)", 3GPP TS 24.228 V0.4.0, Technical Specification Group Core Network, 3rd Generation Partnership Project, March 2001. [ORDER] L. Lamport, "Time, Clocks, and the Ordering of Events in a Distributed System", Communications of the ACM, Vol. 21, No. 7, July 1978, pp. 558-565. [REQUIRE] K. Le, Z. Liu, C. Clanton, and K. Leung, "Requirements for Compression of Signaling Protocol Messages", draft-le- sigcompr-requirements-00.txt, Internet Draft, Network Networking Group, Internet Engineering Task Force, 7 June 2001. [RFC2326] H. Schulzrinne, A. Rao, and R. Lanphier, "Real Time Streaming Protocol (RTSP)", RFC 2326, Network Working Group, Internet Engineering Task Force, April 1998. [RFC2327] M. Handley and V. Jacobson, "SDP: Session Description Protocol", RFC 2327, Network Working Group, Internet Engineering Task Force, March 1999. [RFC2616] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, Network Working Group, Internet Engineering Task Force, June 1999. [RFC3095] C. Bormann, C. Burmeister, M. Degermark, H. Fukushima, H. Hannu, L.-E. Jonsson, R. Hakenberg, T. Koren, K. Le, Z. Liu, A. Martensson, A. Miyazaki, K. Svanbro, T. Wiebke, T. Yoshimura, and H. Zheng, "Robust Header Compression (ROHC): Framework and Four Profiles: RTP, UDP, ESP, and Uncompressed", RFC 3095, Network Working Group, Internet Engineering Task Force, July 2001. [SIGNAL] H. Hannu, J. Christoffersson, and K. Svanbro, "Application Signaling Over Cellular Links", draft-hannu-rohc-signaling- cellular-01.txt, Internet Draft, Network Networking Group, Internet Engineering Task Force, 2 March 2001. Liu, Le, Leung, and Clanton [Page 64] INTERNET-DRAFT SCRIBE 18 July 2001 [SIP] M. Handley, H. Schulzrinne, E. Schooler, and J. Rosenberg, "SIP: Session Initiation Protocol", draft-ietf-sip- rfc2543bis-03.txt, Internet Draft, Internet Engineering Task Force, 29 May 2001. Liu, Le, Leung, and Clanton [Page 65] INTERNET-DRAFT SCRIBE 18 July 2001 9. Authors' Addresses Zhigang Liu Nokia Research Center 6000 Connection Drive Irving, TX 75039 USA Phone: +1 972 894-5935 Fax: +1 972 894-4589 E-mail: zhigang.liu@nokia.com Khiem Le Nokia Research Center 6000 Connection Drive Irving, TX 75039 USA Phone: +1 972 894-4882 Fax: +1 972 894-4589 E-mail: khiem.le@nokia.com Ka Cheong Leung Nokia Research Center 6000 Connection Drive Irving, TX 75039 USA Phone: +1 972 374-0630 Fax: +1 972 894-4589 E-mail: kacheong.leung@nokia.com Christopher Clanton Nokia Research Center 6000 Connection Drive Irving, TX 75039 USA Phone: +1 972 894-4886 Fax: +1 972 894-4589 E-mail: christopher.clanton@nokia.com Liu, Le, Leung, and Clanton [Page 66] INTERNET-DRAFT SCRIBE 18 July 2001 Appendix A. Performance Results A.1. Introduction In the text below, compression of messages exchanged between the UE and P-CSCF in 'MO#2 scenario' described in [COMPRES] is studied, followed by call termination (BYE). UE and P-CSCF correspond to the mobile station and SIP proxy respectively in the 3GPP architecture [FLOWS]. MO#2 is a scenario where a call comes into the mobile station. Two different compression mechanisms were considered: 1. Simple LZSS [COMPRES] message compression a) With (12-bit position, 4-bit match length = 16 bits total) b) With (11-bit position, 5-bit match length = 16 bits total) 2. Linux GZIP compression For LZSS, we experimented with the tradeoff between allocating more bits to represent either the position indicator or more bits to the match length, while keeping the total sum of bits to represent each constant. More bits for the position indicator means that a larger CD data store can be used (it is basically a pointer mapped into the CD data address space). More bits for the match length means that it will be possible to match longer strings in the CD data compared to if fewer bits were used. In each case, we looked at the message exchange associated with a mobile-originated call. Three different compression/decompression scenarios, A, B, and C, were tested Scenario A: Each message is compressed individually, meaning that the CD data is used only for the current message (it is emptied at the end of each message). Scenario B: The profile-specific CD data is assumed to be present; the CD data is populated with the contents of the SIP INVITE message minus those fields likely to change from one call to the next (Remote-Party-ID, To, and Call-ID fields); no dynamic updating of the CD data is assumed to take place. Liu, Le, Leung, and Clanton [Page 67] INTERNET-DRAFT SCRIBE 18 July 2001 Scenario C: The profile-specific CD data and static CD data are both assumed to be present; the profile-specific CD data has the same contents as specified for Scenario B, while the static CD data contains the SIP literals (i.e. header field names) in the SIP messages exchanged over the air interface; again, no dynamic updating of the CD data is assumed. A.2. Result Summary The results represent the performance obtainable when fairly simple enhancements to how the CD data is initialized are incorporated. It could be possible to improve performance further if additional techniques are applied. The numbers include the SIP compression/decompression header overhead defined in this document. However, they do not take into account the underlying transport headers (e.g. UDP/IP overhead). It is likely that the underlying transport headers can be compressed down to the order of a few bytes, and thus negligible compared to the compressed messages. Table 1 shows the average compression ratio in each cases. Table 1. The Average Compression Ratio in Different Scenarios. Scenario A, Scenario B, Scenario C, Compression Ratio Compression Ratio Compression Ratio (Average) (Average) (Average) ------------------------------------------------------------------------ LZSS, 12-bit position, 4-bit 1.31 4.19 4.51 match length LZSS, 11-bit position, 5-bit 1.33 4.87 5.27 match length Linux GZIP compression 1.47 6.23 6.37 The results suggest that as a whole, it is feasible to achieve a relatively high compression ratio with a pretty simple solution like merely using the static and profile-specific CD data, which are populated ahead of time. However, the ratio can be further boosted by the inclusion of the dynamic CD data. Liu, Le, Leung, and Clanton [Page 68] INTERNET-DRAFT SCRIBE 18 July 2001 As evidenced by the results, for the basic LZSS compression, it is possible to improve performance by allocating more bits to the matched length instead of increasing the size of the CD data store. This is likely a consequence of the fact that there is a tendency for longer strings to be repeated in SIP messages. A.3. Detailed Results Following is more detailed data (message-by-message) when mechanism 1b) is used. First the compression software configuration: 1) Compression software: lzss.c 2) Size of ring buffer (i.e. the sliding CD data store) = 2048 bytes 3) Lower limit for match length = 2, i.e. strings with length of 2 or 1 byte will not be encoded to (position, length) pair. 4) Upper limit for match length = 34. 5) Each pair (position, length) is 2-byte long, 11 bits for position (2^11 = 2048) and 5 bits for length (note: due to the lower limit of match length, the range [0, 31] maps to [3, 34]). 6) Need one extra bit to indicate if it is a (position, length) pair or uncompressed byte. Thus, the cost for each element is either 9 bits (uncompressed) or 17 bits (compressed). Scenario A: Messages are compressed individually. The CD data is not maintained across messages. Liu, Le, Leung, and Clanton [Page 69] INTERNET-DRAFT SCRIBE 18 July 2001 No. Content Uncompressed Compressed Ratio Size (Bytes) Size (Bytes) ----------------------------------------------------- 01 INVITE 786 597 1.32 02 100 TRYING 324 243 1.33 11 183 SDP 786 633 1.24 12 PRACK 658 493 1.33 17 200 OK 353 267 1.32 19 COMET 634 470 1.35 24 200 OK 353 267 1.32 27 180 RINGING 322 236 1.36 28 PRACK 402 289 1.39 33 200 OK 353 267 1.32 37 200 OK 353 267 1.32 38 ACK 376 263 1.43 51 BYE 354 262 1.35 52 200 OK 317 235 1.35 ------------------------------------------------------ Average 6371 4789 1.33 Scenario B: The size of the profile-specific CD data is 640 bytes. No. Content Uncompressed Compressed Ratio Size (Bytes) Size (Bytes) ----------------------------------------------------- 01 INVITE 786 85 9.25 02 100 TRYING 324 57 5.68 11 183 SDP 786 213 3.69 12 PRACK 658 116 5.67 17 200 OK 353 88 4.01 19 COMET 634 116 5.47 24 200 OK 353 88 4.01 27 180 RINGING 322 69 4.67 28 PRACK 402 96 4.19 33 200 OK 353 89 3.97 37 200 OK 353 89 3.97 38 ACK 376 80 4.70 51 BYE 354 58 6.10 52 200 OK 317 65 4.88 ------------------------------------------------------ Average 6371 1309 4.87 Scenario C: The size of the static CD data is 243 bytes. The profile-specific CD data goes before the static one in the combined CD data and is defined as the previous scenario. The total size of Liu, Le, Leung, and Clanton [Page 70] INTERNET-DRAFT SCRIBE 18 July 2001 the CD data is 883 bytes. No. Content Uncompressed Compressed Ratio Size (Bytes) Size (Bytes) ----------------------------------------------------- 01 INVITE 786 87 9.03 02 100 TRYING 324 51 6.35 11 183 SDP 786 188 4.18 12 PRACK 658 109 6.04 17 200 OK 353 80 4.41 19 COMET 634 112 5.66 24 200 OK 353 80 4.41 27 180 RINGING 322 57 5.65 28 PRACK 402 89 4.52 33 200 OK 353 81 4.36 37 200 OK 353 81 4.36 38 ACK 376 78 4.83 51 BYE 354 57 6.21 52 200 OK 317 59 5.37 ------------------------------------------------------ Average 6371 1209 5.27 Appendix B. Examples on CD Data Management Signalling B.1. Consistent Dynamic CD Data Maintenance In split mode, both CD entities, i.e. CD Entities X and Y, can simply use the symmetric sliding handshake mechanism, i.e. a three-way handshake algorithm, without the need to maintain a total order relationship of all CD data update requests to synchronize CD data updates, since the partial order relationship among CD data updates initiated by the same CD entity is sufficient to guarantee consistent CD data updates to its own part of the CD data. Such partial order relationship can be inferred directly from the sequence numbers of SCRIBE messages containing such update requests. Thus, race condition on CD data updates will not happen in performing a CD data update at both CD entities. For a CD data update in shared mode, a CD entity cannot simply rely on the state of a CD data update request based on the sliding handshake algorithm, since a sliding handshake algorithm simply Liu, Le, Leung, and Clanton [Page 71] INTERNET-DRAFT SCRIBE 18 July 2001 convey a signal from one CD entity to another CD entity that an action, say a CD data update, is to be performed. It may come up with inconsistent CD data updates at both CD entities, whenever updates are concurrently issued by both CD entities. An example that illustrates inconsistent CD data updates is shown in Figure 12. When the CD data is operated in the shared CD data mode and the sliding handshake is used for CD data updates, all eligible messages are used to update the dynamic part of the CD data. Suppose all messages exchanged between CD Entity X and CD Entity Y are employed for subsequent compression/decompression, and all received messages are acknowledged and piggybacked in all subsequent messages sent by the receiver. Messages 1, 2, and 3 are sent from CD Entity X concurrently with Messages 101, 102, and 103 respectively from CD Entity Y. CD Entity X will use Message 1 from the temporary sender storage to update the dynamic part of the CD data store when it receives Message 102, which contains an acknowledgement from CD Entity Y for the reception of Message 1. Similarly, CD Entity Y will use Message 101 from the temporary sender storage to update the dynamic part of the CD data when it receives Message 2, which contains an acknowledgement from CD Entity X for the reception of Message 101. However, CD Entity X and CD Entity Y have not yet used Messages 101 and 1 respectively to update the dynamic part of the CD data. Since new items are introduced to the dynamic part of the CD data (possibly with a deletion of some existing items), a consistent order for updating the CD data at both CD entities is required. In addition, such inconsistent view of the CD data can result in a failure in decompressing a compressed message. Thus, this example demonstrates that an inconsistent CD data update has occurred because the sliding handshake itself fails to provide concurrency control for CD data updates. The problem can be eliminated by requiring the maintenance of a total order relationship for a sequence of all known CD data update requests at each CD entity. Liu, Le, Leung, and Clanton [Page 72] INTERNET-DRAFT SCRIBE 18 July 2001 | | DD = {} | | DD = {} TSS = {1} | | TSS = {101} TRS = {} | | TRS = {} | [1] [101] | |--------------\ /--------------| | \/ | | /\ | |<-------------/ \------------->| | | DD = {} | | DD = {} TSS = {1, 2} | | TSS = {101, 102} TRS = {101} | | TRS = {1} | [2] [102] | |--------------\ /--------------| | \/ | | /\ | |<-------------/ \------------->| | | DD = {1} | | DD = {101} TSS = {2, 3} | | TSS = {102, 103} TRS = {101, 102} | | TRS = {1, 2} | [3] [103] | |--------------\ /--------------| | \/ | | /\ | |<-------------/ \------------->| | | DD = {1, 101, 2} | | DD = {101, 1, 102} TSS = {3} | | TSS = {103} TRS = {102, 103} | | TRS = {2, 3} | | CD Entity X CD Entity Y Figure 12. An Example of an Inconsistent CD Data Updates Using Symmetric Sliding Handshake without Total Order Relationship. An example showing how the scheme with the total order relationship solves the inconsistent CD data update problem exhibited in Figure 12 is illustrated in Figure 13. Suppose all messages exchanged between CD Entity X (a master CD entity) and CD Entity Y (a slave CD entity) are employed for subsequent compression/decompression, and all received messages are acknowledged and piggybacked in all subsequent messages sent by the receiver. Both CD entities maintain their own total order relationships locally according to the specification of Liu, Le, Leung, and Clanton [Page 73] INTERNET-DRAFT SCRIBE 18 July 2001 ordering constraints, and they update the dynamic part of the CD data based on the CD data update mechanism mentioned in Section 3.2.3. Whenever concurrent dictionary update are happened, update requests initiated from the master CD entity are ordered first. For example, the update requests for Messages 1, 2, and 3 are ordered before those for Messages 101, 102, and 103 respectively in the list of the total order relationship for a sequence of dictionary updates (TOL). Since the update request for Message 1 is ordered before that for Message 101 according to the total order relationship, the update request for Message 101 at CD Entity Y is blocked until that for Message 1 becomes ready to be moved, as indicated by the reception of the UPDATE_ACK_ACK piggybacked with Message 3. Liu, Le, Leung, and Clanton [Page 74] INTERNET-DRAFT SCRIBE 18 July 2001 | | DD = {} | | DD = {} TSS = {1} | | TSS = {101} TRS = {} | | TRS = {} TOL = {1} | | TOL = {} | [1] [101] | |--------------\ /--------------| | \/ | | /\ | |<-------------/ \------------->| | | DD = {} | | DD = {} TSS = {1, 2} | | TSS = {101, 102} TRS = {101} | | TRS = {1} TOL = {1, 101, 2} | | TRS = {1} | [2] [102] | |--------------\ /--------------| | \/ | | /\ | |<-------------/ \------------->| | | DD = {1} | | DD = {} TSS = {2, 3} | | TSS = {101 - 103} TRS = {101, 102} | | TRS = {1, 2} TOL = {101, 2, | | TOL = {1, 101, 2} 102, 3} | | | [3] [103] | |--------------\ /--------------| | \/ | | /\ | |<-------------/ \------------->| | | DD = {1, 101, 2} | | DD = {1, 101} TSS = {3} | | TSS = {102, 103} TRS = {102, 103} | | TRS = {2, 3} TOL = {102,3,103} | | TOL = {2, 102, 3} | | CD Entity X CD Entity Y Figure 13. An Example of Proposed Scheme with Total Order Relationship for CD Data Updates. Liu, Le, Leung, and Clanton [Page 75] INTERNET-DRAFT SCRIBE 18 July 2001 B.2. Dynamic CD Maintenance Under Unreliable Path Conditions Suppose all messages exchanged between CD Entity X (a master CD entity) and CD Entity Y (a slave CD entity) are employed for subsequent compression/decompression, and all received messages are acknowledged and piggybacked in all subsequent messages sent by the receiver. Both CD entities maintain their own total order relationships locally according to the specification of ordering constraints, and they update the dynamic part of the CD data based on the dictionary update mechanism described in the draft. Whenever concurrent CD data update conditions are detected, update requests initiated from the master entity are ordered first. An example showing how the CD data update scheme works with the checkpoint and rollback procedures is illustrated in Figure 14. A checkpoint initiation CHKPT_INIT is piggybacked with Message 1, which is then sent to CD Entity Y. The checkpoint is CCD in this example. When CD Entity Y receives Message 1, it copies the current content of DD, i.e. CCD, as the checkpoint CD data. Once finished, CD Entity Y sends a checkpoint acknowledgement CHKPT_ACK, which is piggybacked with Message 102, to CD Entity X. CD Entity Y initiates a rollback some time later by sending a rollback initiation ROLL_INIT, which is piggybacked with Message 103, to CD Entity X. Once CD Entity X receives the request, it updates its DD as CCD and replies with a rollback acknowledgement ROLL_ACK. The ROLL_ACK is piggybacked with Message 3 to inform CD Entity Y that the request is accepted. Liu, Le, Leung, and Clanton [Page 76] INTERNET-DRAFT SCRIBE 18 July 2001 | | DD = {CCD} | | DD = {CCD} TSS = {1} | | TSS = {101} TRS = {} | | TRS = {} TOL = {1} | | TOL = {} | [1 CHKPT_INIT] [101] | |--------------\ /--------------| | \/ | | /\ | |<-------------/ \------------->| | | DD = {CCD} | | DD = {CCD} TSS = {1, 2} | | TSS = {101, 102} TRS = {101} | | TRS = {1} TOL = {1, 101, 2} | | TOL = {1} | [2] [102 CHKPT_ACK] | |--------------\ /--------------| | \/ | | /\ | |<-------------/ \------------->| | | DD = {CCD, 1} | | DD = {} TSS = {2} | | TSS = {103} TRS = {101, 102} | | TRS = {} TOL = {101,2,102} | | TOL = {} | [103 ROLL_INIT] | |<-------------------------------| | | DD = {CCD} | | DD = {} TSS = {3} | | TSS = {103, 104} TRS = {103} | | TRS = {} TOL = {103, 3} | | TOL = {} | [3 ROLL_ACK] [104] | |--------------\ /--------------| | \/ | | /\ | |<-------------/ \------------->| | | DD = {CCD} | | DD = {CCD, 103} TSS = {3} | | TSS = {104, 105} TRS = {103, 104} | | TRS = {3} TOL = {103,3,104} | | TOL = {3} | [105] | |<-------------------------------| | | DD = {CCD,103,3} | | TSS = {} | | TRS = {104, 105} | | Liu, Le, Leung, and Clanton [Page 77] INTERNET-DRAFT SCRIBE 18 July 2001 TOL = {104, 105} | | | | CD Entity X CD Entity Y Figure 14. An Example of CD Data Updates With a Checkpoint and a Rollback. An example showing how the CD data update scheme works with message misordering is illustrated in Figure 15. Suppose Message 2 is delayed in transit such that it arrives at CD Entity Y after Message 3 has arrived. We can see that the scheme as when Message 3 arrives CD Entity Y, CD Entity Y knows that Message 2 is missing as inferred from the gap in the set of sequence numbers of all arrived messages. It then starts the S-timer till either the timer expires or Message 2 arrives. In this example, the S-timer stops as Message 2 eventually arrives at CD Entity Y. The total order relationship can be maintained and thus the CD data update mechanism works. Liu, Le, Leung, and Clanton [Page 78] INTERNET-DRAFT SCRIBE 18 July 2001 | | DD = {CCD} | | DD = {CCD} TSS = {1} | | TSS = {} TRS = {} | | TRS = {} TOL = {1} | | TOL = {} | [1] | |------------------------------->| | | DD = {CCD} | | DD = {CCD} TSS = {1, 2} | | TSS = {101} TRS = {} | | TRS = {1} TOL = {1, 2} | | TOL = {1} | [2 CHKPT_INIT] [101] | |--------------\ /--------------| | \/ | | /\ | |<-------------/ \ | | | | DD = {CCD, 1} | | | DD = {CCD} TSS = {2, 3} | | | TSS = {101, 102} TRS = {101} | | | TRS = {1} TOL = {2, 101, 3} | | | TOL = {1} | [3] | [102] | |--------------\ /)-------------| | \/ | | | /\ | | |<-------------/ \)------------>| | | | DD = {CCD, 1} | | | DD = {CCD, 1} TSS = {2, 3} | | | TSS = {101, 102} TRS = {101, 102} | | | TRS = {3} TOL = {2, 101, 3, | | | TOL = {(2), 101, 3} 102} | | | | \------------>| | | | | DD = {CCD, 1} | | TSS = {101, 102} | | TRS = {2, 3} | | TOL = {2, 101, 3} | [103 CHKPT_ACK] | |<-------------------------------| | | DD = {CCD, 1, 2, | | 101, 3} | | TSS = {} | | TRS = {102, 103} | | TOL = {102, 103} | | | | Liu, Le, Leung, and Clanton [Page 79] INTERNET-DRAFT SCRIBE 18 July 2001 CD Entity X CD Entity Y Figure 15. An Example of CD Data Updates with Packet Misordering. An example showing how the CD data update scheme works with message losses is illustrated in Figure 16. In this example, Messages 1 and 101 are lost in transit. The correctness of the proposed scheme is unaffected as acknowledgements for the reception of user and control messages (such as CHKPT_ACK and ROLL_ACK) are piggybacked to each subsequent outgoing messages by the sender until the sender knows that the receiver has already received properly, that can be inferred from the acknowledgement of the reception of these outgoing messages. Liu, Le, Leung, and Clanton [Page 80] INTERNET-DRAFT SCRIBE 18 July 2001 | | DD = {CCD} | | DD = {CCD} TSS = {1} | | TSS = {} TRS = {} | | TRS = {} TOL = {1} | | TOL = {} | [1] | |------->X | | | DD = {CCD} | | TSS = {1, 2} | | TRS = {} | | TOL = {1, 2} | | | [2 CHKPT_INIT] | |------------------------------->| | | | | DD = {CCD} | | TSS = {101} | | TRS = {2} DD = {CCD} | | TOL = {2} TSS = {1, 2, 3} | [101 CHKPT_ACK] | TRS = {} | X<----------------| TOL = {1, 2, 3} | | | [3] | |------------------------------->| | | | | DD = {CCD} | | TSS = {101, 102} | | TRS = {2, 3} | | TOL = {2, 3} | [102 CHKPT_ACK] | |<-------------------------------| | | DD = {CCD, 2, 3} | | TSS = {} | | TRS = {102} | | TOL = {(101),102} | | | | CD Entity X CD Entity Y Figure 16. An Example of CD Data Updates with Packet Losses. Liu, Le, Leung, and Clanton [Page 81] INTERNET-DRAFT SCRIBE 18 July 2001 Appendix C. Application to SIP Signalling for Cellular Radio Application to Cellular Radio Systems has been mentioned as a potential application of SCRIBE. In particular, we describe here a realistic application of SCRIBE to the case of compressing SIP call control messages in cellular radio environments. SIP has been selected as the call control for 3GPP (3rd Generation Partnership Program), a cellular standardization body defining 'All-IP' cellular system operation when the radio link is based on WCDMA or GSM/EDGE. It is being considered for use in other radio systems, such as CDMA2000, as well. C.1. Why SIP Compression for Cellular? Transport of the SIP messages has been identified as a significant issue for 3GPP cellular systems because the size of SIP messages involved in typical call control sequences is much larger than the message size of analogous signalling used in classic cellular. The increased message size means that: - Call control procedures when using SIP will take much more time to be completed compared to using existing cellular-specific signalling; this means that the end user will experience a delay in call establishment which will be unexpected (and likely unacceptable) - Intra-call signalling will in some way adversely affect voice quality and system performance. SIP message transmission might be done by 1) "stealing" frames from the existing radio bearer (e.g., blank and burst) which may be carrying voice -- the result is lower voice quality when compared with existing cellular radio; 2) delaying transmission of voice that might otherwise be transmitted immediately on the radio bearer -- this is most likely intolerable, in particular for real time flows, as they normally have strict delay requirements; 3) using additional radio resources -- this can negatively impact the overall system performance, e.g. reduce capacity and/or increase the interference levels experienced within the cellular system. In general, cellular operators want to achieve comparable spectral efficiency for VoIP as is obtained when existing cellular (based on circuit switched technology) is used. Any inefficiencies equate to lost revenue for the operator, and probably a less attractive service (either in Liu, Le, Leung, and Clanton [Page 82] INTERNET-DRAFT SCRIBE 18 July 2001 terms of cost or quality) for the end user. Organizations such as 3GPP TSG GERAN (which is defining specifications for the evolution of GSM/EDGE Radio access networks) have already identified possible options for SIP signalling transport. But it is widely known that these options will exhibit poor performance unless the SIP signalling is not first compressed before being sent over the radio link. C.2. Cellular Environment Assumptions As explained in the main body of the draft, SCRIBE is a framework for message compression -- one can view it as a "toolbox" of mechanisms, some of which are more appropriate to be used in certain circumstances. In particular for cellular, there are certain system/device characteristics that imply use of some SCRIBE mechanisms over others. In this section, we describe some of these characteristics. In the below, we assume that compressor/decompressor functionality is located on both "sides" of the radio link. I.e. both the E-ms (for Mobile Station CD Entity) and the E-net (for Network Side CD Entity) have such functionality. During the course of one session, we assume that E-net does not change -- it means that e.g. cellular handover doesn't require a change of the compressor/decompressor used on the network side. However, E-net could change (e.g. from "E-net i" to "E-net j") if the E-ms "roams" to another location such that it no longer has direct access to E-net i, but does have access to E-net j. We also assume that: - SIP registration occurs when the E-ms roams to a different E-net. - The E-ms is associated with exactly one E-net at any given time. - Each E-net has access to a D-Server, which contains at least the S-profile associated with the E-ms. This kind of information kept in the D-server varies (as described in the main document), and depends on the mode of operation, information flow between E-net and D-server may be download only, or upload and download. The architecture assumed is illustrated below: Liu, Le, Leung, and Clanton [Page 83] INTERNET-DRAFT SCRIBE 18 July 2001 |=======| | |========| | | Radio Interface | | | | E-ms | <===========================> | E-net |\ | | | | 1 | \ |=======| | |========| \ | . \ | . \ | . \ | . \ | . \ | . |-------------| | . | D-Server | | . |-------------| | . / / | |========| / / | | |/ / | | E-net | / | | n-1 | / | |========| / | / | |========|/ | | | | | E-net | | | n | | |========| |================================ C.3. Cellular Environment Limitations The mobile station used in cellular may have limitations which will impact our choice of mechanisms from the SCRIBE framework. Some of these are described below: - Memory Limitations of E-ms: It is possible that the mobile will have limited memory resources to use for e.g. storing CD data. - CPU Limitations of E-ms: It is possible that the mobile will have limited processing capability. - Bandwidth Limitations of the transport between compressor and decompressor in E-ms and E-net: It is possible that there could be a low bandwidth radio link between E-net and E-ms. - It may be undesirable or otherwise not possible to use the radio link for simultaneous heterogeneous traffic (e.g. simultaneous transmission of user media and SIP signalling). The assumption Liu, Le, Leung, and Clanton [Page 84] INTERNET-DRAFT SCRIBE 18 July 2001 comes from a desire to minimize mobile station complexity and/or minimize mobile station radio resource utilization (bandwidth in cellular systems is scarce). Some "high-end" mobile stations may have only a subset (or none) of these limitations. C.4. Possible Application of SCRIBE for Cellular The below represents simple example applications of SCRIBE to SIP compression for cellular. There are many variations/profiles possible. Below, we first describe some generic behaviour likely applicable to all cellular terminals. Then, two scenarios are evaluated: 1) Simple Terminal scenario; and 2) High-Performance Terminal scenario. An E-ms is first populated with the static part of the CD data as per some specification. E-net devices are populated with identical information. Alternatively, the static part can be downloaded from the D-server, see below. The data can be stored in permanent or semi-permanent memory, and can be loaded long before the E-ms/net are purchased and deployed. When the E-ms is purchased and put into service the profile-specific part of the CD data is loaded into semi-permanent memory in the E-ms and in the D-server, making them directly accessible by all E-net devices. This procedure can be repeated many times if desired, but it likely occurs on an infrequent basis. E.g., there may be a need to change the information if E-ms capabilities are changed during its lifetime, if some part of the identity data ("phone number") changes, or if the user service profile is modified. SCENARIO A: Simple Terminal In this case, it is recommended for simplicity that only the static and profile-specific parts of the CD data are used to compress/decompress. In other words, the CD data only consists of the static and profile-specific parts, and no dynamic or temporary parts are used. No CD data update is needed (value of CD Data Management Signalling attribute in S-profile is 1 or 2). This selection is driven by the memory/CPU/bandwidth-handling limitations that are probably associated with a simple terminal device. Liu, Le, Leung, and Clanton [Page 85] INTERNET-DRAFT SCRIBE 18 July 2001 When the E-ms "powers on", we assume that SIP registration takes place. At that time, the E-net will download the E-ms S-profile from the D-server. The E-ms and E-net should confirm that they have the same S-profile using state verification mechanisms described in the body of the draft. CD Algorithm attribute in the S-profile has value 3. At the end of the session, there is no uploading of information to the D-server. SCENARIO B: High-Performance Terminal In this case, it is recommended that the dynamic and temporary parts of the CD data are also used to boost the compression performance. In other words, the CD data includes a dynamic part, and a temporary part are used for compression/decompression too. This means the incorporation of the sliding handshake, checkpoint, rollback, and other CD data management signalling concepts described in Section 2.5. CD data is regularly updated, resulting in a more effective compression than the simple terminal case. The value of the CD Data Management Signalling attribute in the S-profile is 4 or 5. When the E-ms "powers on", we assume that SIP registration takes place. At that time, the E-net will download the E-ms S-profile, along with a dynamic part, from the D-server. The E-ms and E-net should confirm that they have the same data using the state verification mechanisms. CD Algorithm attribute in the S-profile has value 3. The CD data update algorithm attribute is determined by the S-profile information downloaded from the D-server. At the end of a session, a checkpoint is established, and the checkpoint CD data is uploaded by the E-net to the D-server. It is worth noting again that these two cases represent relatively extreme scenarios -- there can be many variations of terminals that fall in between these two cases. The SCRIBE framework is flexible enough to optimize whichever aspect is most critical (memory, processing, performance). Liu, Le, Leung, and Clanton [Page 86]