ROHC                                                  Ishita Majumdar
                                                    Valerie Kenneally
                                                           Dirk Pesch
                                                        Motorola Inc.
Internet Draft                           Cork Institute of Technology
Document: draft-ziyad-rohc-tccb-01.txt                   October 2002
Category: Draft


     Text-based Compression Using Cache and Blank Approach (TCCB)


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   This document defines an efficient, robust, and scalable scheme for
   the compression of text-based protocols. Such protocols (e.g. SIP,
   SDP), when sent uncompressed over limited bandwidth networks such as
   cellular, dial-up Internet or the upstream of Hybrid Fiber Coax
   (HFC), cause inadvertent delays in call set up. TCCB addresses these
   problems and proposes a mechanism to reduce the size of the messages
   and hence the resulting delay.

   1. Introduction
   2. Basic Framework of TCCB
      2.1 Assumptions
      2.2 TCCB Functions
      2.3 Algorithm
   3. Examples
   4. Requirements
   5. Evaluation
   6. Conclusion
   7. Intellectual Property Rights Consideration
   8. References
   9. Author's Addresses

1.0 Introduction

   Telephony service today is provided for the most part over circuit
   switched networks. The new trend that is fast emerging is to provide
   telephony service over IP networks, known as IP telephony.
   The motivating factors for carrying voice traffic over data networks
   are the integration of voice and data applications, which can result
   in more effective business processes, cost savings for voice calls,
   and the enabling of many new services for businesses and customers.
   By moving the intelligence from the network to the end stations, IP
   telephony offers a flexibility that enables many new services which
   did not exist before.

   In order to merge Internet and cellular telephony, two aspects have
   to be addressed: the end-to-end call set-up delay and the voice
   quality. Protocols such as SIP[1] and SDP[2] will typically be used
   to set up and tear down sessions. However, adopting ASCII-based
   protocols in access networks of limited bandwidth incurs a large
   call set-up delay. Large text messages passed over the air interface
   also make very inefficient use of the transmission medium. In
   addition, some legacy-based enhanced TDM cellular transceivers such
   as GERAN (GSM EDGE Radio Access Network) will need to "steal" audio
   bandwidth for the transmission of inter-call SIP signalling
   messages, resulting in long audio mutes.

   Compression of such messages is therefore required in order to
   increase spectrum efficiency, reduce transmission delay and provide
   a level of quality of service comparable with circuit switched
   systems.

   Section 2 provides a high-level overview of the techniques. Section
   3 provides a more detailed explanation of the techniques introduced
   in section 2. Section 4 explains TCCB's adherence to the draft
   requirements for signalling compression. Section 5 contains a table
   of message sizes from the examples in section 3, followed by a
   conclusion statement in section 6.

2.0 Basic Framework of TCCB

   TCCB is designed to be extensible. It can work over various access
   technologies and the principle applies to all text-based protocols.
   The added advantage of this method is that only the User Agents and
   the Peer Core Network Entity (e.g.
   Proxy CSCF) need be involved in the storage and retrieval of
   information to compress or decompress the text-based messages.

   The purpose of the TCCB layer is to remove all redundant header
   information and redundant payload information wherever applicable.
   It indexes and caches this information in local memory for future
   decompression purposes. TCCB is contained within a new 'shim' layer
   that sits between the text-based application protocol layer (e.g.
   SIP, SDP) and the lower transport layers (e.g. UDP, TCP). This
   additional layer would be added to the stacks of both entities
   wishing to communicate using compression (e.g. in the UE and the
   P-CSCF in the case of the 3GPP Rel5 IP Multimedia architecture).

2.1 Assumptions

   When applied to SIP messages, the algorithm at a compressor endpoint
   (e.g. Proxy CSCF) takes as input a SIP message to be compressed and
   pre-cached messages (except for first message compression after
   power up), and produces as output a compressed SIP message. The
   compressor further caches the original message in the caching table,
   indexed by a unique identifier.

   The decompressor (e.g. the UE) takes a compressed message and
   pre-cached messages as input and produces as output a decompressed
   message based on the same unique identifier. The original and
   decompressed messages are identical.

2.2 TCCB Functions

   The purpose of the TCCB layer is to remove all redundant header
   information. Redundant header information is identified as header
   contents which the sending node "knows" are already stored in the
   cache of the receiving node. An example in the SIP case is where a
   header field's contents have been encountered in a message sent
   previously.
   +---------------+                      +---------------+
   |  APPLICATION  |                      |  APPLICATION  |
   |     LAYER     |                      |     LAYER     |
   |  (Text based  |                      |  (Text based  |
   |   Protocol)   |                      |   Protocol)   |
   +---------------+                      +---------------+
   |     TCCB      |                      |     TCCB      |
   |               |                      |               |
   +---------------+                      +---------------+
   |    UDP/TCP    |                      |    UDP/TCP    |
   +---------------+                      +---------------+
   |      IP       |                      |      IP       |
   +---------------+                      +---------------+
   |   Physical    |     +----------+     |   Physical    |
   |               |<- ->| Physical |<- ->|               |
   +---------------+     | Channel  |     +---------------+
                         +----------+

                  Figure 1-TCCB Architecture

   If the TCCB layer sees that a header field is the same as previously
   sent or received for a particular sequence, it simply blanks or
   removes the header contents. If the TCCB layer receives a message
   (or method in SIP) with a "blank" header field, it reconstructs the
   header from its cache or pre-loaded dictionaries as appropriate for
   that particular sequence.

2.3 Algorithm

   The following is the basic algorithm of TCCB.

   For ALL messages,

   (a) determine message characteristics
   (b) extract index information from the TO and FROM fields within
       the message.
   (c) utilise the preloaded static header, method and default content
       dictionaries

   Messages for COMPRESSION

      BEGIN
         Perform compression on each message element within message
         Using the index information from (b), extract element from
         cache
         IF message element matches element stored in cache THEN
            Blank message element contents in message
         ELSE
            Using appropriate index information, update cache with
            the latest element information
      END
      Forward compressed message to transport layer (e.g. UDP/TCP)
      for processing

   Messages for DECOMPRESSION

      BEGIN
         Perform decompression on each message element within message
         IF message element has been blanked THEN
            Using index information from (b), extract element from
            cache to reconstruct the message element
         ELSE
            Using appropriate index information, ensure element in
            cache is up-to-date with the received message element
      END
      Forward reconstructed message to application layer

   Notes:

   1) All TCCB layers contain compressor and decompressor
      functionality.

   2) The algorithm requires that TCCB be aware of the text-based
      protocol that it is compressing. For example, when applied to SIP
      and SDP, all of the preloaded dictionaries are specific to these
      two text-based protocols.

   3) Multiple Indexing: The algorithm uses the From and To fields of a
      message as indices for the storage and retrieval of messages in
      the cache. The FROM index is the primary index.

   4) It is NOT necessary to send mandatory message element names. When
      the contents of a particular message line are to be blanked and
      the header of that line is mandatory, the entire line is blanked
      or compressed. The receiving TCCB (decompressor) searches for
      missing mandatory message headers and reconstructs them based on
      information from its cache or the pre-loaded dictionaries.

   5) Error Checking and Error Handling: The current algorithm as
      described above does not address this aspect. This is an area for
      further study.
   6) Deletion Mechanism: In order to avoid memory leaks at the TCCB in
      either the Client or the Server, a mechanism is required to free
      up memory. A certain amount of memory will be updated by
      overwriting existing contents. However, certain information, such
      as that specific to a particular call-leg, may need to be deleted
      when TCCB recognises messages which indicate the tear down of
      that particular leg. Timer-based clean-up may also be required to
      ensure that hashing delays are kept to a minimum. Research and
      development is currently ongoing in this area.

2.3.1 Dictionaries

   The current TCCB algorithm uses a number of pre-loaded dictionaries.
   These pre-loaded dictionaries are part of the TCCB layer. At
   present, the TCCB compression algorithm incorporates five
   dictionaries. These are described under the following headings:

      a) Method/Static SIP Header
      b) Mandatory SIP/SDP Header
      c) Default Content

   a) Method/Static SIP Header

   The method/static SIP header dictionary contains an 8-bit
   representation of the complete list of SIP methods and header names
   as defined in RFC 3261. E.g.

      ACK    0x000158
      BYE    0x000258

   Thus, when sending the compressed message, "ACK" would be replaced
   with 0x000158, etc.

   b) Mandatory SIP/SDP Header

   This dictionary contains a complete list of the mandatory headers
   specified in RFC 3261[3]. The TCCB compressor compresses a message
   by removing from it the mandatory SIP/SDP headers whose contents
   have already been compressed. At the receiver side, the decompressor
   examines the compressed message and determines which headers were
   blanked by comparing it with the mandatory header dictionary. Any
   mandatory header that is not present, i.e. that was compressed
   completely, is inserted into the decompressed message and the
   appropriate contents are decompressed from the cache.

   c) Default Content Dictionary

   Examination of SIP/SDP messages has identified message contents
   which are repeated with high frequency. These message elements are
   listed in the default content dictionary.
   The compressor, by using the default content dictionary, can
   eliminate these contents even in the first message of a session,
   thereby reducing the size of even the very first message. An
   asterisk '*' is inserted in place of any contents that are
   compressed using the default content dictionary. This enables the
   receiving decompressor to recognise when message contents should be
   reconstructed using the default content dictionary.

2.3.2 Indexing and Cache

   As stated previously, indices are extracted from the message content
   for both compressed and uncompressed messages. Two indices are
   established per message:

      1: From Index
      2: To Index

   Both indices are determined using the To and From lines of a
   message. In both cases the SIP address contained in the To or From
   header line is extracted and forms the respective To or From index
   for that message. The From Index is always the primary index.

   The User Equipment will have just one cache storing all its data for
   compression/decompression, whereas the Proxy CSCF must hold a cache
   for each UE it is communicating with. Each cache is divided into
   tables, with each table corresponding to a message header and
   storing that header's data field. The FROM index is vital to the
   decompressor in order to access the correct cache of the
   communicating UE upon receiving a compressed message. When caching
   data for the first time, the FROM index is always used.

   When compressing a message, the cache is first examined to determine
   whether a cache table for the given header has been created
   previously. If such a header's cache table is present, then the
   number of indices in the cache table is determined.

   When a UE sends a message, its FROM index will always remain the
   same but the TO index is dependent on the recipient of the message.
   Hence, within a header cache table, the FROM index plus numerous TO
   indices may be present, indexing different data as it is encountered
   in messages with different destination addresses.
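   As an illustration only, the cache layout described above (one cache
   per UE at the proxy, one table per header, entries keyed by the FROM
   or a TO index) might be sketched in Python as follows. The names
   extract_index and ProxyCaches are hypothetical, the addresses in
   angle brackets are example values, and plain SIP addresses are used
   as indices for readability; none of this is prescribed by TCCB.

```python
import re
from collections import defaultdict


def extract_index(header_value):
    """Return the SIP address of a To/From header value for use as an index.

    Assumes the usual SIP form 'display-name <address>'; if no angle
    brackets are present, the whole stripped value is used instead.
    """
    match = re.search(r"<([^>]+)>", header_value)
    return match.group(1) if match else header_value.strip()


class ProxyCaches:
    """Hypothetical P-CSCF store: one cache per UE, one table per header."""

    def __init__(self):
        # FROM index (identifies the UE) -> header name -> {index: contents}
        self._caches = defaultdict(lambda: defaultdict(dict))

    def table(self, from_index, header):
        """Return the (possibly empty) table for one header of one UE."""
        return self._caches[from_index][header]


caches = ProxyCaches()
ue = extract_index('"Alien Blaster" <sip:user1_public1@home1.net>')
callee = extract_index("<sip:+1-212-555-2222@home1.net;user=phone>")

# Data cached for the first time is stored under the FROM index.
caches.table(ue, "Call-ID")[ue] = "-3693193161352840821@157.190.66.137"
# Data that differs for another destination is stored under that TO index.
caches.table(ue, "Call-ID")[callee] = "cb03a0s09a2sdfglkj490333"
```

   A UE would need only a single instance of the inner structure, since
   it holds one cache for all of its own traffic.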
2.3.2.1 Single Index in Cache

   In this instance the FROM index alone exists in the given header's
   table in the cache. A single check between the message and the FROM
   indexed cache data is made. If a match is found then the data in the
   message is blanked. If a match is not found then the cache is
   updated depending on whether the header of the message line in
   question is a multiple or non-multiple header within that message.

2.3.2.2 Multiple Indices in Cache

   1. Non-matching TO indices and Non-matching FROM Indexed Data

   If more than the FROM index is present in a header's cache table,
   then a search is first made to find whether any of the TO indices in
   the cache table match the TO index of the message. If matching TO
   indices are not found then the FROM indexed contents are checked. If
   there is no match, the cache is updated with the new data from the
   message using the TO index of the message.

   2. Matching TO indices

   If matching TO indices are found then a comparison is made between
   the contents of the matching TO index and the message contents. If
   they match then the message contents are blanked. If the contents do
   not match then the FROM indexed contents are checked. Again, if
   there is a match the message contents are blanked. If there is no
   match, the cache is updated depending on whether the header of the
   message line in question is a multiple or non-multiple header within
   that message.

2.3.3 Compression of Multiple/Non-Multiple Headers

   During compression and caching, a header that occurs repeatedly
   within a message is treated differently from a header that occurs
   only once.

   i) Non-Multiple Header

   For compression of non-multiple header data, the procedure is as
   follows.

   1. Matching TO index, Matching Contents

      If the TO indices match and the cache indexed data and the
      message contents match, then the message contents are blanked.

   2. Matching TO index, Non-matching Contents

      If the TO indices match but the indexed data in the cache does
      not match the message contents, then the cache is updated with
      the new message contents by overwriting the indexed data in the
      cache. No compression is achieved in this case.

   3. Non-matching TO index, Matching FROM Indexed Contents

      In this case the message contents are compressed.

   4. Non-matching TO index, Non-matching FROM Indexed Contents

      The cache in this case is updated by creating a new data entry in
      the cache indexed by the current message's TO index. No
      compression is achieved in this case.

   ii) Multiple Header

   For compression of multiple header data the procedure is as follows.

   1. Matching TO Index, Matching Contents

      In this case the cache position of the matching data is recorded,
      the message contents are blanked and the cache position is
      inserted in their place in the compressed message.

   2. Matching TO Index, Non-matching Contents

      If the TO indices match but the indexed data does not, then the
      cache is updated with the new message contents by appending the
      new data to the cache. The index remains equal to the matched TO
      index. No compression is achieved in this case.

   3. Non-matching TO Index, Matching FROM Indexed Contents

      In this case the position of the matching data is recorded, the
      message contents are blanked and the cache position is inserted
      in their place.

   4. Non-matching TO Index, Non-matching FROM Indexed Contents

      The cache in this case is updated by appending the new data to
      the cache indexed by the FROM index. No compression is achieved.

2.3.4 Partial Compression

   During compression, 'partial' compression of the data of certain
   headers may also be performed. The data involved in partial
   compression is data that contains one of the request METHODS in
   every message, e.g. a Cseq header line has the following format

      Cseq: <sequence number> <METHOD>

   The request method within the Cseq data may be
   compressed/decompressed using the METHOD dictionary.
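   The four cases listed in 2.3.3 for each header type might be
   sketched as follows. This is a minimal illustration under stated
   assumptions, not a definitive implementation: the function names are
   hypothetical, a header's cache table is modelled as a Python dict,
   cache positions are taken to be 1-based within the matched index's
   list, and the cases of 2.3.3 are followed literally (the additional
   FROM fallback described in 2.3.2.2 is omitted).

```python
def compress_non_multiple(table, from_index, to_index, contents):
    """Apply cases 1-4 of 2.3.3 (i). Return True if contents are blanked.

    'table' maps an index to the single cached value for this header.
    """
    if not table:
        # Data cached for the first time always uses the FROM index.
        table[from_index] = contents
        return False
    if to_index in table:
        if table[to_index] == contents:
            return True                      # case 1: blank
        table[to_index] = contents           # case 2: overwrite
        return False
    if table.get(from_index) == contents:
        return True                          # case 3: blank
    table[to_index] = contents               # case 4: new TO entry
    return False


def compress_multiple(table, from_index, to_index, contents):
    """Apply cases 1-4 of 2.3.3 (ii).

    'table' maps an index to a list of cached values. Returns the
    assumed 1-based cache position to send instead of the contents,
    or None when no compression is achieved.
    """
    if to_index in table:
        entries = table[to_index]
        if contents in entries:
            return entries.index(contents) + 1   # case 1: send position
        entries.append(contents)                 # case 2: append under TO
        return None
    entries = table.setdefault(from_index, [])
    if contents in entries:
        return entries.index(contents) + 1       # case 3: send position
    entries.append(contents)                     # case 4: append under FROM
    return None


# Usage: the Via contents of a REGISTER are cached, then blanked when
# the same contents reappear in an INVITE to a different destination.
via_table = {}
first = compress_non_multiple(via_table, "ue", "ue",
                              "SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]")
second = compress_non_multiple(via_table, "ue", "callee",
                               "SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]")
```

   Note that a position returned by compress_multiple is only
   meaningful if the decompressor resolves it against the same index;
   how that is signalled is outside this sketch.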
2.3.5 Header Grouping

   SIP

   Unlike multiple header data, non-multiple header data compression
   does not require the insertion of the cache index position in place
   of the contents being blanked. In this case, if the header in
   question is a non-mandatory header, the compressed line will consist
   only of the compressed header followed by a colon. If there are a
   number of message lines of this format then TCCB will compress these
   lines into a single line of the following format

      <header>: <header>: <header>: etc.

   The decompressor will in response parse this line and insert each of
   the SIP headers as individual lines in the SIP part of the message.

   SDP

   SDP header grouping applies only to the SDP headers m, b and a, if
   present in the message. Suppose compression results in the following
   SDP lines, in which the message contents have been replaced with
   their cache index positions:

      m=2
      b=2
      a=3
      a=5
      m=1
      b=3
      a=4
      a=1

   Then the compressor can compress all of these lines into a single
   line of the form

      m21=b23 a3541

   Similarly,

      b=2
      a=2
      a=4
      b=1
      a=3

   becomes

      b21=a243

   Or simply

      a=1
      a=4
      a=3

   becomes

      a=143

   In each of the above cases the decompressor will parse the
   compressed line and insert each SDP header plus contents as
   individual lines in the SDP part of the message.

3.0 Examples

   In order to illustrate the algorithm, a set of examples is given,
   based on a registration and the first few messages of a session
   establishment.
   The following abbreviations are used throughout the examples:

      C:   Client
      S:   Server
      CT:  Client TCCB
      CTC: Client TCCB Cache
      ST:  Server TCCB
      STC: Server TCCB Cache

   The following is an overview of the examples:

      Example 1: First Registration after power up
      Example 2: First few messages in a session establishment

Example 1: First Registration

Step 1.1: REGISTER C->CT

   A REGISTER request is sent from the Client (UAC) to the TCCB layer
   of the Client (CT).

      REGISTER sip:444.333.222.111 SIP/2.0
      To: "Alien Blaster"
      From: "Alien Blaster"
      Call-ID: -3693193161352840821@157.190.66.137
      Cseq: 1 REGISTER
      Via: SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]
      Contact: sip:[5555::aaa:bbb:ccc:ddd]
      Expires: 3600
      Content-Length: 0

Step 1.2: REGISTER CT->ST

   The SIP aware TCCB extracts the FROM and TO indices from the
   received message and uses the FROM index to check the cache. Each
   header is processed individually and the cache (CTC) checked for a
   match. If no match is found, the cache is updated with the latest
   information using the appropriate index. If a match is found, that
   particular header's contents are blanked. The preloaded dictionaries
   are also used by TCCB at this point to achieve compression.
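   The method/static header substitution applied at this step, together
   with the partial compression of the Cseq method, can be sketched as
   follows. The binary codes are those that appear in this example; the
   dictionary and function names are hypothetical, and the sketch
   assumes one header per line with no message body.

```python
# Codes as they appear in the compressed messages of this example.
METHOD_CODES = {"REGISTER": "00000110", "INVITE": "00000001"}
HEADER_CODES = {
    "To": "01100110",
    "From": "01010000",
    "Call-ID": "01000011",
    "Cseq": "01001011",
    "Via": "01101001",
    "Contact": "01000101",
    "Expires": "01001111",
    "Content-Length": "01001001",
}


def encode_names(message):
    """Replace the request method and header names with dictionary codes."""
    lines = message.split("\n")
    method, rest = lines[0].split(" ", 1)
    out = [METHOD_CODES.get(method, method) + " " + rest]
    for line in lines[1:]:
        name, value = line.split(":", 1)
        value = value.strip()
        if name == "Cseq":
            # Partial compression: only the method part of Cseq is coded.
            number, cseq_method = value.split(" ", 1)
            value = number + " " + METHOD_CODES.get(cseq_method, cseq_method)
        out.append(HEADER_CODES.get(name, name) + ": " + value)
    return "\n".join(out)


REGISTER = "\n".join([
    "REGISTER sip:444.333.222.111 SIP/2.0",
    'To: "Alien Blaster"',
    'From: "Alien Blaster"',
    "Call-ID: -3693193161352840821@157.190.66.137",
    "Cseq: 1 REGISTER",
    "Via: SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]",
    "Contact: sip:[5555::aaa:bbb:ccc:ddd]",
    "Expires: 3600",
    "Content-Length: 0",
])

encoded = encode_names(REGISTER)
```

   Names that are absent from the dictionaries are passed through
   unchanged in this sketch.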
   Following Method/Static header compression the REGISTER message is
   as follows:

      00000110 sip:444.333.222.111 SIP/2.0
      01100110: "Alien Blaster"
      01010000: "Alien Blaster"
      01000011: -3693193161352840821@157.190.66.137
      01001011: 1 00000110
      01101001: SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]
      01000101: sip:[5555::aaa:bbb:ccc:ddd]
      01001111: 3600
      01001001: 0

   As mandatory header compression is not possible in this message,
   after pre-cached and default dictionary contents compression, the
   final compressed REGISTER message is

      00000110 sip:444.333.222.111
      01100110: "Alien Blaster"
      01010000: "Alien Blaster"
      01000011: -3693193161352840821@157.190.66.137
      01001011: 1 00000110
      01101001: [5555::aaa:bbb:ccc:ddd]
      01000101: sip:[5555::aaa:bbb:ccc:ddd]
      01001111: *
      01001001: 0

   CTC: The following is the set of tables stored in the UA after TCCB
   processing

   Table: Request Line
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   sip:444.333.222.111
   time=36123E5B; seq=72))@localhost

   Table: Via
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]
   time=36123E5B; seq=72))@localhost

   Table: To
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   "Alien Blaster"
   time=36123E5B; seq=72))@localhost

   Table: From
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   "Alien Blaster"
   time=36123E5B; seq=72))@localhost

   Table: Call-ID
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   -3693193161352840821@157.190.66.137
   time=36123E5B; seq=72))@localhost

   Table: Cseq
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   1 REGISTER
   time=36123E5B; seq=72))@localhost

   Table: Contact
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   sip:[5555::aaa:bbb:ccc:ddd]
   time=36123E5B; seq=72))@localhost

   Table: Expires
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   3600
   time=36123E5B; seq=72))@localhost

   Table: Content-Length
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   0
   time=36123E5B; seq=72))@localhost

   In this particular example, this was the first message after
   power-up, but with the preloaded dictionaries approach a degree of
   compression is still achieved.

Step 1.3: REGISTER ST->S

   The SIP aware TCCB extracts the FROM and TO indices from the
   received message and uses the relevant index to check the cache.
   Each header is processed individually and the cache (STC) checked
   for a match. If a received message element is "blanked", the
   contents are retrieved from either the cache or the relevant
   preloaded dictionaries; otherwise the cache is updated with the
   latest contents using the appropriate index.

   STC: The following is an extract of the tables stored in the Server
   after TCCB processing for this particular registration

   Table: Request Line
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   sip:444.333.222.111
   time=36123E5B; seq=72))@localhost

   Table: Via
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]
   time=36123E5B; seq=72))@localhost

   Table: From
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   "Alien Blaster"
   time=36123E5B; seq=72))@localhost

   Table: To
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   "Alien Blaster"
   time=36123E5B; seq=72))@localhost

   Table: Call-ID
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   -3693193161352840821@157.190.66.137
   time=36123E5B; seq=72))@localhost

   Table: Cseq
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   1 REGISTER
   time=36123E5B; seq=72))@localhost

   Table: Contact
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   sip:[5555::aaa:bbb:ccc:ddd]
   time=36123E5B; seq=72))@localhost

   Table: Expires
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   3600
   time=36123E5B; seq=72))@localhost

   Table: Content-Length
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   0
   time=36123E5B; seq=72))@localhost

   In this particular example, after populating the cache, the
   reconstructed message shown below is forwarded up the stack to the
   Server SIP application.

      REGISTER sip:444.333.222.111 SIP/2.0
      To: "Alien Blaster"
      From: "Alien Blaster"
      Call-ID: -3693193161352840821@157.190.66.137
      Cseq: 1 REGISTER
      Via: SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]
      Contact: sip:[5555::aaa:bbb:ccc:ddd]
      Expires: 3600
      Content-Length: 0

Step 1.4: 200 OK S->ST

      SIP/2.0 200 OK
      To: "Alien Blaster"
      From: "Alien Blaster"
      Cseq: 1 REGISTER
      Call-ID: -3693193161352840821@157.190.66.137
      Via: SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]
      Content-Length: 0

Step 1.5: 200 OK ST->CT

   TCCB in the Server first extracts the FROM and TO indices from the
   REGISTER response message and uses the relevant indices to check the
   cache. Each header is processed in turn using the same set of rules
   as in Step 1.2. In this particular instance the cache will remain
   unchanged from Step 1.3 as no new message contents are encountered.

   Following Method and Static header compression, the message becomes

      00001110
      01100110: "Alien Blaster"
      01010000: "Alien Blaster"
      01001011: 1 00000110
      01000011: -3693193161352840821@157.190.66.137
      01101001: SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]
      01001001: 0

   Following pre-cache and default content dictionary compression the
   message looks as follows

      00001110
      01100110:
      01010000:
      01001011:
      01000011:
      01101001:
      01001001:

   Now elimination of mandatory headers that have their contents cached
   takes place. According to RFC3261, the following headers are
   mandatory for 200 OK: Via, From, To, Call-ID, Cseq.

   Mandatory header elimination yields the final compressed 200 OK
   message,

      00001110
      01010000:

Step 1.6: 200 OK CT->C

   The TCCB layer in the Client receives the compressed 200 OK and
   first decompresses the From and To line message contents. The From
   and To indices are then extracted.
   All mandatory headers that are not present are populated from the
   preloaded mandatory header dictionary, along with the blanked header
   contents, using the required indices. After processing, the
   reconstructed message is forwarded up the stack to the Client (UAC).

      SIP/2.0 200 OK
      To: "Alien Blaster"
      From: "Alien Blaster"
      Cseq: 1 REGISTER
      Call-ID: -3693193161352840821@157.190.66.137
      Via: SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]
      Content-Length: 0

   Note that the CTC remains as per Step 1.2, as no new message
   contents were encountered between the first and second messages.

Example 2: Session Origination - Alien Blaster decides to make a call

Step 2.1: INVITE C->CT

   An INVITE is sent from the UAC to the TCCB adaptation layer in the
   Client. The following are the contents of the INVITE, with the SDP
   information.

      INVITE sip:+1-212-555-2222@home1.net;user=phone SIP/2.0
      Via: SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]
      Supported: 100rel
      Remote-Party-ID: "John Doe" ;privacy=off
      Anonymity: Off
      From: "Alien Blaster"
      To: sip:B36(SHA-1(+1-212-555-2222;time=36123E5B;seq=73))@localhost
      Call-ID: cb03a0s09a2sdfglkj490333
      Cseq: 127 INVITE
      Contact: sip:[5555::aaa:bbb:ccc:ddd]
      Content-Type: application/sdp
      Content-Length: 229

      v=0
      o=- 0 0 IN IP6 5555::aaa:bbb:ccc:ddd
      s=-
      c=IN IP6 5555::aaa:bbb:ccc:ddd
      t=0 0
      m=audio 3456 RTP/AVP 97 96 0 15
      b=AS:25.4
      a=rtpmap:97 AMR
      a=fmtp:97 mode-set=0,2,5,7; maxframes=2
      a=rtpmap:96 G726-32/8000
      a=qos:mandatory sendrecv

Step 2.2: INVITE CT->ST

   The SIP aware TCCB extracts the FROM and TO indices from the message
   and uses a combination of the indices to access the cache. As
   previously mentioned, the From index provides the first level of
   indexing while the To index offers a further level of indexing as
   required. Each header is processed individually and the cache (CTC)
   checked for a match. If no match is found, the cache is updated with
   the latest information. If a match is found, that particular
   header's contents are blanked.
   CTC: The following is the set of tables stored in the UA after TCCB
   processing

   Table: Request Line
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   sip:444.333.222.111
   time=36123E5B; seq=72))@localhost
   sip:B36(SHA-1(+1-212-555-2222;           sip:+1-212-555-2222@home1.net;user=phone
   time=36123E5B;seq=73))@localhost

   Table: Via
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]
   time=36123E5B; seq=72))@localhost

   Table: To
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   sip:B36(SHA-1(+1-212-555-2222;
   time=36123E5B; seq=72))@localhost        time=36123E5B;seq=73))@localhost
   sip:B36(SHA-1(user1_public1@home1.net;   "Alien Blaster"
   time=36123E5B; seq=72))@localhost

   Table: From
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   "Alien Blaster"
   time=36123E5B; seq=72))@localhost

   Table: Call-ID
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   -3693193161352840821@157.190.66.137
   time=36123E5B; seq=72))@localhost
   sip:B36(SHA-1(+1-212-555-2222;           cb03a0s09a2sdfglkj490333
   time=36123E5B;seq=73))@localhost

   Table: Cseq
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   1 REGISTER
   time=36123E5B; seq=72))@localhost
   sip:B36(SHA-1(+1-212-555-2222;           127 INVITE
   time=36123E5B;seq=73))@localhost

   Table: Contact
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   sip:[5555::aaa:bbb:ccc:ddd]
   time=36123E5B; seq=72))@localhost

   Table: Expires
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   3600
   time=36123E5B; seq=72))@localhost

   Table: Content-Type
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   application/sdp
   time=36123E5B; seq=72))@localhost

   Table: Content-Length
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   0
   time=36123E5B; seq=72))@localhost
   sip:B36(SHA-1(+1-212-555-2222;           229
   time=36123E5B;seq=73))@localhost

   Table: Supported
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   100rel
   time=36123E5B; seq=72))@localhost

   Table: Remote-Party-ID
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   "John Doe" ;privacy=off
   time=36123E5B; seq=72))@localhost

   Table: Anonymity
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   Off
   time=36123E5B; seq=72))@localhost

   Table: v
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   0
   time=36123E5B; seq=72))@localhost

   Table: o
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   - 0 0 IN IP6 5555::aaa:bbb:ccc:ddd
   time=36123E5B; seq=72))@localhost

   Table: s
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   -
   time=36123E5B; seq=72))@localhost

   Table: c
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   IN IP6 5555::aaa:bbb:ccc:ddd
   time=36123E5B; seq=72))@localhost

   Table: t
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   0 0
   time=36123E5B; seq=72))@localhost

   Table: m
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   audio 3456 RTP/AVP 97 96 0 15
   time=36123E5B; seq=72))@localhost

   Table: b
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   AS:25.4
   time=36123E5B; seq=72))@localhost

   Table: a
   Index                                    Contents
   -----                                    --------
   sip:B36(SHA-1(user1_public1@home1.net;   rtpmap:97 AMR
   time=36123E5B; seq=72))@localhost
   sip:B36(SHA-1(user1_public1@home1.net;   fmtp:97 mode-set=0,2,5,7;maxframes=2
   time=36123E5B; seq=72))@localhost
   sip:B36(SHA-1(user1_public1@home1.net;   rtpmap:96 G726-32/8000
   time=36123E5B; seq=72))@localhost
   sip:B36(SHA-1(user1_public1@home1.net;   qos:mandatory sendrecv
   time=36123E5B; seq=72))@localhost

   In this particular example, content matches were found for the Via
   and Contact headers. The cache was updated for all other fields,
   with the combination of From and To used to provide a further level
   of granularity where required.
   The message after method and static header compression is

      00000001 sip:+1-212-555-2222@home1.net;user=phone SIP/2.0
      01101001: SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]
      01100100: 100rel
      01101101: "John Doe" ;privacy=off
      01101100: Off
      01010000: "Alien Blaster"
      01100110: sip:B36(SHA-1(+1-212-555-2222;time=36123E5B;seq=73))@localhost
      01000011: cb03a0s09a2sdfglkj490333
      01001011: 127 00000001
      01000101: sip:[5555::aaa:bbb:ccc:ddd]
      01001010: application/sdp
      01001001: 229

      v=0
      o=- 0 0 IN IP6 5555::aaa:bbb:ccc:ddd
      s=-
      c=IN IP6 5555::aaa:bbb:ccc:ddd
      t=0 0
      m=audio 3456 RTP/AVP 97 96 0 15
      b=AS:25.4
      a=rtpmap:97 AMR
      a=fmtp:97 mode-set=0,2,5,7; maxframes=2
      a=rtpmap:96 G726-32/8000
      a=qos:mandatory sendrecv

   Next, any pre-cached or default dictionary contents are
   blanked/compressed. In this particular message, matches between
   cache and message contents were found for the Via and the Contact
   message lines. The cache was updated for all other fields as
   necessary, using a combination of the FROM and TO indices.

      00000001 sip:+1-212-555-2222@home1.net;user=phone SIP/2.0
      01101001:
      01100100: *
      01101101: "John Doe" ;privacy=off
      01101100: Off
      01010000: sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
      01100110: sip:B36(SHA-1(+1-212-555-2222;time=36123E5B;seq=73))@localhost
      01000011: cb03a0s09a2sdfglkj490333
      01001011: 127 00000001
      01000101:
      01001010: *
      01001001: 229

      v=0
      o=- 0 0 IN IP6 5555::aaa:bbb:ccc:ddd
      s=-
      c=IN IP6 5555::aaa:bbb:ccc:ddd
      t=*
      m=* 3456 RTP/AVP 97 96 0 15
      b=AS:25.4
      a=*097 AMR
      a=*197 mode-set=0,2,5,7; maxframes=2
      a=*096 G726-32/8000
      a=*2

   Now elimination of mandatory headers that have had their contents
   blanked takes place. The following headers are mandatory for INVITE:
   Via, From, To, Call-ID, Cseq.
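   The elimination step, and the matching restoration at the
   decompressor, might be sketched as follows. This is an illustration
   under stated assumptions only: header names are used instead of
   their binary codes for readability, a message is modelled as a list
   of (header, contents) pairs, an empty string marks blanked contents,
   the cache is a plain dict from header name to contents, and restored
   mandatory headers are simply placed first. None of these choices is
   prescribed by TCCB.

```python
MANDATORY = ["Via", "From", "To", "Call-ID", "Cseq"]


def eliminate_mandatory(lines):
    """Drop mandatory header lines whose contents have been blanked."""
    return [(h, c) for h, c in lines if not (h in MANDATORY and c == "")]


def restore(lines, cache):
    """Re-insert missing mandatory headers and fill blanked contents.

    'cache' maps a header name to its cached contents (a simplification
    of the per-index tables used elsewhere in this document).
    """
    present = {h for h, _ in lines}
    missing = [(h, cache[h]) for h in MANDATORY if h not in present]
    filled = [(h, c if c != "" else cache[h]) for h, c in lines]
    return missing + filled


# Usage with the headers of the INVITE above: Via and Contact are blanked,
# but only Via is mandatory, so only the Via line is removed entirely.
blanked = [
    ("Via", ""),
    ("Supported", "*"),
    ("From", "sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost"),
    ("To", "sip:B36(SHA-1(+1-212-555-2222;time=36123E5B;seq=73))@localhost"),
    ("Call-ID", "cb03a0s09a2sdfglkj490333"),
    ("Cseq", "127 INVITE"),
    ("Contact", ""),
]
sent = eliminate_mandatory(blanked)
cache = {
    "Via": "SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]",
    "Contact": "sip:[5555::aaa:bbb:ccc:ddd]",
}
received = restore(sent, cache)
```

   Contents marked '*' are left untouched here, since they are resolved
   separately from the default content dictionary.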
After mandatory header elimination the final compressed INVITE will look as follows:

00000001 sip:+1-212-555-2222@home1.net;user=phone
01100100: *
01101101: "John Doe" ;privacy=off
01101100: Off
01010000: sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
01100110: sip:B36(SHA-1(+1-212-555-2222;time=36123E5B;seq=73))@localhost
01000011: cb03a0s09a2sdfglkj490333
01001011: 127 00000001
01000101:
01001010: *
01001001: 229

v=0
o=- 0 0 IN IP6 5555::aaa:bbb:ccc:ddd
s=-
c=IN IP6 5555::aaa:bbb:ccc:ddd
t=*
m=* 3456 RTP/AVP 97 96 0 15
b=AS:25.4
a=*097 AMR
a=*197 mode-set=0,2,5,7; maxframes=2
a=*096 G726-32/8000
a=*2

Step 2.3: INVITE ST->S

The SIP aware TCCB uses the From and To indices to access the cache. If a mandatory header is missing or a received header is "blanked", the contents are retrieved from the cache; otherwise the cache is updated with the latest contents using the appropriate index.

CTS: The following is an extract of the tables stored in the Server after TCCB processing for this particular INVITE.

Table: Request Line
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          sip:444:333:222:111
sip:B36(SHA-1(+1-212-555-2222;time=36123E5B;seq=73))@localhost
          sip:+1-212-555-2222@home1.net;user=phone

Table: Via
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]

Table: To
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          sip:B36(SHA-1(+1-212-555-2222;time=36123E5B;seq=73))@localhost
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          "Alien Blaster"

Table: From
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          "Alien Blaster"

Table: Call-ID
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          -3693193161352840821@157.190.66.137
sip:B36(SHA-1(+1-212-555-2222;time=36123E5B;seq=73))@localhost
          cb03a0s09a2sdfglkj490333

Table: Cseq
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          1 REGISTER
sip:B36(SHA-1(+1-212-555-2222;time=36123E5B;seq=73))@localhost
          127 INVITE

Table: Contact
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          sip:[5555::aaa:bbb:ccc:ddd]

Table: Expires
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          3600

Table: Content-Type
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          application/sdp

Table: Content-Length
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          0
sip:B36(SHA-1(+1-212-555-2222;time=36123E5B;seq=73))@localhost
          229

Table: Supported
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          100rel

Table: Remote-Party-ID
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          "John Doe" ;privacy=off

Table: Anonymity
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          Off

Table: v
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          0

Table: o
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          - 0 0 IN IP6 5555::aaa:bbb:ccc:ddd

Table: s
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          -

Table: c
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          IN IP6 5555::aaa:bbb:ccc:ddd

Table: t
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          0 0

Table: m
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          audio 3456 RTP/AVP 97 96 0 15

Table: b
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          AS:25.4

Table: a
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          rtpmap:97 AMR
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          fmtp:97 mode-set=0,2,5,7;maxframes=2
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          rtpmap:96 G726-32/8000
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          qos:mandatory sendrecv

The only header that was removed from the message, i.e. the Via, is reconstructed using the mandatory header dictionary, and its contents are retrieved from the cache.

INVITE sip:+1-212-555-2222@home1.net;user=phone SIP/2.0
Via: SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]
Supported: 100rel
Remote-Party-ID: "John Doe" ;privacy=off
Anonymity: Off
From: "Alien Blaster"
To: sip:B36(SHA-1(+1-212-555-2222;time=36123E5B;seq=73))@localhost
Call-ID: cb03a0s09a2sdfglkj490333
Cseq: 127 INVITE
Contact: sip:[5555::aaa:bbb:ccc:ddd]
Content-Type: application/sdp
Content-Length: 229

v=0
o=- 0 0 IN IP6 5555::aaa:bbb:ccc:ddd
s=-
c=IN IP6 5555::aaa:bbb:ccc:ddd
t=0 0
m=audio 3456 RTP/AVP 97 96 0 15
b=AS:25.4
a=rtpmap:97 AMR
a=fmtp:97 mode-set=0,2,5,7; maxframes=2
a=rtpmap:96 G726-32/8000
a=qos:mandatory sendrecv

Step 2.4: 100 TRYING S->ST

SIP/2.0 100 TRYING
To: sip:B36(SHA-1(+1-212-555-2222;time=36123E5B;seq=73))@localhost
From: "Alien Blaster"
Cseq: 127 INVITE
Call-ID: cb03a0s09a2sdfglkj490333
Via: SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]
Content-Length: 0

Step 2.5: 100 TRYING ST->CT

TCCB in the Server uses the From and To indices to access the information in the cache. Each header is processed individually. For this particular message, matches between cache and message contents were found for the To, From, Cseq, Call-ID and Via headers.
00001001
01100110:
01010000:
01001011:
01000011:
01101001:
01001001: 0

The following headers are mandatory for 1XX responses: Via, From, To, Call-ID, Cseq. After mandatory header elimination the final compressed 100 TRYING looks as follows:

00001001
01001001: 0

Note that the CTS remains the same as in Step 2.3.

Step 2.6: 100 TRYING CT->C

TCCB in the Client uses the From and To indices to access the information in the cache. Each header is processed individually. The missing mandatory headers and blanked header fields are reconstructed.

SIP/2.0 100 TRYING
To: sip:B36(SHA-1(+1-212-555-2222;time=36123E5B;seq=73))@localhost
From: "Alien Blaster"
Cseq: 127 INVITE
Call-ID: cb03a0s09a2sdfglkj490333
Via: SIP/2.0/UDP [5555::aaa:bbb:ccc:ddd]
Content-Length: 0

Note that the only header cache table to change in the CTC at this point is the Content-Length table.

Table: Content-Length
Index     Contents
-----     --------
sip:B36(SHA-1(user1_public1@home1.net; time=36123E5B; seq=72))@localhost
          0
sip:B36(SHA-1(+1-212-555-2222;time=36123E5B;seq=73))@localhost
          0

4.0 Requirements

4.1 Impact on Protocols and Internet Infrastructure

a) Transparency
When a message is compressed and then decompressed, the resulting message must be bit-wise identical to the original message.

4.2 Coexisting with other schemes

a) Header compression
TCCB is flexible enough to be used in conjunction with any TCP/IP header compression scheme, in particular those defined by ROHC. For SIP messages it can further be used with any SIP aware compression scheme so defined.

b) Encryption
TCCB can also be used in conjunction with any end-to-end message encryption scheme. The compression ratio is unaffected by the encryption method used, provided that TCCB compression and decompression are done at the same nodes as the encryption and decryption.
4.3 Robustness

a) If a message is lost between a compressor and decompressor, the compression scheme still compresses/decompresses correctly, and the only loss could be in the efficiency of the compression of the messages. Retransmission of SIP messages will take place based on the timer, and retransmitted messages will be compressed and decompressed as before. TCCB in the adaptation layer is completely unaware of the retransmission of SIP messages in the Application layer.

b) Resilience against residual errors between compressor and decompressor: If the lower layers are unable to detect residual errors in messages, the compression scheme will still be able to compress/decompress correctly.

c) Resilience against message mis-ordering between compressor and decompressor: TCCB in the adaptation layer will not be aware of out-of-sequence messaging and will always be able to compress/decompress all messages that it receives, whether from an upper layer or from a lower layer.

4.4 Scalability

a) Memory scalability
The scheme must be scalable to accommodate a range of compressors/decompressors with varying storage capabilities. A more capable compressor must be able to interoperate with a less capable decompressor and vice versa.

b) Processing scalability
The scheme must be capable of accommodating a range of compressors/decompressors with varying processing capabilities.

c) Compression scalability
The scheme should allow the use of additional mechanisms and/or more advanced compression methods to boost compression.

4.5 Compression efficiency

a) Average ratio
The scheme must provide the highest compression ratio under the constraint that the above requirements are met.

b) Compression efficiency should not be affected by handover
The compression ratio would be the same as if handover had not occurred.

5.0 Evaluation

The following tables give the sizes of messages before and after compression and the percentage compression achieved using TCCB[3].
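The percentage column in the tables that follow appears consistent with the size reduction expressed as a fraction of the uncompressed size, rounded to the nearest whole number. This is an inference from the published figures rather than a formula stated in this document, and the function name percent_compression is hypothetical.

```python
def percent_compression(uncompressed: int, compressed: int) -> int:
    """Size reduction as a whole-number percentage of the original size."""
    return round(100 * (uncompressed - compressed) / uncompressed)


# Byte counts taken from Table 1.
invite = percent_compression(724, 488)    # INVITE (with SDP)
trying = percent_compression(292, 6)      # 100 TRYING
```

Applied to the byte counts for the INVITE with SDP and the 100 TRYING, this reproduces the 33% and 98% reported for those messages.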
Table 1: Comparison of message sizes (in bytes) before and after compression

Message                 Uncompressed    Compressed      % Compression
                        Message Size    Message Size    Using TCCB
_____________________________________________________________________
REGISTER                400             323             19%
200 OK                  327             4               99%
INVITE (With SDP)       724             488             33%
100 TRYING              292             6               98%

Table 2: Some more results for a variety of messages.

Message                 Uncompressed    Compressed      % Compression
                        Message Size    Message Size    Using TCCB
_____________________________________________________________________
183 SESSION PROGRESS    835             352             58%
PRACK                   558             113             80%
200 OK                  287             82              71%
COMET                   534             117             78%
RINGING                 349             41              88%
PRACK                   358             105             71%
200 OK                  545             30              94%
ACK                     302             15              95%

6.0 Conclusion

In conclusion, TCCB is a robust and flexible compression mechanism based on the cache and blank approach. Note that the implementation of TCCB in the adaptation layer requires pre-loading of dictionaries; these dictionaries are included in the TCCB adaptation layer.

7.0 Intellectual Property Rights Considerations

Motorola has filed patent applications that may be technically related to this contribution.

8.0 References

1. J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, M. Handley, E. Schooler, SIP: Session Initiation Protocol. RFC 3261, June 2002

2. J. Rosenberg, H. Schulzrinne, An Offer/Answer Model with the Session Description Protocol (SDP). RFC 3264, June 2002

3. V. Kenneally, D. Pesch, I. Majumdar, Evaluation of SIP Compression for IP based Wireless Multimedia Communications, September 2002

9.0 Author's Addresses

Ishita Majumdar
Motorola
1501 W. Shure Drive
Arlington Heights, IL 60004
Phone: 847-435-2067
Email: Ishita.Majumdar@motorola.com

Valerie Kenneally
Electronic Eng. Dept.,
Cork Institute of Technology,
Rossa Ave., Bishopstown,
Cork, Ireland
Email: vkenneally@cit.ie

Dirk Pesch
Electronic Eng.
Dept., Cork Institute of Technology,
Rossa Ave., Bishopstown,
Cork, Ireland
Email: dpesch@cit.ie

Full Copyright Statement

"Copyright (C) The Internet Society (date). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into