Internet Engineering Task Force                                ROHC/SIP WG
Internet Draft                                                   Rosenberg
draft-rosenberg-rohc-sip-udpcomp-00.txt                        dynamicsoft
                                                             July 13, 2001
                                                    Expires: February 2002


                            Compression of SIP


STATUS OF THIS MEMO

   This document is an Internet-Draft and is in full conformance with all
   provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering Task
   Force (IETF), its areas, and its working groups. Note that other groups
   may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference material
   or to cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   To view the list of Internet-Draft Shadow Directories, see
   http://www.ietf.org/shadow.html.

Abstract

   The Session Initiation Protocol (SIP), like many other IP protocols
   used for multimedia communications, is a UDP-based, textual protocol
   engineered for bandwidth-rich links. As a result, its messages have not
   been optimized for size. With the planned use of these protocols in
   wireless handsets as part of 2.5G and 3G wireless networks, the large
   size of these messages is problematic, primarily for latency reasons.
   We therefore propose a protocol shim between SIP (or any other
   UDP-based protocol meeting our requirements) and UDP in order to
   efficiently compress SIP messages. Preliminary results have shown
   compression gains of up to 8:1.

1 Introduction

   The Session Initiation Protocol (SIP) [1], like many other IP protocols
   used for multimedia communications (such as RTSP [2] and MGCP [3]), is
   a UDP-based, textual protocol engineered for bandwidth-rich links. As a
   result, its messages have not been optimized for size. Typical SIP
   messages range from a few hundred bytes to as many as 2000.

   To date, this has not been a significant problem. The media established
   by SIP (usually carried over RTP [4]) uses a much higher percentage of
   the overall link bandwidth to the host. Therefore, Amdahl's law tells
   us that SIP compression is not as important as RTP compression, which
   is already handled by mechanisms like CRTP [5] and ROHC [6].

   With the planned use of these protocols in wireless handsets as part of
   2.5G and 3G wireless networks, the large size of these messages is
   problematic. The problem is not bandwidth efficiency (the media stream
   still consumes the majority of the bandwidth), but rather latency. With
   low-rate IP connectivity, store-and-forward delays are significant. For
   CDMA2000, for example, the basic channel will support only 9.6 kbps. At
   this rate, transmission of each byte requires roughly 0.8 ms, so a 500
   byte SIP message requires nearly half a second to transmit. Taking into
   account retransmissions, and the multiplicity of messages required in
   some flows, call setup and feature invocation are adversely affected.
   Therefore, we believe there is merit in investigating reductions in
   message size.

   The remainder of this document is organized as follows. We first
   present our basic design approach and goals, comparing them to existing
   work in this area. We then describe the protocol and its message
   formats. Finally, we present some preliminary results obtained with a
   modified LZW algorithm.

2 Design Approach

   Achieving message size reduction can be done in many ways.
   Some have proposed defining binary encodings for SIP. We believe this
   is a serious mistake: it introduces substantial interoperability
   problems and pushes the burden of accommodating wireless handsets onto
   all SIP devices. Others have proposed specialty compression algorithms
   that are SIP-specific. We believe this is also a mistake; as SIP
   evolves and changes, or as new protocols emerge, such mechanisms may no
   longer function. As a result, we strongly believe that the only
   acceptable solution is to perform compression at lower layers, and this
   has been agreed upon by the SIP working group as part of its guidelines
   [7].

   Our proposal is to define a protocol shim between SIP and UDP.
   Effectively, this can be seen as an alternate transport, on par with
   UDP, TCP, and SCTP, even though it runs over UDP. This approach (as
   opposed to defining a new transport, or doing link-layer compression)
   has many advantages:

   o It allows mechanisms like ROHC to be applied to the IP/UDP headers to
     compress them as well.

   o It allows SIP's transport negotiation mechanisms, based on SRV
     records [8], to be applied. This brings backwards compatibility, and
     means that our transport can be used anywhere between two SIP
     entities.

   o It doesn't rely on any underlying link layer, and therefore can be
     used anywhere IP connectivity exists.

   Because the mechanism is a shim between application layer protocols,
   like SIP, and UDP, it is important that it be lossless and that it work
   with a large class of protocols. Indeed, it should not even rely on a
   textual encoding, but should work with binary protocols as well. We
   also believe that it should function without enforcing ordered delivery
   of messages. Many protocols that use UDP choose it precisely because it
   does not require ordered delivery; if the shim imposes ordering, it
   will need to reorder messages and will therefore introduce significant
   latencies. These statements are all consistent with the requirements
   outlined by Hannu et al. [9] for signaling layer compression.

   Indeed, our protocol is similar to the ROGER protocol proposed by Hannu
   et al. [10] and the SCRIBE protocol described by Liu et al. [11]. Our
   focus is on codebook management, which differs significantly from
   SCRIBE and ROGER. Hannu points out that the best way to handle
   signaling layer protocol compression is to generate a codebook based on
   the previous messages sent and received. Hannu proposes that this can
   be done by appending entire messages to the codebook, or just strings
   from prior messages. ROGER uses a codebook equal to the set of prior
   received messages, and then runs LZSS over them. While this is likely
   to yield excellent compression results, it does so at the expense of a
   fairly large codebook. SCRIBE attempts to address this by allowing the
   encoder to delete specific portions of the codebook. While this is a
   good start, we think there is a strong need to handle codebook
   management with unlimited flexibility.

   We believe that devices may need to significantly restrict the codebook
   size because of memory limitations. Once that is done, the trick in
   compression is the selection of the codevectors in the codebook. We
   have observed that there are many ways to choose which elements
   constitute the codebook, and this choice significantly affects the
   performance of the algorithm. Once a mechanism is chosen for
   determining what goes into the codebook, the process of compression is
   to choose codevectors from the codebook and send references to them.
   For this reason, we propose that the problem of message compression be
   separated into two problems: a PROTOCOL for reliably transmitting
   codebook entries and messages encoded using that codebook, and an
   ALGORITHM for deciding which elements are placed into the codebook and
   which codevectors are used to construct a message. Given a general
   purpose protocol, the choice of algorithm is a local implementation
   decision by the encoder. This results in significant flexibility.
   Manufacturers of devices can differentiate themselves based on the
   effectiveness of their compression mechanisms. Devices with small
   memory footprints but significant processing power can use compression
   techniques suited to those constraints, while devices with limited
   processing power but plenty of memory can use different mechanisms.

3 Protocol Operation

   The operation of the protocol is fairly straightforward. It assumes
   that there is a sender A and a receiver B. A compression context is
   associated with a series of compressed messages from A (identified by
   its IP address and port) to B (also identified by its IP address and
   port). There is no relationship between compression contexts from A to
   B and from B to A; that is, contexts are unidirectional. We do this, in
   part, because SIP does not send responses back to the source port of
   the request, which would complicate context identification.

   When a message is to be sent, the encoder looks up the source IP
   address/port and destination IP address/port, and obtains a list of
   valid contexts. It then chooses a context. That choice can be made in
   many ways. We assume that the application will provide an indication of
   the appropriate context to the compressed-UDP transmission layer, since
   the application is in the best position to make that determination. If
   no context exists, one is created. It is established with an empty
   codebook (use of static codebooks is possible as well, to support
   unidirectional links, but is not discussed here).

   Through the protocol, the encoder is able to request that the decoder
   write entries into the codebook at specified indices, and also use that
   data to construct a piece of the decompressed message. The encoder can
   also instruct the decoder to construct a piece of the decompressed
   message by reading from specified indices. The encoder is able to
   overwrite entries that already exist. This allows it to use a codebook
   of whatever size it chooses. It is even possible to negotiate an
   agreed-upon size between encoder and decoder as part of context
   initialization, but that is not specified here.

   The encoded message contains the compression context identifier and a
   message sequence number. What follows is a series of tokens that
   comprise the message. The tokens are effectively instructions for the
   decoder on management of the codebook and construction of the
   decompressed message. There are several token types:

   READ: Read the entry in the codebook at the specified index, and
      append it to the decompressed message.

   WRITE: Take the text which follows, append it to the decompressed
      message, and also write it into the codebook at the specified
      index.

   WRITE-REFERENCED: Read N entries from the codebook at the following N
      indices, concatenate their text, and append it to the decompressed
      message. Also write that text into the codebook at the specified
      index.

   The encoder and the decoder both maintain the codebook.
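   To illustrate the token semantics, the following Python sketch shows
   how a decoder might process a list of tokens. The tuple representation
   of tokens and all names here are illustrative assumptions; the actual
   wire format is defined in Section 4, and this is not a normative
   implementation.

      # Minimal decoder sketch for the three token types. Tokens are
      # represented abstractly as tuples; the wire encoding is given in
      # Section 4. Codebook entries are byte strings keyed by index.
      def decompress(tokens, codebook):
          """Return (decompressed_message, pending_writes).

          pending_writes is the list of (index, text) codebook writes
          requested by this message; per Section 3 they are cached and
          applied later rather than applied immediately.
          """
          output = []
          writes = []
          for token in tokens:
              kind = token[0]
              if kind == "READ":                  # ("READ", index)
                  _, index = token
                  output.append(codebook[index])
              elif kind == "WRITE":               # ("WRITE", index, text)
                  _, index, text = token
                  output.append(text)
                  writes.append((index, text))
              elif kind == "WRITE_REFERENCED":    # ("WRITE_REFERENCED", index, [i1, ...])
                  _, index, refs = token
                  text = b"".join(codebook[i] for i in refs)
                  output.append(text)
                  writes.append((index, text))
              else:
                  raise ValueError("unknown token type: %r" % kind)
          return b"".join(output), writes

   Whether a message may read an entry that it has itself written is not
   addressed by this sketch; the writes are simply collected for the
   caching step described below.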
   At the encoder, each codebook entry at a given index contains not only
   some decompressed text, but also a sequence number and a clean bit.
   Whenever the encoder performs a write into the codebook, the clean bit
   is set to dirty and the sequence number is set to the sequence number
   of the message. That particular codebook entry cannot be used by the
   encoder for messages with higher sequence numbers until the encoder
   learns that the message has been received at the decoder. However, the
   encoder MAY WRITE into a codebook entry marked dirty (it cannot,
   though, issue a WRITE-REFERENCED that reads from a dirty index). When
   it does so, the clean bit continues to indicate dirty, but the sequence
   number is updated to that of the current message.

   Rather than building a message acknowledgement mechanism into the
   protocol, as done by ROGER and SCRIBE, we recognize that most
   application layer protocols based on UDP already incorporate their own
   reliability mechanisms. As such, when the sending application learns,
   in some fashion, that a particular message has been received, it
   indicates to the UDP compression layer that the message was processed.
   At that point, the encoder finds all codebook entries with the
   confirmed sequence number and sets their clean bits to indicate clean.

   At the decoder, the codebook contains just the text entries. The
   decoder also maintains a cache of codebook writes, and the highest
   sequence number such that all messages with sequence numbers up to and
   including that number have been received. Call this the "top counter".
   When a message is received, it is immediately decoded. Any codebook
   writes requested by the message are placed into the cache. The top
   counter is then updated. The decoder then scans the cache and removes
   any entries with sequence numbers less than or equal to the top
   counter. Those writes are then applied to the codebook in sequence
   number order.

   It can be shown that with this combination of rules, messages can be
   compressed and decompressed without requiring in-order delivery at the
   receiver. Furthermore, received messages do not need to be acknowledged
   before the next one is sent. Significant latencies, or significant
   out-of-order delivery, will result in a large number of entries marked
   dirty at the encoder. This will result in compression inefficiencies.
   In the worst case, where the entire encoder codebook is dirty, the
   encoder can send every message as a sequence of WRITE tokens, which
   results in no compression at all.

   Consider an example. A is the encoder and B is the decoder. They both
   have 100 entries in the codebook, and all entries are clean at the
   encoder. A sends message 1 to B, and does so by writing entries 1-10
   into the codebook. Entries 21-100 already have values, which are read
   as part of the compression process but not written. A then sends
   another message, message 2, to B. It cannot read codebook entries 1-10,
   but it can read entries 21-100. So, it writes into codebook entries
   11-30 and makes use of reads from entries 61-100.

   Message 2 arrives at the decoder first. Since message 2 does not rely
   on any codebook entries written by message 1, the decoder is able to
   decode it and pass it up to the higher layers. The writes for entries
   11-30 are placed into the cache. The top counter is zero, so those
   writes are not applied. Message 1 then arrives. It makes use of data
   carried in the message itself, and of data read from entries 21-100.
   Since entries 21-30 have not yet been updated by message 2 (those
   writes are still in the cache), the contents of these entries are
   correct for decoding message 1. The writes for entries 1-10 are placed
   into the cache. The top counter is now updated to 2, so the decoder
   extracts all of the entries from the cache and applies them in order.
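   The decoder-side bookkeeping just described can be sketched as follows
   in Python, reusing the decompress() helper from the earlier sketch.
   This is an illustrative sketch under assumed data structures; sequence
   number wraparound (Section 4) is ignored here for clarity.

      # Decoder context: writes are cached per message and applied only
      # once every message with a lower or equal sequence number has
      # arrived (the "top counter").
      class DecoderContext:
          def __init__(self):
              self.codebook = {}     # index -> text
              self.cache = {}        # sequence number -> list of (index, text)
              self.received = set()  # sequence numbers seen so far
              self.top = -1          # highest SN with no gaps below it

          def on_message(self, seq, tokens):
              # Decode immediately against the current codebook contents.
              message, writes = decompress(tokens, self.codebook)
              self.cache[seq] = writes
              self.received.add(seq)

              # Advance the top counter across any newly contiguous range.
              while (self.top + 1) in self.received:
                  self.top += 1

              # Apply cached writes at or below the top counter, in
              # sequence number order.
              for s in sorted(s for s in self.cache if s <= self.top):
                  for index, text in self.cache.pop(s):
                      self.codebook[index] = text

              return message

   Applied to the example above: message 2's writes to entries 11-30
   remain cached until message 1 arrives and the top counter advances past
   both messages, at which point the writes from both are applied in
   sequence number order.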
4 Message Details

   Each message has the following format:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Context    |              SN               |   CRC-high    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    CRC-low    |  tokens......
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Context is an 8-bit context identifier.

   SN is a 16-bit sequence number. It begins at 0 and wraps around after
   65535. Sequence number comparisons MUST take this rollover into
   account.

   The CRC is a 16-bit CRC computed over the message before encoding. It
   is used to verify that the decode process was successful (the format of
   the CRC is TBD).

   What follows is a sequence of tokens. There are six token types: two
   READ, two WRITE, and two WRITE-REFERENCE. For each operation, the two
   types differ in which codebook entries they can access: entries 0-127
   are accessed with one type, and entries 128-4223 with the other.

4.1 READ LOW

   The READ LOW token reads a codebook entry at an index from 0 to 127.
   Its format is:

   +-+-+-+-+-+-+-+-+
   |0|    index    |
   +-+-+-+-+-+-+-+-+

   index is a 7-bit index into the codebook.

4.2 READ HIGH

   The READ HIGH token reads a codebook entry at an index from 128 to
   4223. Its format is:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|    index    |0|0|0|  index  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The two bytes provide 12 bits of index. The codebook entry accessed is
   equal to 128 plus the value of the index.

4.3 WRITE

   The WRITE token writes a codebook entry whose index is from 0 to 127.
   The text to be written follows the token header, and consists of a
   series of bytes with a given length. The token header is a minimum of
   two bytes. When the value of length1 is zero, another byte follows
   (length2), and the value of length2 is the actual length of the text.
   This allows smaller text to be written with a two-byte token overhead,
   while larger text requires three bytes.

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|    index    |1|1|1| length1 | entry...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   or

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|    index    |1|1|1|0|0|0|0|0|    length2    | entry...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.4 WRITE HIGH

   The WRITE HIGH token writes a codebook entry whose index is from 128 to
   4223. The text to be written follows, and consists of a series of bytes
   with a given length. The token header is always three bytes long.

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|    index    |0|1|1|  index  |    length     | entry...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.5 WRITE REFERENCE

   The WRITE REFERENCE token writes a codebook entry whose index is from 0
   to 127. Rather than the text being included, as in the WRITE and WRITE
   HIGH tokens, the text to place into the codebook is read as a series of
   existing entries in the codebook. Those entries are concatenated
   together to form the text that is written. The token consists of an
   index to write into, followed by a token counter (TC) that indicates
   how many tokens follow. Since the token counter is 5 bits, at most 32
   tokens can be referenced. After the TC field comes a series of READ or
   READ HIGH tokens that identify the data to read from the codebook.

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|    index    |1|1|0|    TC   | tokens..
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.6 WRITE HIGH REFERENCE

   The WRITE HIGH REFERENCE token is identical to the WRITE REFERENCE
   token, except that it writes entries into codebook indices from 128 to
   4223.

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|    index    |0|1|1|  index  |x|x|x|    TC   | tokens
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
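   To make the bit layouts concrete, the following Python fragment
   serializes READ and (low-index) WRITE tokens according to the formats
   above. It reflects one reading of the length1/length2 rule and is an
   illustrative sketch, not a normative encoder; texts longer than 255
   bytes are not handled.

      def emit_read(index):
          """Encode a READ LOW (index 0-127) or READ HIGH (128-4223) token."""
          if index < 128:
              return bytes([index & 0x7F])          # 0 + 7-bit index
          value = index - 128                       # 12-bit index value
          byte1 = 0x80 | ((value >> 5) & 0x7F)      # 1 + upper 7 bits of index
          byte2 = (0b000 << 5) | (value & 0x1F)     # type 000 + lower 5 bits
          return bytes([byte1, byte2])

      def emit_write_low(index, text):
          """Encode a WRITE token for a codebook index from 0 to 127."""
          byte1 = 0x80 | (index & 0x7F)             # 1 + 7-bit index
          if 0 < len(text) < 32:
              # Short form: length1 carries the length directly.
              header = bytes([byte1, (0b111 << 5) | len(text)])
          else:
              # Long form: length1 = 0, length2 carries the length.
              header = bytes([byte1, (0b111 << 5), len(text) & 0xFF])
          return header + text

      # Example: read entry 5, then write "sip:" into entry 12.
      token_bytes = emit_read(5) + emit_write_low(12, b"sip:")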
5 Preliminary Results

   To validate our approach, we developed an encoding algorithm that makes
   use of the codebook protocol specified here. The algorithm is a variant
   of LZW. It uses information about the correlation of characters in
   previous messages to determine whether a new character sequence should
   be added to the codebook. It is not as efficient as LZSS with a large
   window, which we believe would be an ideal choice for SIP messages.
   However, unlike LZSS with a reasonably sized window, which can be
   computationally complex, our algorithm has a fairly fast run time (it
   is O(N) in the message size N).

   The algorithm assumes that uncompressed messages are composed of
   textual parameters separated by well-defined delimiters. For SIP, these
   delimiters are ; : , @ < > CRLF SP. The encoder breaks the message into
   pieces, where each piece contains either no delimiters or only
   delimiters. These pieces are the finest granularity of elements placed
   into the codebook (i.e., they are the atomic symbols of the input
   stream).

   Once the message is broken into pieces, the encoder finds the longest
   entry in the codebook which matches the leading pieces of the message
   (for example, if the message is composed of pieces A, B, and C, and the
   codebook has entries A, B, and AB, the encoder would find AB). If a
   match is not found, the text for the leading piece is added to the
   codebook and sent. If a match is found, the encoder then finds the
   longest match for the pieces that follow. If it discovers that these
   two matches have occurred in series in a previous message, it writes
   the combination into the codebook using the WRITE REFERENCE tokens.
   This requires the encoder to remember, for each entry in the codebook,
   which entries have occurred immediately before it in past messages.

   To speed up the longest match operation, the encoder uses a hash table
   of "hints". If a particular string is in the hash table, it means that
   there exists a codebook entry with that string as a prefix. As a
   result, to find the longest match, the encoder takes the leading piece,
   looks it up in the codebook, and notes whether there is a match. If the
   piece is found in the hints hash table, the next piece is appended and
   the process repeats, until the lookup fails in the hints hash table.

   The codebook itself is an LRU data store; entries not recently used are
   removed from the codebook. This simple algorithm gradually creates
   codebook entries that are the longest substrings typically found in
   messages. These frequently recurring substrings stay in the codebook,
   while message components that are never reused eventually fall to the
   bottom and are removed.
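   The following Python fragment sketches the piece-splitting and greedy
   longest-match steps just described. It is a simplified illustration
   under assumed data structures (a dictionary keyed by tuples of pieces);
   the hints hash table, the WRITE REFERENCE combining step, and LRU
   eviction are omitted, and all names are hypothetical.

      import re

      # Runs of delimiters and runs of non-delimiters, per the SIP
      # delimiter set listed above (; : , @ < > CRLF SP).
      _PIECE = re.compile(r"[;:,@<> \r\n]+|[^;:,@<> \r\n]+")

      def split_into_pieces(message):
          """Split a message into delimiter-only and delimiter-free pieces."""
          return _PIECE.findall(message)

      def encode(pieces, codebook):
          """Greedy longest match of leading pieces against the codebook.

          codebook maps a tuple of pieces to a codebook index. Returns a
          list of abstract ("READ", index) or ("WRITE", index, text)
          decisions.
          """
          out = []
          i = 0
          while i < len(pieces):
              # Longest run of pieces starting at i found in the codebook
              # (the hints table would avoid this linear scan).
              best = 0
              for j in range(i + 1, len(pieces) + 1):
                  if tuple(pieces[i:j]) in codebook:
                      best = j - i
              if best:
                  out.append(("READ", codebook[tuple(pieces[i:i + best])]))
                  i += best
              else:
                  # New atomic piece: send it with WRITE and add it to the
                  # codebook (index allocation simplified here).
                  index = len(codebook)
                  codebook[(pieces[i],)] = index
                  out.append(("WRITE", index, pieces[i]))
                  i += 1
          return out

   Once the codebook is warm, recurring substrings collapse to single READ
   tokens; this is the convergence visible in the results below.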
   We ran our encoder against a "friendly" test, which was a series of
   five INVITE requests. Each request is identical to the previous one,
   except that each has a different Call-ID, To URL and display name,
   Request URI, and RTP port number. The sizes of the uncompressed and
   compressed versions of these packets are shown in Table 1.

      Message   Uncompressed Size (bytes)   Compressed Size (bytes)
      ___________________________________________________________
         1                 405                       435
         2                 401                       158
         3                 401                        48
         4                 401                        48
         5                 401                        48

                     Table 1: Compression Results

   The results are pretty good; the algorithm quickly converges on an
   8.3:1 compression ratio.

6 Future Work

   The primary task is to define additional tokens that can enable a wider
   variety of compression algorithms. For example, if we add a token that
   allows the decoder to read N consecutive codebook entries starting from
   a specified index, we should be able to use our protocol to support the
   LZSS algorithm used by Hannu [10]. If a token is defined which allows
   the codebook to be written without also using the text as part of the
   decompressed message, the exact LZW algorithm could be used rather than
   our modified version.

7 Security Considerations

   To our knowledge, this protocol does not introduce any additional
   security considerations. However, this merits further investigation.

8 Author's Address

   Jonathan Rosenberg
   dynamicsoft
   72 Eagle Rock Avenue
   First Floor
   East Hanover, NJ 07936
   email: jdrosen@dynamicsoft.com

9 Bibliography

   [1] M. Handley, H. Schulzrinne, E. Schooler, and J. Rosenberg, "SIP:
       session initiation protocol," Request for Comments 2543, Internet
       Engineering Task Force, Mar. 1999.

   [2] H. Schulzrinne, A. Rao, and R. Lanphier, "Real time streaming
       protocol (RTSP)," Request for Comments 2326, Internet Engineering
       Task Force, Apr. 1998.

   [3] M. Arango, A. Dugan, I. Elliott, C. Huitema, and S. Pickett, "Media
       gateway control protocol (MGCP) version 1.0," Request for Comments
       2705, Internet Engineering Task Force, Oct. 1999.

   [4] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: a
       transport protocol for real-time applications," Request for
       Comments 1889, Internet Engineering Task Force, Jan. 1996.

   [5] S. Casner and V. Jacobson, "Compressing IP/UDP/RTP headers for
       low-speed serial links," Request for Comments 2508, Internet
       Engineering Task Force, Feb. 1999.

   [6] C. Bormann et al., "RObust header compression (ROHC)," Internet
       Draft, Internet Engineering Task Force, Feb. 2001. Work in
       progress.

   [7] J. Rosenberg and H. Schulzrinne, "Guidelines for authors of SIP
       extensions," Internet Draft, Internet Engineering Task Force, Mar.
       2001. Work in progress.

   [8] H. Schulzrinne and J. Rosenberg, "SIP: Session initiation protocol
       -- locating SIP servers," Internet Draft, Internet Engineering Task
       Force, Mar. 2001. Work in progress.

   [9] H. Hannu, J. Christoffersson, and K. Svanbro, "Application
       signaling over cellular links," Internet Draft, Internet
       Engineering Task Force, Mar. 2001. Work in progress.

   [10] H. Hannu, J. Christoffersson, and K. Svanbro, "RObust GEneric
        message size reduction (ROGER)," Internet Draft, Internet
        Engineering Task Force, Feb. 2001. Work in progress.

   [11] Z. Liu, K. Le, K. Leung, and C. Clanton, "Scalable robust
        efficient dictionary-based compression (SCRIBE)," Internet Draft,
        Internet Engineering Task Force, Mar. 2001. Work in progress.