Internet Engineering Task Force                                ROHC/SIP WG
Internet Draft                                                   Rosenberg
draft-rosenberg-rohc-sip-udpcomp-00.txt                        dynamicsoft
                                                             July 13, 2001
                                                    Expires: February 2002


                            Compression of SIP


STATUS OF THIS MEMO

   This document is an Internet-Draft and is in full conformance with all
   provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering Task
   Force (IETF), its areas, and its working groups. Note that other groups
   may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference material
   or to cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   To view the list of Internet-Draft Shadow Directories, see
   http://www.ietf.org/shadow.html.

Abstract

   The Session Initiation Protocol (SIP), like many other IP protocols
   used for multimedia communications, is a UDP-based, textual protocol
   engineered for bandwidth-rich links. As a result, its messages have not
   been optimized for size. With the planned use of these protocols in
   wireless handsets as part of 2.5G and 3G wireless networks, the large
   size of these messages is problematic, primarily for latency reasons.
   We therefore propose a protocol shim between SIP (or any other
   UDP-based protocol meeting our requirements) and UDP in order to
   efficiently compress SIP messages. Preliminary results have shown
   compression gains of up to 8:1.

1 Introduction

   The Session Initiation Protocol (SIP) [1], like many other IP protocols
   used for multimedia communications (such as RTSP [2] and MGCP [3]), is
   a UDP-based, textual protocol engineered for bandwidth-rich links. As a
   result, its messages have not been optimized for size. Typical SIP
   messages range from a few hundred bytes to as many as 2000.

   To date, this has not been a significant problem. The media established
   by SIP (usually carried over RTP [4]) uses a much higher percentage of
   the overall link bandwidth to the host. Therefore, Amdahl's law tells
   us that SIP compression is not as important as RTP compression, which
   is already handled by mechanisms like CRTP [5] and ROHC [6].

   With the planned use of these protocols in wireless handsets as part of
   2.5G and 3G wireless networks, the large size of these messages is
   problematic. The problem is not bandwidth efficiency (the media stream
   still consumes the majority of the bandwidth), but rather latency. With
   low-rate IP connectivity, store-and-forward delays are significant. For
   CDMA2000, for example, the basic channel will support only 9.6 kbps. At
   this rate, transmission of each byte requires roughly 0.8 ms, so a 500
   byte SIP message requires nearly half a second to transmit. Taking into
   account retransmissions, and the multiplicity of messages required in
   some flows, call setup and feature invocation are adversely affected.
   Therefore, we believe there is merit in investigating reductions in
   message size.

   The remainder of this document is organized as follows. We first
   present our basic design approach and goals, comparing them to existing
   work in this area. We then describe the protocol and its message
   formats. Finally, we present some preliminary results obtained with a
   modified LZW algorithm.

2 Design Approach

   Achieving message size reduction can be done in many ways.
   Some have proposed defining binary encodings for SIP. We believe this
   is a serious mistake: it introduces substantial interoperability
   problems and pushes the burden of accommodating wireless handsets onto
   all SIP devices. Others have proposed specialty compression algorithms
   that are SIP-specific. We believe this is also a mistake; as SIP
   evolves and changes, or as new protocols emerge, such mechanisms may no
   longer function. As a result, we strongly believe that the only
   acceptable solution is to perform compression at lower layers, and this
   has been agreed upon by the SIP working group as part of its guidelines
   [7].

   Our proposal is to define a protocol shim between SIP and UDP.
   Effectively, this can be seen as an alternate transport, on par with
   UDP, TCP, and SCTP, even though it runs over UDP. This approach (as
   opposed to defining a new transport, or doing link-layer compression)
   has many advantages:

   o It allows mechanisms like ROHC to be applied to the IP/UDP headers to
     compress them as well.

   o It allows SIP's transport negotiation mechanisms, based on SRV
     records [8], to be applied. This brings backwards compatibility, and
     means that our transport can be used anywhere between two SIP
     entities.

   o It doesn't rely on any underlying link layer, and therefore can be
     used anywhere IP connectivity exists.

   Because the mechanism is a shim between application layer protocols,
   like SIP, and UDP, it is important that it be lossless and that it work
   with a large class of protocols. Indeed, it should not even rely on a
   textual encoding, but should work with binary protocols as well. We
   also believe that it should function without enforcing ordered delivery
   of messages. Many protocols that use UDP choose it precisely because it
   does not require ordered delivery; if the shim imposes ordering, it
   will need to reorder messages and will therefore introduce significant
   latencies. These statements are all consistent with the requirements
   outlined by Hannu et al. [9] for signaling layer compression.

   Indeed, our protocol is similar to the ROGER protocol proposed by Hannu
   et al. [10] and the SCRIBE protocol described by Liu et al. [11]. Our
   focus is on codebook management, which differs significantly from
   SCRIBE and ROGER. Hannu points out that the best way to handle
   signaling layer protocol compression is to generate a codebook based on
   the previous messages sent and received. Hannu proposes that this can
   be done by appending entire messages to the codebook, or just strings
   from prior messages. ROGER uses a codebook equal to the set of prior
   received messages, and then runs LZSS over them. While this is likely
   to yield excellent compression results, it does so at the expense of a
   fairly large codebook. SCRIBE attempts to address this by allowing the
   encoder to delete specific portions of the codebook. While this is a
   good start, we think there is a strong need to handle codebook
   management with unlimited flexibility.

   We believe that devices may need to significantly restrict the codebook
   size because of memory limitations. Once that is done, the trick in
   compression is the selection of the codevectors in the codebook. We
   have observed that there are many ways to choose which elements
   constitute the codebook, and this choice significantly affects the
   performance of the algorithm. Once a mechanism is chosen for
   determining what goes into the codebook, the process of compression is
   to choose codevectors from the codebook and send references to them.
   For this reason, we propose that the problem of message compression be
   separated into two problems: a PROTOCOL for reliably transmitting
   codebook entries and messages encoded using that codebook, and an
   ALGORITHM for deciding which elements are placed into the codebook and
   which codevectors are used to construct a message. Given a general
   purpose protocol, the choice of algorithm is a local implementation
   decision by the encoder. This results in significant flexibility.
   Manufacturers of devices can differentiate themselves based on the
   effectiveness of their compression mechanisms. Devices with small
   memory footprints but significant processing power can use compression
   techniques suited to those constraints, while devices with limited
   processing power but plenty of memory can use different mechanisms.

3 Protocol Operation

   The operation of the protocol is fairly straightforward. It assumes
   that there is a sender A and a receiver B. A compression context is
   associated with a series of compressed messages from A (identified by
   its IP address and port) to B (also identified by its IP address and
   port). There is no relationship between compression contexts from A to
   B and from B to A; that is, contexts are unidirectional. We do this, in
   part, because SIP does not send responses back to the source port of
   the request, which would complicate context identification.

   When a message is to be sent, the encoder looks up the source IP
   address/port and destination IP address/port, and obtains a list of
   valid contexts. It then chooses a context. That choice can be made in
   many ways. We assume that the application will provide an indication of
   the appropriate context to the compressed-UDP transmission layer, since
   the application is in the best position to make that determination. If
   no context exists, one is created. It is established with an empty
   codebook (use of static codebooks is possible as well, to support
   unidirectional links, but is not discussed here).

   Through the protocol, the encoder is able to request that the decoder
   write entries into the codebook at specified indices, and also use that
   data to construct a piece of the decompressed message. The encoder can
   also instruct the decoder to construct a piece of the decompressed
   message by reading from specified indices. The encoder is able to
   overwrite entries that already exist. This allows it to use a codebook
   of whatever size it chooses. It is even possible to negotiate an
   agreed-upon size between encoder and decoder as part of context
   initialization, but that is not specified here.

   The encoded message contains the compression context identifier and a
   message sequence number. What follows is a series of tokens that
   comprise the message. The tokens are effectively instructions for the
   decoder on management of the codebook and construction of the
   decompressed message. There are several token types:

   READ: Read the entry in the codebook at the specified index, and
      append it to the decompressed message.

   WRITE: Take the text which follows, append it to the decompressed
      message, and also write it into the codebook at the specified
      index.

   WRITE-REFERENCED: Read N entries from the codebook at the following N
      indices, concatenate their text, and append it to the decompressed
      message. Also write that text into the codebook at the specified
      index.

   The encoder and the decoder both maintain the codebook.
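   To illustrate the token semantics, the following Python sketch shows
   how a decoder might process a list of tokens. The tuple representation
   of tokens and all names here are illustrative assumptions; the actual
   wire format is defined in Section 4, and this is not a normative
   implementation.

      # Minimal decoder sketch for the three token types. Tokens are
      # represented abstractly as tuples; the wire encoding is given in
      # Section 4. Codebook entries are byte strings keyed by index.
      def decompress(tokens, codebook):
          """Return (decompressed_message, pending_writes).

          pending_writes is the list of (index, text) codebook writes
          requested by this message; per Section 3 they are cached and
          applied later rather than applied immediately.
          """
          output = []
          writes = []
          for token in tokens:
              kind = token[0]
              if kind == "READ":                  # ("READ", index)
                  _, index = token
                  output.append(codebook[index])
              elif kind == "WRITE":               # ("WRITE", index, text)
                  _, index, text = token
                  output.append(text)
                  writes.append((index, text))
              elif kind == "WRITE_REFERENCED":    # ("WRITE_REFERENCED", index, [i1, ...])
                  _, index, refs = token
                  text = b"".join(codebook[i] for i in refs)
                  output.append(text)
                  writes.append((index, text))
              else:
                  raise ValueError("unknown token type: %r" % kind)
          return b"".join(output), writes

   Whether a message may read an entry that it has itself written is not
   addressed by this sketch; the writes are simply collected for the
   caching step described below.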
   At the encoder, each codebook entry at a given index contains not only
   some decompressed text, but also a sequence number and a clean bit.
   Whenever the encoder performs a write into the codebook, the clean bit
   is set to dirty and the sequence number is set to the sequence number
   of the message. That particular codebook entry cannot be used by the
   encoder for messages with higher sequence numbers until the encoder
   learns that the message has been received at the decoder. However, the
   encoder MAY WRITE into a codebook entry marked dirty (it cannot,
   though, issue a WRITE-REFERENCED that reads from a dirty index). When
   it does so, the clean bit continues to indicate dirty, but the sequence
   number is updated to that of the current message.

   Rather than building a message acknowledgement mechanism into the
   protocol, as done by ROGER and SCRIBE, we recognize that most
   application layer protocols based on UDP already incorporate their own
   reliability mechanisms. As such, when the sending application learns,
   in some fashion, that a particular message has been received, it
   indicates to the UDP compression layer that the message was processed.
   At that point, the encoder finds all codebook entries with the
   confirmed sequence number and sets their clean bits to indicate clean.

   At the decoder, the codebook contains just the text entries. The
   decoder also maintains a cache of codebook writes, and the highest
   sequence number such that all messages with sequence numbers up to and
   including that number have been received. Call this the "top counter".
   When a message is received, it is immediately decoded. Any codebook
   writes requested by the message are placed into the cache. The top
   counter is then updated. The decoder then scans the cache and removes
   any entries with sequence numbers less than or equal to the top
   counter. Those writes are then applied to the codebook in sequence
   number order.

   It can be shown that with this combination of rules, messages can be
   compressed and decompressed without requiring in-order delivery at the
   receiver. Furthermore, received messages do not need to be acknowledged
   before the next one is sent. Significant latencies, or significant
   out-of-order delivery, will result in a large number of entries marked
   dirty at the encoder. This will result in compression inefficiencies.
   In the worst case, where the entire encoder codebook is dirty, the
   encoder can send every message as a sequence of WRITE tokens, which
   results in no compression at all.

   Consider an example. A is the encoder and B is the decoder. They both
   have 100 entries in the codebook, and all entries are clean at the
   encoder. A sends message 1 to B, and does so by writing entries 1-10
   into the codebook. Entries 21-100 already have values, which are read
   as part of the compression process but not written. A then sends
   another message, message 2, to B. It cannot read codebook entries 1-10,
   but it can read entries 21-100. So, it writes into codebook entries
   11-30 and makes use of reads from entries 61-100.

   Message 2 arrives at the decoder first. Since message 2 does not rely
   on any codebook entries written by message 1, the decoder is able to
   decode it and pass it up to the higher layers. The writes for entries
   11-30 are placed into the cache. The top counter is zero, so those
   writes are not applied. Message 1 then arrives. It makes use of data
   carried in the message itself, and of data read from entries 21-100.
   Since entries 21-30 have not yet been updated by message 2 (those
   writes are still in the cache), the contents of these entries are
   correct for decoding message 1. The writes for entries 1-10 are placed
   into the cache. The top counter is now updated to 2, so the decoder
   extracts all of the entries from the cache and applies them in order.
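   The decoder-side bookkeeping just described can be sketched as follows
   in Python, reusing the decompress() helper from the earlier sketch.
   This is an illustrative sketch under assumed data structures; sequence
   number wraparound (Section 4) is ignored here for clarity.

      # Decoder context: writes are cached per message and applied only
      # once every message with a lower or equal sequence number has
      # arrived (the "top counter").
      class DecoderContext:
          def __init__(self):
              self.codebook = {}     # index -> text
              self.cache = {}        # sequence number -> list of (index, text)
              self.received = set()  # sequence numbers seen so far
              self.top = -1          # highest SN with no gaps below it

          def on_message(self, seq, tokens):
              # Decode immediately against the current codebook contents.
              message, writes = decompress(tokens, self.codebook)
              self.cache[seq] = writes
              self.received.add(seq)

              # Advance the top counter across any newly contiguous range.
              while (self.top + 1) in self.received:
                  self.top += 1

              # Apply cached writes at or below the top counter, in
              # sequence number order.
              for s in sorted(s for s in self.cache if s <= self.top):
                  for index, text in self.cache.pop(s):
                      self.codebook[index] = text

              return message

   Applied to the example above: message 2's writes to entries 11-30
   remain cached until message 1 arrives and the top counter advances past
   both messages, at which point the writes from both are applied in
   sequence number order.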
4 Message Details

   Each message has the following format:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Context    |              SN               |   CRC-high    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    CRC-low    |  tokens......
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Context is an 8-bit context identifier.

   SN is a 16-bit sequence number. It begins at 0 and wraps around after
   65535. Sequence number comparisons MUST take this rollover into
   account.

   The CRC is a 16-bit CRC computed over the message before encoding. It
   is used to verify that the decode process was successful (the format of
   the CRC is TBD).

   What follows is a sequence of tokens. There are six token types: two
   READ, two WRITE, and two WRITE-REFERENCE. For each operation, the two
   types differ in which codebook entries they can access: entries 0-127
   are accessed with one type, and entries 128-4223 with the other.

4.1 READ LOW

   The READ LOW token reads a codebook entry at an index from 0 to 127.
   Its format is:

   +-+-+-+-+-+-+-+-+
   |0|    index    |
   +-+-+-+-+-+-+-+-+

   index is a 7-bit index into the codebook.

4.2 READ HIGH

   The READ HIGH token reads a codebook entry at an index from 128 to
   4223. Its format is:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|    index    |0|0|0|  index  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The two bytes provide 12 bits of index. The codebook entry accessed is
   equal to 128 plus the value of the index.

4.3 WRITE

   The WRITE token writes a codebook entry whose index is from 0 to 127.
   The text to be written follows the token header, and consists of a
   series of bytes with a given length. The token header is a minimum of
   two bytes. When the value of length1 is zero, another byte follows
   (length2), and the value of length2 is the actual length of the text.
   This allows smaller text to be written with a two-byte token overhead,
   while larger text requires three bytes.

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|    index    |1|1|1| length1 | entry...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   or

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|    index    |1|1|1|0|0|0|0|0|    length2    | entry...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.4 WRITE HIGH

   The WRITE HIGH token writes a codebook entry whose index is from 128 to
   4223. The text to be written follows, and consists of a series of bytes
   with a given length. The token header is always three bytes long.

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|    index    |0|1|1|  index  |    length     | entry...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.5 WRITE REFERENCE

   The WRITE REFERENCE token writes a codebook entry whose index is from 0
   to 127. Rather than the text being included, as in the WRITE and WRITE
   HIGH tokens, the text to place into the codebook is read as a series of
   existing entries in the codebook. Those entries are concatenated
   together to form the text that is written. The token consists of an
   index to write into, followed by a token counter (TC) that indicates
   how many tokens follow. Since the token counter is 5 bits, at most 32
   tokens can be referenced. After the TC field comes a series of READ or
   READ HIGH tokens that identify the data to read from the codebook.

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|    index    |1|1|0|    TC   | tokens..
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.6 WRITE HIGH REFERENCE

   The WRITE HIGH REFERENCE token is identical to the WRITE REFERENCE
   token, except that it writes entries into codebook indices from 128 to
   4223.

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|    index    |0|1|1|  index  |x|x|x|    TC   | tokens
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
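   To make the bit layouts concrete, the following Python fragment
   serializes READ and (low-index) WRITE tokens according to the formats
   above. It reflects one reading of the length1/length2 rule and is an
   illustrative sketch, not a normative encoder; texts longer than 255
   bytes are not handled.

      def emit_read(index):
          """Encode a READ LOW (index 0-127) or READ HIGH (128-4223) token."""
          if index < 128:
              return bytes([index & 0x7F])          # 0 + 7-bit index
          value = index - 128                       # 12-bit index value
          byte1 = 0x80 | ((value >> 5) & 0x7F)      # 1 + upper 7 bits of index
          byte2 = (0b000 << 5) | (value & 0x1F)     # type 000 + lower 5 bits
          return bytes([byte1, byte2])

      def emit_write_low(index, text):
          """Encode a WRITE token for a codebook index from 0 to 127."""
          byte1 = 0x80 | (index & 0x7F)             # 1 + 7-bit index
          if 0 < len(text) < 32:
              # Short form: length1 carries the length directly.
              header = bytes([byte1, (0b111 << 5) | len(text)])
          else:
              # Long form: length1 = 0, length2 carries the length.
              header = bytes([byte1, (0b111 << 5), len(text) & 0xFF])
          return header + text

      # Example: read entry 5, then write "sip:" into entry 12.
      token_bytes = emit_read(5) + emit_write_low(12, b"sip:")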
5 Preliminary Results

   To validate our approach, we developed an encoding algorithm that makes
   use of the codebook protocol specified here. The algorithm is a variant
   of LZW. It uses information about the correlation of characters in
   previous messages to determine whether a new character sequence should
   be added to the codebook. It is not as efficient as LZSS with a large
   window, which we believe would be an ideal choice for SIP messages.
   However, unlike LZSS with a reasonably sized window, which can be
   computationally complex, our algorithm has a fairly fast run time (it
   is O(N) in the message size N).

   The algorithm assumes that uncompressed messages are composed of
   textual parameters separated by well-defined delimiters. For SIP, these
   delimiters are ; : , @ < > CRLF SP. The encoder breaks the message into
   pieces, where each piece contains either no delimiters or only
   delimiters. These pieces are the finest granularity of elements placed
   into the codebook (i.e., they are the atomic symbols of the input
   stream).

   Once the message is broken into pieces, the encoder finds the longest
   entry in the codebook which matches the leading pieces of the message
   (for example, if the message is composed of pieces A, B, and C, and the
   codebook has entries A, B, and AB, the encoder would find AB). If a
   match is not found, the text for the leading piece is added to the
   codebook and sent. If a match is found, the encoder then finds the
   longest match for the pieces that follow. If it discovers that these
   two matches have occurred in series in a previous message, it writes
   the combination into the codebook using the WRITE REFERENCE tokens.
   This requires the encoder to remember, for each entry in the codebook,
   which entries have occurred immediately before it in past messages.

   To speed up the longest match operation, the encoder uses a hash table
   of "hints". If a particular string is in the hash table, it means that
   there exists a codebook entry with that string as a prefix. As a
   result, to find the longest match, the encoder takes the leading piece,
   looks it up in the codebook, and notes whether there is a match. If the
   piece is found in the hints hash table, the next piece is appended and
   the process repeats, until the lookup fails in the hints hash table.

   The codebook itself is an LRU data store; entries not recently used are
   removed from the codebook. This simple algorithm gradually creates
   codebook entries that are the longest substrings typically found in
   messages. These frequently recurring substrings stay in the codebook,
   while message components that are never reused eventually fall to the
   bottom and are removed.
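   The following Python fragment sketches the piece-splitting and greedy
   longest-match steps just described. It is a simplified illustration
   under assumed data structures (a dictionary keyed by tuples of pieces);
   the hints hash table, the WRITE REFERENCE combining step, and LRU
   eviction are omitted, and all names are hypothetical.

      import re

      # Runs of delimiters and runs of non-delimiters, per the SIP
      # delimiter set listed above (; : , @ < > CRLF SP).
      _PIECE = re.compile(r"[;:,@<> \r\n]+|[^;:,@<> \r\n]+")

      def split_into_pieces(message):
          """Split a message into delimiter-only and delimiter-free pieces."""
          return _PIECE.findall(message)

      def encode(pieces, codebook):
          """Greedy longest match of leading pieces against the codebook.

          codebook maps a tuple of pieces to a codebook index. Returns a
          list of abstract ("READ", index) or ("WRITE", index, text)
          decisions.
          """
          out = []
          i = 0
          while i < len(pieces):
              # Longest run of pieces starting at i found in the codebook
              # (the hints table would avoid this linear scan).
              best = 0
              for j in range(i + 1, len(pieces) + 1):
                  if tuple(pieces[i:j]) in codebook:
                      best = j - i
              if best:
                  out.append(("READ", codebook[tuple(pieces[i:i + best])]))
                  i += best
              else:
                  # New atomic piece: send it with WRITE and add it to the
                  # codebook (index allocation simplified here).
                  index = len(codebook)
                  codebook[(pieces[i],)] = index
                  out.append(("WRITE", index, pieces[i]))
                  i += 1
          return out

   Once the codebook is warm, recurring substrings collapse to single READ
   tokens; this is the convergence visible in the results below.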
   We ran our encoder against a "friendly" test, which was a series of
   five INVITE requests. Each request is identical to the previous one,
   except that each has a different Call-ID, To URL and display name,
   Request URI, and RTP port number. The sizes of the uncompressed and
   compressed versions of these packets are shown in Table 1.

      Message   Uncompressed Size (bytes)   Compressed Size (bytes)
      ___________________________________________________________
         1                 405                       435
         2                 401                       158
         3                 401                        48
         4                 401                        48
         5                 401                        48

                     Table 1: Compression Results

   The results are pretty good; the algorithm quickly converges on an
   8.3:1 compression ratio.

6 Future Work

   The primary task is to define additional tokens that can enable a wider
   variety of compression algorithms. For example, if we add a token that
   allows the decoder to read N consecutive codebook entries starting from
   a specified index, we should be able to use our protocol to support the
   LZSS algorithm used by Hannu [10]. If a token is defined which allows
   the codebook to be written without also using the text as part of the
   decompressed message, the exact LZW algorithm could be used rather than
   our modified version.

7 Security Considerations

   To our knowledge, this protocol does not introduce any additional
   security considerations. However, this merits further investigation.

8 Author's Address

   Jonathan Rosenberg
   dynamicsoft
   72 Eagle Rock Avenue
   First Floor
   East Hanover, NJ 07936
   email: jdrosen@dynamicsoft.com

9 Bibliography

   [1] M. Handley, H. Schulzrinne, E. Schooler, and J. Rosenberg, "SIP:
       session initiation protocol," Request for Comments 2543, Internet
       Engineering Task Force, Mar. 1999.

   [2] H. Schulzrinne, A. Rao, and R. Lanphier, "Real time streaming
       protocol (RTSP)," Request for Comments 2326, Internet Engineering
       Task Force, Apr. 1998.

   [3] M. Arango, A. Dugan, I. Elliott, C. Huitema, and S. Pickett, "Media
       gateway control protocol (MGCP) version 1.0," Request for Comments
       2705, Internet Engineering Task Force, Oct. 1999.

   [4] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: a
       transport protocol for real-time applications," Request for
       Comments 1889, Internet Engineering Task Force, Jan. 1996.

   [5] S. Casner and V. Jacobson, "Compressing IP/UDP/RTP headers for
       low-speed serial links," Request for Comments 2508, Internet
       Engineering Task Force, Feb. 1999.

   [6] C. Bormann et al., "RObust header compression (ROHC)," Internet
       Draft, Internet Engineering Task Force, Feb. 2001. Work in
       progress.

   [7] J. Rosenberg and H. Schulzrinne, "Guidelines for authors of SIP
       extensions," Internet Draft, Internet Engineering Task Force, Mar.
       2001. Work in progress.

   [8] H. Schulzrinne and J. Rosenberg, "SIP: Session initiation protocol
       -- locating SIP servers," Internet Draft, Internet Engineering Task
       Force, Mar. 2001. Work in progress.

   [9] H. Hannu, J. Christoffersson, and K. Svanbro, "Application
       signaling over cellular links," Internet Draft, Internet
       Engineering Task Force, Mar. 2001. Work in progress.

   [10] H. Hannu, J. Christoffersson, and K. Svanbro, "RObust GEneric
        message size reduction (ROGER)," Internet Draft, Internet
        Engineering Task Force, Feb. 2001. Work in progress.

   [11] Z. Liu, K. Le, K. Leung, and C. Clanton, "Scalable robust
        efficient dictionary-based compression (SCRIBE)," Internet Draft,
        Internet Engineering Task Force, Mar. 2001. Work in progress.