Internet DRAFT - draft-wang-rsvp-state-compression

draft-wang-rsvp-state-compression









INTERNET-DRAFT                                                  Lan Wang
Expires: September 2000                                   Andreas Terzis
<draft-wang-rsvp-state-compression-03.txt>                   Lixia Zhang
                                                                    UCLA
                                                              March 2000

           RSVP Refresh Overhead Reduction by State Compression

                <draft-wang-rsvp-state-compression-03.txt>


Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups.  Note that other groups
may also distribute working documents as Internet- Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.




Abstract

As a soft-state protocol, RSVP specifies that each RSVP node sends peri-
odic RSVP messages for each existing RSVP session.  The overhead due to
such periodic state refreshes goes up linearly with the number of active
RSVP sessions.  One can reduce the overhead by using a longer refresh
period, which unfortunately leads to longer delays in re-synchronizing
RSVP state in face of packet losses.  To overcome the dilemma, this
specification describes the use of a "state-compression" approach to
soft-state protocol refreshes.  In the absence of state changes this
compression mechanism has a constant message overhead while retaining
the soft-state nature of RSVP.





draft-wang-rsvp-state-compression-03.txt                        [Page 1]





INTERNET-DRAFT                                               March 2000


Changes

Below is a list of changes from the previous version of this draft
<draft-wang-rsvp-state-compression-02.txt>.

   - Paragraphs that describe the MessageID extension are revised to be
     consistent with draft-ietf-rsvp-refresh-reduct-03.txt where the
     extension is defined.

   - Section 9 is added to compare our state compression scheme with
     other proposals on RSVP refresh overhead reduction.


1.  Introduction

As a soft-state setup protocol, RSVP specifies that each RSVP node sends
control messages for each active RSVP session to each of its neighbors
serving the same session.  These control messages are sent immediately
whenever a change of state occurs, for example when a new session
starts, when the TSPEC or RSPEC parameters of an existing session
change, or when a route change occurs that affects the paths of existing
RSVP sessions.  In addition, an RSVP node also sends periodic refresh
messages to its neighbors about all existing sessions.  Such periodic
refreshes serve the purpose of keeping consistent RSVP reservation state
along data flow paths.  There are potentially a number of causes that
can make the reservation state between neighbor RSVP nodes out-of-syn-
chronization.  For example:

   - An RSVP message that carries state-change information gets lost.
   - A rare local event inadvertently changed RSVP state values.
   - Other locally induced state changes occur, for example when the
     current Kerberos key expires, a new key is obtained which needs to
     be carried to the next hop by the POLICY_DATA object in the next
     refresh.

Instead of attempting to identify all the potential causes of changes
and handle them individually, RSVP simply uses periodic refreshes to
persistently re-synchronize the reservation state along the path.  This
approach assures consistent reservation state along the data flow path,
without the routers being informed of an exhaustive list of all the pos-
sible causes of state inconsistency or how to correct them individually.

The drawback of this simple and effective approach is the associated
protocol overhead which grows linearly with the number of active RSVP
sessions.  A common mechanism to reduce the periodic refresh overhead is
by using a longer refresh period, e.g. RTCP reports [1] and SDP
announcements [2].  A longer refresh period, however, leads to longer
delays in re-synchronizing state if some RSVP update messages get lost.



draft-wang-rsvp-state-compression-03.txt                        [Page 2]





INTERNET-DRAFT                                               March 2000


In this draft we propose a "state compression" mechanism for refresh
overhead reduction.  Our goal is to improve the efficiency of soft-state
protocols in their communication behavior.  Our design applies compres-
sion algorithms to the protocol state information so that the number of
refresh messages is reduced from being proportional to the number of
sessions to being proportional to the number of neighboring nodes, while
retaining the soft-state nature of RSVP.  Section 7 presents a detailed
overhead analysis of our compression mechanism.


2.  Refresh Overhead Reduction

Even in the absence of new control information generated by sources or
destinations, an RSVP node sends one message per refresh period to its
neighboring nodes for each of all the RSVP sessions.  One way to reduce
this refresh overhead is to compress the state of all RSVP sessions to a
single digest message, which can then be sent to neighboring nodes in
place of all the "raw" refresh messages.  All we need is an algorithm
that can produce a mapping from the given set of RSVP state to a digest
with a low probability of collision.

Two compression algorithms have been suggested so far, CRC-32 and MD5
algorithm [3].  Section 8 presents a brief comparison of the two algo-
rithms.  In the following discussion we assume either one can be used
for the purpose and simply refer to it as "the compression algorithm".
Whenever state inconsistency is detected (the digests of two neighbors
disagree), raw RSVP messages are transmitted to re-synchronize the
state.

Our state compression scheme is designed in accordance with the Mes-
sageID extension of RSVP described in [4]. The MessageID extension
enables nodes to request an ACK for each raw RSVP message that carries
state-change information.  Our design requires that every digest message
be acknowledged either by an ACK message when the neighbor node has a
matching digest value or otherwise by a DigestErr message. It is impor-
tant to note that the use of ACKs in the MessageID extension is to
assure quick delivery of time-sensitive information; RSVP still relies
on periodic refreshes to correct any potential state inconsistency that
may occur even when critical messages are acknowledged, for example
state inconsistency due to undetected bit errors, or due to RSVP state
changes made at local nodes.


2.1.  Definitions

- Input Message





draft-wang-rsvp-state-compression-03.txt                        [Page 3]





INTERNET-DRAFT                                               March 2000


     The input to the compression function.  In this document an input
     message contains some portion of the RSVP state stored at the local
     node.

- Signature

   The result of compression function on an input message.

- Digest

   A set of Signatures that represents a compressed version of the RSVP
   state shared between two neighboring RSVP nodes.


The following definitions are taken from RFC 2209.

- PSB

   Path State Block. Each PSB holds path state for a particular (ses-
   sion, sender) pair, defined by SESSION and SENDER_TEMPLATE objects,
   respectively.

-RSB

   Reservation State Block. Each RSB holds a reservation request that
   arrived in a particular RESV message, corresponding to the triple:
   (session, next hop, Filter_spec_list).



2.2.  Benefits and Constraints of the New Scheme

The design of the new scheme aims at the following features

   - efficient state re-synchronization when two RSVP neighbors discover
     state inconsistency.
   - choice of individual nodes to use either original RSVP refreshes or
     the compressed refresh with overhead reduction, or switch in
     between over time to achieve the best tradeoff.
   - backward compatibility with the current RSVP specification.
   - incremental digest computation when only part of the session(s)
     changes state.

We also recognize the necessary cost and constraints of the new scheme.
First, to assure RSVP state consistency over reserved paths, the digest
must be computed over the RSVP session state rather than over the mes-
sages that are exchanged.  Those messages represent partial, but not all
causes of state changes, two neighbor nodes may end up in inconsistent



draft-wang-rsvp-state-compression-03.txt                        [Page 4]





INTERNET-DRAFT                                               March 2000


state even when RSVP messages are all acknowledged.

Secondly, to reduce the refresh overhead to one refresh message per
refresh period per neighbor, a digest must represent the compressed
state of aggregate sessions.  Thus an RSVP node must first group all
RSVP sessions by each neighbor, compute the digest, and then send to the
corresponding neighbor node.  On the other hand, the paths of individual
RSVP sessions may change over time.  In the current design, RSVP PATH
messages carry the destination address of the session, thus all refresh
messages automatically follow the latest unicast or multicast routing
changes (including multicast group membership changes).  Although RSVP
has an interface to the local routing module for prompt notification of
route changes to minimize the adaptation delay, all RSVP reservations
eventually converge on new paths even in the absence of such notifica-
tions.

When routers stop sending individual PATH refresh messages, however,
RSVP loses its ability to automatically adjust reservations according to
route changes.  The proposed scheme relies on explicit notifications to
any routing changes.  We consider this the major constraint of the new
scheme.

When two neighbor nodes of a unicast RSVP session are directly con-
nected, RSVP can expect to receive notification of route changes through
the interface to the local routing module.  When the two nodes are
interconnected through a non-RSVP cloud, however, there seems no known
way to reliably detect all routing changes without the per session peri-
odic refreshes.  Multicast sessions present a similar problem, namely
that the list of neighbors for a given session may not be known and may
change dynamically as receivers leave and join the session.  Further
complications associated with multicast sessions also arise when a
router is attached to a broadcast LAN.  For example there can be both
compression-capable and compression-incapable downstream neighbors on
the LAN, in which case RSVP must fall back to per session periodic
refreshes.  Furthermore, it remains an open issue whether a router can
reliably detect all changes in the downstream neighbors when a down-
stream router on the broadcast LAN joins or leaves a group without
affecting the list of outgoing interfaces of the associated RSVP state.

Therefore we limit the problem space to handling unicast sessions whose
paths do not cross non-RSVP clouds.  An RSVP node resorts to the current
refresh scheme for all sessions that have the "non_RSVP_cloud" bit set,
or that have a multicast destination address.








draft-wang-rsvp-state-compression-03.txt                        [Page 5]





INTERNET-DRAFT                                               March 2000


3.  Digest Mechanism Description


3.1.  Session Signatures

RSVP maintains path state and reservation state for each active session
[5].  A session is uniquely identified by a session object which con-
tains the IP destination address, protocol ID and optionally a destina-
tion port number of the associated data packets.  A Path State Block
(PSB) is comprised of a sender template (i.e. IP address and port number
of the sender), a Tspec that describes the sender's traffic characteris-
tics, and possibly objects for policy control and advertisements.  A
Reservation State Block (RSB) contains filterspecs (i.e. sender tem-
plates) of the senders for which the reservation is intended, the reser-
vation style and a flowspec that quantifies the reservation.  It may
also contain objects for policy control and confirmation.

As the first step in computing a digest, an RSVP node computes a signa-
ture for each session.  Table 1 shows the objects that should be
included in the digest computation.  These objects must be fed into the
computation procedure in the order as presented in the table.  Although
PSBs and RSBs also contain a few other fields such as incoming interface
and outgoing interfaces, those fields have local meaning and may have
different values at different nodes.  Therefore they must be excluded
from the digest computation.


  RSVP Structure ||    Objects to Include
_________________||_______________________________________
                 ||
     Session     ||  session object
_________________||_______________________________________
                 ||
       PSB       ||  sender template, sender tspec,
                 ||  adspec, policy
_________________||_______________________________________
                 ||
       RSB       ||  filterspec, flowspec, style, policy
_________________||_______________________________________

          Table 1. Objects to Include in Digest Computation



3.2.  State Organization for Digest Computation

As a second step to efficient refresh exchanges, an RSVP node organizes
all the sessions shared with each neighbor node in a hash table.



draft-wang-rsvp-state-compression-03.txt                        [Page 6]





INTERNET-DRAFT                                               March 2000


Assuming the hash table has M slots, each RSVP session is hashed into
one of the M slots.  The hashing is done over some fixed session fields
(e.g the session ID).  If multiple sessions hash to the same slot, they
create an ordered linked list, following the increasing order of IP
address, protocol ID and port number in session objects.

Figure 1 shows the hash table created for each neighbor. In this Figure,
slot i contains a pointer to the head of the linked list of all RSVP
sessions that hash to i.  Assuming the total number of RSVP sessions is
T, the average length of this linked list will be T/M.  Slot i also con-
tains a Signature that is computed by applying the compression algorithm
over the concatenation of the signatures of all the sessions in the
linked list.  If the slot is empty, the signature is zero.  Whenever any
of the sessions in the list changes state, the signature of the slot
must also be recomputed.


              <--------T/M-------->
     +---+    +---+   +---+   +---+
     | 1 |--->|   |-->|   |-->|   |
     +---+    +---+   +---+   +---+
     | . |
     | . |
     | . |
     +---+    +---+   +---+
     | i |--->|   |-->|D_1|
     +---+    +---+   +---+
     | . |
     | . |
     | . |
     +---+
     | M |
     +---+

          Figure 1. Session Hash Table


As a third step towards efficient refresh exchanges, we propose to com-
pute the digest in a structured way.  To illustrate this need, suppose
there are 100,000 sessions and we can transmit 80 signatures in each
digest message.  With a hash table size (M) of 4,000 (25 sessions in
each slot on average), if we periodically exchange all the 4,000 signa-
tures(one for each slot), the overhead will be 50 digest messages per
refresh period.  When session state changes relatively infrequently, it
would be desirable if we can further reduce this refresh overhead.
Therefore, we build a N-ary tree on top of the hash table to compress
the M signatures to no more than N signatures, where N is the number of
signatures that can fit inside a single IP packet.  This set of



draft-wang-rsvp-state-compression-03.txt                        [Page 7]





INTERNET-DRAFT                                               March 2000


signatures is called a digest.  Whenever two neighbor nodes disagree on
the digest value, they can walk down specific branches of the N-ary tree
to locate the portion of inconsistent state in an efficient way.  The
digest computation tree is shown in Figure 2.


     ^         o     o     o          digest
     |        /|\   /|\   /|\
     h   y_1 o o o o o o o o o       level-1 signature
     |      /|\  .........  /|\
     v     o o o ..........o o o     level-0 signatures
         x_1...x_N

          Figure 2. A Digest Tree with N=3


A node constructs the digest computation tree in the following way. The
leaves of the tree are the signatures stored in the slots of the hash
table.  The signatures of N slots are concatenated and the compression
function is applied on the compound message. The result is stored at the
parent node on the tree.  Looking at Figure 2, Signatures x_1, ..., x_N
are concatenated and the result of the compression function is stored in
node y_1.  This grouping results in Ceiling[M/N] level-1 signatures. If
the number of level-1 signatures is still larger than N, the node con-
tinues on to group level-1 signatures to compute Ceiling[M/N^2] level-2
signatures.  If C_i is the number of level-i signatures, we repeat the
grouping until C_i is smaller or equal to N. The top level signatures
represent the digest of the total RSVP state.  In our previous example N
is 80 and M is 4,000, so we have 50 level-1 signatures and the final
digest is comprised of these signatures.

We choose the degree of the tree to be the same as the maximum number of
Signatures in a digest object to simplify the data structure and the
partial rollback process when two nodes detect inconsistent state.  Note
that all RSVP session insertions and deletions are done in the hash
table; the purpose of the digest computation tree is simply to compress
the session signatures to fit them into one packet.

It is important that neighboring RSVP nodes use the same hash table size
M and digest size N (the maximum number of signatures in a single mes-
sage); inconsistency in either of these two values will lead to the
failure of the compression scheme. Therefore, two neighboring nodes need
to choose the M and N to be used before starting exchanging digest mes-
sages.  Ideally, these two constants should be selected based on the
link MTU and the expected number of active sessions.  This specification
assumes that neighbors have reached consensus on the value of N and M
for simplicity.




draft-wang-rsvp-state-compression-03.txt                        [Page 8]





INTERNET-DRAFT                                               March 2000


A digest needs to be recomputed when the following events occur:
   o a new session is added;
   o an existing session is changed, i.e. a state is added to a session,
     or a state is modified or deleted from a session;
   o an existing session is deleted.
The cost of digest computation is summarized in section 7.

An RSVP node needs to compute two digests for each neighbor, i.e. OutDi-
gest and InDigest. OutDigest is computed on the state for which this
node sends refresh messages to that neighbor and it will be included in
the Digest messages sent to the neighbor. InDigest is computed on the
state that is refreshed by messages from the neighbor. It will match the
digest value received from the neighbor if the local node is synchro-
nized with the neighbor.


3.3.  An Example of Digest Computation

In this section, we use a simple example to illustrate how we compute
and update a digest tree.


           T_20               T_21
          /    \             /    \           digest tree (N = 2)
       T_10    T_11       T_12     T_13
      /   \    /   \      /   \    /   \
   +----+----+----+----+----+----+----+----+
   |T_00 T_01 T_02 T_03 T_04 T_05 T_06 T_07|  hash table (8 slots)
   +----+----+----+----+----+----+----+----+
    SS10 SS19 SS12 SS13 SS14 SS25 SS16 SS17   sessions hashed to each slot
    SS18      SS22

   Figure 3. A Digest Tree on top of a Hash Table


Suppose a node currently shares 10 sessions with a neighbor and it uses
a small hash table with 8 slots.  For simplicity, the sessions'
addresses are of the format 1.2.3.j (j = 10, 12, 13, 14, 16, 17, 18, 19,
22, 25) and we call the signature of session j SSj.  The node first maps
the sessions into the hash table (Figure 3 shows the result of the map-
ping).  It then computes a signature for each slot over the session sig-
natures contained in that slot.  For example, T_00 is obtained by com-
pressing the concatenation of SS10 and SS18.  Now we have signatures
T_00 to T_07, but our goal is to reduce these eight signatures to two
signatures assuming N is 2.  Therefore, we separate them into groups of
2 and compute a signature for each group.  This procedure is repeated
twice to produce the final digest (T_20 and T_21).




draft-wang-rsvp-state-compression-03.txt                        [Page 9]





INTERNET-DRAFT                                               March 2000


Now we look at the procedure to update this structure when this node
receives a path message with a destination address of 1.2.3.15 which
creates a new path state.  In addition to forward this path message to
its neighbor, the node calculates the signature of this session (SS15)
and inserts the new session into the hash table.  Let's assume this ses-
sion is mapped to slot 5.  Since the new session's address is smaller
than that of session 25, it's inserted before session 25.  Next, the
node recalculates T_05, T_12 and T_21, i.e. all the signatures on the
path from the leaf signature (slot 5's signature) to the corresponding
root.

To delete a session, the node first needs to locate the session's slot
and remove it from the slot.  The procedure to update the digest tree is
the same as that of adding a session.


3.4.  Neighbor Data Structure

To avoid having to traverse all the RSVP sessions to compute the digest
for a specific neighbor, a router maintains a Neighbor data structure
containing pointers to all the sessions shared with the neighbor.  In
addition to the data structures used in digest computation, the Neighbor
structure holds all the information needed to send and receive digests
to/from a neighboring RSVP node, such as address information of the
neighbor and timers for digest exchanges.  An example of the per neigh-
bor structure is shown in Figure 4. Below is a non-exhaustive list of
the fields in the Neighbor structure.

IP Address

   Holds the IP address of the interface this neighbor uses to communi-
   cate with the local node.

OutSession

   This field points to the hash table holding the sessions to be
   included in the computation of the digest that the local node is
   sending to this neighbor. The sessions included are those that have
   PSBs whose next downstream hop is the neighbor in question and RSBs
   that have the neighbor as the immediate upstream hop.

OutDigest

   A set of pointers that point to the top level signatures of the Out-
   Digest tree for the neighbor.

RefreshOutTimer




draft-wang-rsvp-state-compression-03.txt                       [Page 10]





INTERNET-DRAFT                                               March 2000


   Contains the next time a digest should be sent to this neighbor.





     Neighbor Struct
     +----------------- +
     | IP Address       |
     +----------------- +
     | OutSession       |--->Hash table for outgoing sessions
     +----------------- +
     | OutDigest        |
     +----------------- +
     | RefreshOutTimer  |
     +----------------- +
     | IDLastSent       |
     +------------------+
     | OutDigestTimeout |
     +------------------+
     | InSession        |--->Hash table for incoming sessions
     +------------------+
     | InDigest         |
     +------------------+
     | CleanupInTimer   |
     +------------------+
     |   ...            |
     +------------------+

          Figure 4.  Neighbor Structure



IDLastSent

   Contains the Message_ID of the latest digest sent to this neighbor.

OutDigestTimeout

   Contains the timeout period in milliseconds for the latest digest
   sent to this neighbor. If an acknowledgment is not received by the
   end of this period, the digest should be retransmitted.

InSession

   This field points to the hash table holding the sessions that are
   refreshed by digests received from this neighbor.  Specifically, the
   table includes sessions whose PSBs have this neighbor as previous hop



draft-wang-rsvp-state-compression-03.txt                       [Page 11]





INTERNET-DRAFT                                               March 2000


   and sessions whose RSBs have this neighbor as next hop.

InDigest

   A set of pointers that point to the top level signatures of the InDi-
   gest tree for the neighbor.

CleanupInTimer

   If no digest is received from this neighbor by the time contained in
   the CleanupInTimer field, all associated state should be cleaned up.



3.4.1.  SessDigest Structure

The SessDigest structure contains information about a session that has a
specific neighbor as a Next Hop. The SessDigest structure has the fol-
lowing fields:

Session*

   A pointer to the RSVP session represented by this structure.

Next SessDigest

   When multiple sessions map to the same hash table bin, this field is
   used to point to the next object in the linked list of SessDigest
   objects. Otherwise this field SHOULD be NULL.

P/R

   This field indicates whether the PSB or the RSB of a Session will be
   used in the signature operation.

Signature

   Contains the computed Signature for this session.













draft-wang-rsvp-state-compression-03.txt                       [Page 12]





INTERNET-DRAFT                                               March 2000


Session Struct
             ^
             |
             |
     +-----------------+
     | Session*        |
     +-----------------+
     | Next SessDigest |---->..
     +-----------------+
     | P/R             |
     +-----------------+
     | Signature       |
     +-----------------+

          Figure 5. The SessDigest structure



3.5.  Neighbor Discovery

To use the state compression scheme, a router needs to discover all the
neighbors that are capable of exchanging digests (state compression
capable).  We add a D bit to the common RSVP header for this purpose
(see below).  If a node receives a raw RSVP message with the D bit
turned on from a neighbor, that neighbor is assumed to be state compres-
sion capable.

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  Vers | Flags |   Msg Type    |         RSVP Checksum         |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |   Send_TTL    |  (Reserved)   |         RSVP Length           |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 Flags: 4 bits

   bb1b: state compression capable (D-bit set to 1)
   bb0b: state compression incapable (D-bit set to 0)

When a non-RSVP cloud exists between two RSVP neighboring nodes, the
upstream node may not be able to detect a change of the downstream
neighbor caused by a unicast route change.  As shown in Figure 6, node A
originally has a downstream neighbor B for a particular unicast RSVP
session.  After a route change, node C becomes A's downstream neighbor.
However, since A's outgoing interface is unchanged, A will not notice
the route change, hence it will continue to include the path state of
sender S when calculating the digest to B. Node B will not be aware of



draft-wang-rsvp-state-compression-03.txt                       [Page 13]





INTERNET-DRAFT                                               March 2000


the change either as long as A sends B the same digest.  C may never get
a path message from A.  Therefore, the digest scheme cannot be used
crossing non-RSVP clouds until an effective way of detecting route
changes is found.  RSVP routers can detect the existence of non-RSVP
clouds between neighbors by comparing the TTL value in the IP header
with the Send_TTL in the RSVP common header, as described in [6].


                  original route
                   ___________   --->
           ->   ->|           |-- B ----| ->
    S - - --- A --| non-RSVP  |         |------- D
                  |  cloud    |         |
                  |___________|-- C ----|
                       new route --->

    Figure 6. An RSVP Session with a non-RSVP cloud



3.6.  Digest Refreshes

RSVP state is refreshed by regular RSVP messages as well as by Digest
messages.  A digest message usually carries multiple signatures, each
representing the compressed state of multiple sessions.  If any signa-
ture in the received digest matches the corresponding local value, all
the RSVP state covered by that signature should be refreshed.  That is,
the local node should carry out the same operation as if one refresh
message for each of the covered sessions has been received.

The refresh rate of digest messages can be different from the refresh
rate of the individual sessions covered by that digest. For digest mes-
sages, the default refresh rate is 30 seconds as suggested by RFC 2205.
If a different refresh rate is used, it SHOULD be carried in the
TIME_VALUES object of the digest message.


4.  New RSVP Message Types and Objects

In this section, we define Digest object, Digest message and DigestErr
message. Since the digest mechanism uses the MessageID objects and ACK
message defined in the MessageID extension [4], we first review the for-
mat and usage of these objects and message to help readers better under-
stand our scheme.







draft-wang-rsvp-state-compression-03.txt                       [Page 14]





INTERNET-DRAFT                                               March 2000


4.1.  MessageID Extension

There are three types of MessageID objects, i.e. MESSAGE_ID, MES-
SAGE_ID_ACK and MESSAGE_ID_NACK. MESSAGE_ID object is used to identify
messages that carry state-changing information. MESSAGE_ID_ACK and MES-
SAGE_ID_NACK are included in ACK messages to acknowledge the correspond-
ing messages. Their definitions are taken from [4].


4.1.1.  MESSAGE_ID Objects

MESSAGE_ID Class = Value to be assigned by IANA of form 0bbbbbbb

MESSAGE_ID object

Class = MESSAGE_ID Class, C_Type = 1


0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Flags     |                      Epoch                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Message_Identifier                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


Flags: 8 bits

   0x01 = ACK_Desired flag

   Indicates that the sender requests the receiver to send an acknowl-
   edgment for the message.


Epoch: 24 bits

   A value that indicates when the Message_Identifier sequence has been
   reset.  SHOULD be randomly generated each time a node reboots.  This
   value MUST NOT be changed during normal operation.


Message_Identifier: 32 bits

   When combined with the message generator's IP address, the Mes-
   sage_Identifier field uniquely identifies a message.  The value
   placed in this field changes incrementally and only decreases when
   the Epoch changes or when the value wraps.



draft-wang-rsvp-state-compression-03.txt                       [Page 15]





INTERNET-DRAFT                                               March 2000


4.1.2.  MESSAGE_ID_ACK and MESSAGE_ID_NACK Objects

MESSAGE_ID_ACK Class = Value to be assigned by IANA of form 0bbbbbbb

MESSAGE_ID_ACK object

Class = MESSAGE_ID_ACK Class, C_Type = 1


0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Flags     |                      Epoch                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Message_Identifier                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


Flags: 8 bits

   No flags are currently defined.  This field MUST be zero on transmis-
   sion and ignored on receipt.


Epoch: 24 bits

   The Epoch field copied from the message being acknowledged.


Message_Identifier: 32 bits

   The Message_Identifier field copied from the message being acknowl-
   edged.


MESSAGE_ID_NACK object

Class = MESSAGE_ID_ACK Class, C_Type = 2

Definition is the same as the MESSAGE_ID_ACK object.


4.1.3.  Ack Message Format

The Ack message format is as follows:






draft-wang-rsvp-state-compression-03.txt                       [Page 16]





INTERNET-DRAFT                                               March 2000


<ACK Message> ::= <Common Header> [ <INTEGRITY> ]
                  <MESSAGE_ID_ACK> | <MESSAGE_ID_NACK>
                  [ [<MESSAGE_ID_ACK> | <MESSAGE_ID_NACK>] ... ]


For Ack messages, the Msg Type field of the Common Header MUST be set to
13 (This is a suggested value, the permanent value is to be assigned by
IANA).


4.2.  DIGEST Object

DIGEST Class =  Value to be assigned by IANA of form 10bbbbbb

DIGEST object

Class = DIGEST Class, C_Type = 1


0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S|           Level/Slot        |            Index              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Reserved            |     Number of Signatures      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
//                    signature list                           //
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


S: 1 bit

   When the S-bit is set to 1, this digest message carries the signa-
   tures of the individual sessions inside a hash slot.


Level/Slot: 15 bits

   When S is set to 1, this is the index of the hash slot that this set
   of signatures belong to. When S is set to 0, this is the level of
   this set of signatures in the corresponding digest tree.  The level
   of a signature is defined as follows:
     1. Leaf signatures are assigned level 0;
     2. The level of a non-leaf signature is (1 + the level of its imme-
        diate children).




draft-wang-rsvp-state-compression-03.txt                       [Page 17]





INTERNET-DRAFT                                               March 2000


Index: 16 bits

   This is the index of the first signature in this digest message.  The
   index of a signature is defined as follows:
     1. For the signatures of individual sessions, the signature of the
        first session in a linked list of sessions contained in the same
        hash slot has index 0 and the following sessions are numbered
        increasingly.
     2. The signatures at the same level of a digest tree are numbered
        increasingly from left to right and the left-most signature is
        assigned index 0.  Level and index uniquely identify the loca-
        tion of a signature in a digest tree.


Reserved: 16 bits

   This field is reserved.  It MUST be set to zero on transmission and
   MUST be ignored on receipt.


Number of Signatures: 16 bits

   Number of Signatures contained in this object.


Signature List:

   A set of signatures that is the compressed state of a group of ses-
   sions. It may be the final digest of a node or some intermediate
   level digest in the digest tree or a set of session signatures.



4.3.  Digest Message

The format of a Digest message is as follows:


<Digest Message> ::= <Common Header> [ <INTEGRITY> ]
                     <MESSAGE_ID> <DIGEST> [ <TIME_VALUES> ]


A Digest message has a common RSVP header with the message type set to
16 (This is a suggested value, the permanent value is to be assigned by
IANA).  It MUST contain a MESSAGE_ID object with the ACK_Requested flag
set and a digest object.  The optional TIME_VALUES object contains the
refresh timer value of the Digest message.




draft-wang-rsvp-state-compression-03.txt                       [Page 18]





INTERNET-DRAFT                                               March 2000


4.4.  DigestErr Message

A DigestErr message has the following format:


<DigestErr Message> ::= <Common Header> [ <INTEGRITY> ]
                        <MESSAGE_ID_NACK> <DIGEST>


The message type of DigestErr message is 17 (This is a suggested value,
the permanent value is to be assigned by IANA).  It MUST contain a MES-
SAGE_ID_NACK object and a DIGEST object.  A DigestErr message acts as a
negative acknowledgment to a Digest message.  The MESSAGE_ID_NACK object
identifies the Digest message being acknowledged.  The DIGEST object
carries the local digest value of the sender of the DigestErr message.


4.5.  Backward Compatibility

Regarding Digest messages, a node should start sending Digest messages
only when it discovers that its particular peer is compression-capable
using the procedure outlined in Section 3.5.


5.  RSVP Message Operations


5.1.  Digest Message

Once a compression-capable node discovers a compression-capable neigh-
bor, instead of sending one refresh message for each PATH/RESV state, it
sends a digest, in one RSVP message per refresh period to each of the
neighbors.  Special treatments for RSVP state containing ADSPEC and POL-
ICY_DATA objects are discussed in section 5.3.  Each digest MUST carry a
MESSAGE_ID object with the ACK_Requested flag set.

When the neighbor receives the Digest message, it first locates the cor-
responding set of signatures in the local digest tree and then performs
a binary comparison between each signature in the message and that in
the local digest tree.

   o If there is a mismatch between any of the signatures received and
     the corresponding local signatures, the node should respond with a
     DigestErr message containing the local digest value. Note that if
     there is any matching signature, RSVP state represented by that
     signature should be refreshed.

   o Otherwise, the receiver is in a consistent state with the sender,



draft-wang-rsvp-state-compression-03.txt                       [Page 19]





INTERNET-DRAFT                                               March 2000


     so it sends an ACK message and refreshes all the state shared with
     the sender.

Digest messages are also retransmitted for a maximum number of times in
the absence of ACK or DigestErr messages. However, following the origi-
nal RSVP design where an RSVP node never stops sending refresh messages
for each active session, a node should not stop sending digest refreshes
even if it fails to receive an acknowledgment in the previous refresh
interval. If the peer node crashed and becomes alive again, it will find
the digest value different from its own and the two routers will start
the re-synchronization process. When the digest value is changed, the
node needs to cancel any pending retransmission of the obsolete Digest
message and promptly send a Digest message with the new digest value.


5.2.  Refresh of ADSPEC and POLICY_DATA

An RSVP message may carry an ADSPEC object that contains resource infor-
mation used by receivers to predict the end-to-end service.  This object
is updated hop-by-hop to gather information along a route.  Since it is
opaque to RSVP, RSVP simply records the received ADSPEC in the corre-
sponding state and, whenever it needs to send a refresh message for a
state, it calls the traffic control module with the recorded ADSPEC to
get an updated ADSPEC object. Therefore, the ADSPEC object sent by a
node may be different from that received by the node.  POLICY_DATA
object is handled in the same way except that it is submitted to the
policy control module.

To handle ADSPEC or POLICY_DATA objects correctly with the digest
scheme, we need to solve the following two problems. First, for each
session with ADSPEC or POLICY_DATA object, we must trigger periodic
updates of ADSPEC or POLICY_DATA objects.  Second, since ADSPEC and POL-
ICY_DATA are opaque to RSVP, RSVP cannot tell whether the objects
returned by the traffic control (or policy control) module changes after
each call.

For the first problem, we treat digest refreshes the same way as regular
RSVP refresh messages. In other words, before a node sends a digest, the
node calls corresponding modules to update these objects. If the traffic
and/or policy control modules provide trigger mechanisms that asyn-
chronously notify the RSVP module when changes occur then this step will
be avoided.

There are three possible solutions to the second problem.  First, a node
can simply assume that the object returned by the traffic control (or
policy control) module changes after each call at refresh period, which
represents a change in RSVP state and hence a regular RSVP PATH state
should be sent to carry the new object value to the next hop.  This



draft-wang-rsvp-state-compression-03.txt                       [Page 20]





INTERNET-DRAFT                                               March 2000


approach requires no changes to RSVP specification but it essentially
falls back to the original RSVP refresh mechanism whenever ADSPEC or
POLICY_DATA object exists in an RSVP session.

The second option is to modify the RSVP/traffic control and RSVP/policy
control interfaces so that they return a change bit that indicates
whether the updated object is different from the previously returned
one.  The previously returned object can be an argument passed to the
interface.  For example, the original RSVP/traffic control interface for
ADSPEC is


Call: TC_Advertise( Interface, Adspec,
Non_RSVP_Hop_flag ) -> New_Adspec,


where Adspec is the ADSPEC from received RSVP messages.  We can change
it to


Call: TC_Advertise( Interface, Adspec, Old_Adspec
Non_RSVP_Hop_flag ) -> New_Adspec, change bit,


and Old_Adspec is the ADSPEC object returned by the previous call.  When
the changed bit is set, a regular RSVP refresh must be sent.

The third option is to let RSVP perform a binary comparison between the
previously and newly returned objects after each call to traffic control
(policy control) module to find out whether the object has changed. This
third option can be a short-term fix while we work on the proposed
changes in RSVP interfaces to other modules. If POLICY_DATA objects are
encoded, for replay attack prevention reasons, in a way that makes iden-
tical POLICY_DATA objects look as different binary objects this third
option will deteriorate to the first option, i.e. a refresh message will
be sent, for every POLICY_DATA object returned by the policy control
module.


6.  RSVP State Re-synchronization

Two RSVP neighbors may become out-of-sync due to a number of reasons.

   - A state-changing RSVP message is lost, and the sender did not ask
     for ACK.
   - A neighbor crashed and lost part or all of its state.
   - Other unknown reasons.




draft-wang-rsvp-state-compression-03.txt                       [Page 21]





INTERNET-DRAFT                                               March 2000


Receipt of a DigestErr message indicates inconsistency between two
nodes.  The MESSAGE_ID and digest value in the DigestErr message help
the two neighbors to localize the problem.

   o If the Message_ID acknowledged is smaller than the Message_ID of
     the last Digest message sent (IDLastSent), this error message is
     for an obsolete message.  This message should be ignored since it
     may not represent the current state of the neighbor.

   o If the Message_IDs are equal, the node should start the recovery
     process we describe next.

The node then compares the signatures contained in the DigestErr mes-
sage.  When it finds the first mismatching signature (call it S_1), it
sends a Digest message containing the lower level signatures in the
digest tree used to compute S_1.  A DigestErr is expected for this
Digest message since at least one of the children signatures does not
match.  The node again looks for the first mismatching signature (S_2)
in the second DigestErr message and sends the children of S_2 in a
Digest message.  This procedure is repeated until the leaf signature
(S_h) (h=log_N(M), see Figure 2) causing the problem is found.  Now, the
node knows that one or more of the sessions in that hash table slot
(represented by S_h) must be inconsistent with those in the neighbor. It
can then either locate these sessions by exchanging the session signa-
tures with the receiver or it can simply send raw refreshes for all the
sessions in that particular bin.  After refreshing these sessions, the
node re-examines S_(h-1) (the parent of S_h) for other inconsistencies
and continues to traverse the tree until all the mismatching sessions
are located and refreshed.

Notice there is a tradeoff between the latency of the recovery procedure
and the transmission efficiency. For example, if the tree has many lev-
els, many RTTs are needed to exchange the digests at all the tree levels
in order to find the leaf-level sessions that contribute to the incon-
sistency. However, if speed of convergence is more important than effi-
ciency, one can stop at an intermediate tree level and refresh all the
state represented by the mismatching signature at that level.


7.  Computation Costs

We have presented in Section 3.2 the structure used to support efficient
and incremental computation of digests. This structure consists of two
parts: a hash table that stores the Signatures of sessions shared with a
particular neighbor and an N-ary tree used to reduce the number of sig-
natures to a number that can be sent inside a single packet. In this
section we focus on the operations applied to these data structures and
analyze their requirements in terms of processing.



draft-wang-rsvp-state-compression-03.txt                       [Page 22]





INTERNET-DRAFT                                               March 2000


We begin with some definitions. Let the number of sessions be T, the
size of the hash table be M and the number of Signatures inside the
digest message be N. Let's further define the cost of computing the com-
pression function on a message of size x to be f(x).  To determine the
behavior of f(x), we have to study each algorithm's behavior. Summariz-
ing the description in RFC 1321 [3], the MD5 algorithm divides the input
message to 64-byte blocks and applies a four-step process to each one of
these 64-byte blocks. In each of the four steps, sixty-four bit-wise
logical operations are applied to that 64-byte block. The results of the
computation on the n-th block are used as input for the computation of
the (n+1)-th block. After all the blocks have been processed, the mes-
sage's digest is produced. From this description, one can see that f(x)
is a linear function of x, the size of the input message measured in
bytes. A similar analysis applies for CRC-32, so the cost of computing
the CRC-32 checksum of a message with size x is also linear on the size
of x but with smaller constants.

When a session is modified, a new signature for that session as well as
a new digest has to be computed. To illustrate this procedure, imagine
that we want to update session D_1 inside the hash table of Figure 1.
First, we look up the session inside the hash table (a possible opti-
mization here is to store the hash table index where the session is
stored rather than re-computing the hash function every time the session
is modified). In our example, we would come up with the index i. If mul-
tiple sessions map to the same hash table slot, we traverse the linked
list of sessions until we find the session in question. Once the session
is found and its new Signature is computed, we have to compute the new
Signature stored at the base of the linked list which represents all the
sessions mapped to that hash table slot. On the average Ceiling[T/M]
sessions will occupy the same slot. The total time needed for this oper-
ation is therefore f(D* Ceiling[T/M]), where D is the size of the signa-
ture (16 bytes for MD5 and 4 bytes for CRC-32). We assume that the
linked-list lookup time which is O(Ceiling[T/M]) is small compared to
the time needed to update the Signatures and therefore we don't include
it in our calculations. The next step is to update the values on the
digest tree. We begin by computing the Signature of the contents of slot
i concatenated with its N-1 siblings which will be stored in their par-
ent node on the digest tree. We continue this procedure until we reach
the top of the tree. Since there are log_N(M) levels on the tree and at
each level we apply the compression algorithm on a message of size D*N
(the combined size of N Signatures), the time spent during this step is
N*(log_N(M)-1)*f(D). Notice that the term is log_N(M)-1 since we do not
calculate a Signature out of the N topmost Signatures.  Also f(D*N) =
N*f(D) since f(x) is linear on x.

From the discussion above, we can conclude that the total time needed to
calculate the new digest after a session is modified is given by the
following formula, where S is the size of a session in bytes:



draft-wang-rsvp-state-compression-03.txt                       [Page 23]





INTERNET-DRAFT                                               March 2000


f(S) + f(D * Ceiling[T/M]) + N*(log_N(M)-1) * f(D) (1)


When a new session has to be inserted in the hash table, we locate the
slot this session hashes to and insert the session to that slot's linked
list, if one exists. Given that the list is ordered, the new session has
to be inserted in order inside the list, which means traversing the list
until we find a session whose destination address is larger than the
destination address of the session we want to add and inserting the new
session before that session. Deleting a session, involves finding the
slot it hashes to, searching for it inside the linked list, and ``splic-
ing'' its predecessor to its successor on the list.

The computation cost for the creation of the new digest after an inser-
tion or deletion operation, is almost identical to the update cost. The
only difference is that in the case of deletion we don't calculate the
Signature of the session (since we are deleting it) but only calculate
the combined Signature of the rest of the sessions on the list. Equa-
tions 2 and 3  respectively, show the insertion and deletion costs.


f(S) + f(D * Ceiling[T/M]) + N*(log_N(M)-1) * f(D) (2)

f(D * Ceiling[T/M]) + N*(log_N(M)-1) * f(D) (3)


We can see from Equations 1, 2 and 3 that when the size M of the hash
table is small compared to the number of sessions T, the cost of updat-
ing the linked list of sessions will be linear to T. In this case,
updating the linked list becomes the most expensive operation, forcing
the total cost to also be linear to T. The size M of the hash table
should therefore be comparable to the expected T to avoid increased
update times.


8.  Compression Mechanism Comparison

We mentioned earlier that two possible candidates for the compression
function are the CRC-32 checksum algorithm and the MD-5 digest algorithm
[3]. The main task of the compression function is to produce a one-to-
one mapping between sets of sessions and digests.  It is therefore
important that the probability of two different configurations producing
the same digest (collision probability) is negligible. The compression
scheme fails when the digest trees created at the two neighbors contain
different sessions but produce the same digest, i.e the same set of N
top level Signatures.

In this section we compute the probability of failure as a function of



draft-wang-rsvp-state-compression-03.txt                       [Page 24]





INTERNET-DRAFT                                               March 2000


the size of signatures produced by the compression algorithm. Specifi-
cally we compute the probability a single bit difference in the RSVP
state shared between the two neighbors is not detected.

Let's assume that the maximum packet size that can be sent un-fragmented
is L bits. Then if the size of the signature is x bits, N, the number of
Signatures in a digest, is L/x. As before, let's further assume that the
size of the hash table is M.  To make the calculations easier we actu-
ally compute the probability that the error will be detected, P(suc-
cess). The failure probability is of course 1-P(success).

The probability that the error will be detected is the probability that
the modified session will produce a different Signature AND the list of
other sessions contained in the same slot with the modified session will
create a different Signature AND all the nodes on the path of the digest
tree from that hash table bin to one of the roots of the digest tree
will create different Signatures. Let's calculate each of these proba-
bilities first.

Assuming that the compression function is perfect, i.e. it distributes
messages evenly over the set of Signatures, the probability that the
changed session will create a different Signature is (1-1/2^x). In a
similar fashion the probability that at each level of the digest tree a
different Signature will be produced is again (1-1/2^x). (Note: one
might observe that CRC-32 detects all single bit differences and there-
fore in this case the probability of detection in the first step is one.
While this is correct, the goal of our simplified analysis is to compare
the relative performance of the two schemes without focusing on the
specifics of each compression algorithm.)

Since the compression function produces a (pseudo-)random set of bits,
the events described above are independent and therefore to compute the
desired probability we take their product. Hence the probability that
the error will be detected is:


P(success) = (1-1/2^x)^(h+1)  (4)

and

P(failure) = 1 -P(success) = 1 - (1-1/2^x)^(h+1) (5)


Where h is the height of the digest tree, h = log_N(M).

To get a better understanding, we substitute the parameters in Equation
5 with some values. For M=10000, L=10000 bits and x=32 (CRC-32), P(fail-
ure) = 3.74*10^-9 while for x=128 (MD5) P(failure)=0.



draft-wang-rsvp-state-compression-03.txt                       [Page 25]





INTERNET-DRAFT                                               March 2000


From this numerical example we can see that for single-bit differences,
MD5 has infinitesimal probability of failure while at the same time
CRC-32 provides high assurance that errors will be detected.  Multiple
bit changes present a slightly different picture. While the detection
probability in the case of MD5 is independent of the number of changed
bits, for CRC-32 the probability of detection decreases when the number
of changed bits increases. The conjecture is therefore that CRC-32 will
perform worse than MD5 for multiple bit differences.

On the other hand the smaller size of the CRC-32 Signature means that N
will be larger and the height h of the digest tree will be significantly
shorter. Finally, computing the CRC-32 checksum is a lot faster than the
MD5 computation.


9.  Comparison with Other Schemes

This section compares our Digest mechanism with the Summary Refresh and
Periodic Checksum mechanisms proposed by Lou Berger et. al. [4]. Both
proposals aim to reduce RSVP's refresh message overhead, but our scheme
can achieve a higher degree of state consistency and higher message
overhead reduction.

The Summary Refresh mechanism uses the message identifier contained in
the MESSAGE_ID object of a trigger message to represent the state infor-
mation that the message carries. To refresh a state, a node can send the
latest message identifier of the particular state in a Summary Refresh
(SRefresh) message. Upon receiving the message, the receiver will look
up the state that corresponds to the identifier and refresh the state as
if a normal refresh message was received. Multiple message identifiers
can be sent in one SRefresh message to refresh multiple state.

The Summary Refresh mechanism looks similar to our Digest mechanism in
that both use some "number" to represent a state, i.e. message identi-
fier and signature respectively. However, they differ in the following
fundamental ways:

- State Consistency

Since message identifier is just a random number associated with a mes-
sage, it is not a true representation of a state. Suppose the message
contains undetected bit errors or either end has made a local state
change after processing the message, then the resulting inconsistency
cannot be detected via the message identifier. On the other hand, since
the Digest mechanism computes the signature from the RSVP state, any
state inconsistency between two nodes can be easily discovered.

- Message Overhead Reduction



draft-wang-rsvp-state-compression-03.txt                       [Page 26]





INTERNET-DRAFT                                               March 2000


Recall that we use a hash table and a digest tree to reduce the size of
a digest so that it can be contained in a single message. Therefore, our
mechanism can reduce the number of refresh messages to one per refresh
period. The Summary Refresh mechanism, on the other hand, does not fur-
ther compress the message identifiers, which means the refresh message
overhead still goes up linearly as the number of state increases.

- Message Processing Overhead

The Summary Refresh mechanism requires checking the message identifiers
of all the state in each refresh period regardless of whether two nodes
are consistent, while the Digest mechanism only needs one comparison of
the Digest value when two nodes have consistent state. When a state
needs to be repaired, the Digest mechanism requires more Digest message
exchanges to identify the state with error. Nonetheless, as you can see,
the digest tree is usually quite shallow, so the number of messages
exchanged during state recovery should be very small.

Given that Summary Refresh mechanism only provides a limited degree of
state consistency, the Periodic Checksum mechanism was introduced to
solve the problem. A checksum is computed for every state and when a
state is changed, its checksum is recomputed. A node also periodically
checks if there are unnoticed changes by computing a new checksum over
the state and comparing it with the existing one. If the two checksums
are different, this node will refresh the corresponding state. In this
way, the node can ensure that valid changes to the state can propagate
to other nodes even though they were previously undetected by RSVP at
the time of change.

However, like the Summary Refresh mechanism, the Periodic Checksum can-
not protect against all forms of errors. If a state is corrupted inter-
nally, which cannot be distinguished from unnoticed valid changes, the
incorrect state information will also be propagated by this mechanism.
What's more, this inconsistency will persist until the node that has the
corrupted state gets refresh messages from its upstream node. Our digest
mechanism does not have this problem, however, since every node is
"forced" to synchronize with its upstream node periodically and all the
nodes will eventually be consistent with the sources of state informa-
tion, e.g. senders for path state and receivers for reservation state.

One may argue that the Digest mechanism has higher storage and computa-
tion overhead because of the need to compute and store digest trees and
hash tables.  However, we would like to point out that the Digest mecha-
nism can operate without using the digest tree and hash table. Nodes can
exchange the plain session signatures without reducing them to a single
digest. Although this will result in a message overhead comparable to
that of the Summary Refresh mechanism, the digest mechanism can still
achieve a higher state consistency than the combination of the Summary



draft-wang-rsvp-state-compression-03.txt                       [Page 27]





INTERNET-DRAFT                                               March 2000


Refresh and Periodic Checksum mechanism.

Another concern about the Digest mechanism is that it may require stan-
dard internal representation of data.  A simple solution for this prob-
lem is to store the data in a machine dependent format and then convert
the data to a standard format before computing its signature.


10.  Summary

Due to the scaling requirements from setting up large numbers of RSVP
reservations, such as the case in MPLS traffic engineering, a few recent
proposals suggested to slow down or eliminate periodic state refreshes
in RSVP.  We believe that elimination of refreshes represents a funda-
mental departure from the soft-state approach adopted in RSVP design.
As we have argued in this draft, reliable RSVP message delivery alone is
insufficient for assuring consistent reservation state inside the net-
work; it merely brings the benefit of quick synchronization whenever
RSVP state changes.

The state compression mechanism described in this draft makes the over-
head of the RSVP soft-state refreshes independent from the number of
RSVP sessions by paying the cost of computing and maintaining a tree of
session signatures.  When sessions or session parameters change, updat-
ing the tree of session signatures has a computational overhead propor-
tional to T/M (T is the total number of sessions and M is a constant
comparable to T).

When large numbers of RSVP sessions exist in a network, we believe that
the proposed scheme brings substantial performance gain from reduced
refresh overhead, especially when most of the sessions are relatively
stable. The feasibility of using this approach in a highly dynamic envi-
ronment where many sessions are constantly added and deleted remains an
open issue.


11.  IANA Considerations

IANA must provide two new Msg Types for the two new messages defined in
this draft: Digest and DigestErr. Also IANA must provide Class Numbers
for the Digest object defined here.  The DIGEST object requires a a
Class-Num value of the form 10bbbbbb.


12.  Security Considerations

No new security issues are raised in this document.  See [5] for a gen-
eral discussion on RSVP security issues.



draft-wang-rsvp-state-compression-03.txt                       [Page 28]





INTERNET-DRAFT                                               March 2000


13.  Acknowledgments

The phrase "overhead reduction by state compression" was suggested by
Van Jacobson of cisco.  Vern Paxson suggested the use of hashing to sim-
plify Digest computation.  Our design also benefited from discussion
with Steve McCanne of UC Berkeley regarding the roles of acknowledgment
in soft-state protocol design.  We thank Hilarie Orman for her insight-
ful comments on MD5 and CRC-32.  We also received valuable feedback from
Bob Braden and Steven Berson of ISI.


14.  References

[1] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
    "RTP: A Transport Protocol for Real-Time Applications", RFC 1889,
    January 1996.

[2] Handley, M., and V. Jacobson, "SDP: Session Description      Proto-
col", RFC 2327, April 1998.

[3] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, April
    1992.

[4] Berger, L., Gan, D., et al, "RSVP Refresh Overhead Reduction
    Extensions", draft-ietf-rsvp-refresh-reduct-03.txt, March 2000.

[5] Braden, R. Ed., Zhang, L., et al, "Resource ReserVation Protocol
    -- Version 1 Functional Specification", RFC 2205, September 1997.

[6] Braden, R. and Zhang, L., "Resource ReSerVation Protocol (RSVP)
    -- Version 1 Message Processing Rules", RFC 2209, September 1997.


15.  Authors' Addresses

   Lan Wang
   UCLA
   4805 Boelter Hall
   Los Angeles, CA 90095

   Phone:    310-267-2190
   Email:    lanw@cs.ucla.edu

   Andreas Terzis
   UCLA
   4677 Boelter Hall
   Los Angeles, CA 90095




draft-wang-rsvp-state-compression-03.txt                       [Page 29]





INTERNET-DRAFT                                               March 2000


   Phone:    310-267-2190
   Email:    terzis@cs.ucla.edu

   Lixia Zhang
   UCLA
   4531G Boelter Hall
   Los Angeles, CA  90095

   Phone:    310-825-2695
   Email:    lixia@cs.ucla.edu









































draft-wang-rsvp-state-compression-03.txt                       [Page 30]