RMT Working Group                               M. Luby/Digital Fountain
INTERNET DRAFT                                         L. Vicisano/Cisco
Expires January 13, 2000                      A. Haken/Digital Fountain
                                                              July  2000


              Reliable Multicast Transport Building Block:
                      Multirate Congestion Control
                    <draft-ietf-rmt-bb-mrcc-00.txt>


Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026. Internet-Drafts are working docu-
ments of the Internet Engineering Task Force (IETF), its areas, and its
working groups.  Note that other groups may also distribute working doc-
uments as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The Internet Society (2000).  All Rights Reserved.


Abstract

This document describes MRCC, a scalable multirate congestion control
building block for multicast.  MRCC is an approach that allows multiple
receivers to concurrently receive packets from a single sender at vary-
ing rates depending on individual bandwidth connections and network con-
ditions.  Two basic goals of the approach are to allow each receiver to
obtain the full benefit of the available bandwidth to the sender and to
be fair to other flows in the network.

Using MRCC, a sender sends data for a single object to multiple multi-
cast groups, potentially at different rates for each group, where an
object is any well-defined content or file.  The set of multicast groups


Luby, Vicisano, Haken                                           [Page 1]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


carrying data for an object out of a single sender is called a session.
MRCC allows receivers to join or leave an existing session in an asyn-
chronous manner independent of other receivers.  The number of groups
and which groups in the session each receiver joins is dictated by the
local bandwidth availability and network conditions experienced by the
receiver.   In particular, receivers reduce their reception rate as soon
as they feel congestion as evidenced by measured packet loss.

MRCC is receiver driven, i.e., signals are placed into packets by the
sender to indicate to receivers how to react to changing network condi-
tions, and receivers adjust their reception rate in response to these
signals and packet loss. Thus, each receiver experiences a reception
rate appropriate to that receiver independent of other receivers.

MRCC has the following properties:


o To each receiver, it appears as if there is a dedicated unicast ses-
  sion from the sender to the receiver, where the reception rate adjusts
  to congestion along the path from sender to receiver similar to TCP.

o To the sender, there is no difference in load or outgoing rate if one
  receiver is joined to the session or a million (or any number of)
  receivers are joined to the session, independent of when the receivers
  join and leave.

o For each link in the network, the packet traffic from the session and
  its reaction to competing traffic is the same whether there is one
  receiver or a million receivers beyond the link, and this reaction is
  similar to how TCP reacts.

  Thus, MRCC provides a massively scalable multirate congestion control
  approach that is network friendly.

1. Introduction

This document describes a scalable multirate congestion control (MRCC)
building block that can be used by applications built on top of IP mul-
ticast [DEE88].  MRCC can be used for example as the congestion control
scheme for ALC [ALC00]. Many congestion control schemes have been built
on top of multicast. However, scalability is not a design goal for many
of these congestion control schemes, in the sense that they require all
receivers to be receiving at the same rate, and in general this rate is
dictated by the available bandwidth along the most constrained path from
the sender to a receiver.  Thus, the reception rate of receivers is
restricted to that of the worst case receiver.  In contrast, MRCC is
designed so that each receiver can be receiving at the rate that is dic-
tated by the available bandwidth along the path from the sender to that


Luby, Vicisano, Haken                                           [Page 2]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


receiver.  Thus, the reception rate of receivers is independent of that
of other receivers.

One of the key difficulties in scaling sender-based multicast congestion
control schemes is dealing with the amount of data that flows from
receivers back to the sender to adjust the sender rate. MRCC avoids any
such feedback, and thus is massively scalable.

An attractive feature of a scalable congestion control scheme is the
ability for different receivers to join and leave the session asyn-
chronously without adversely affecting the reception experience of other
receivers and without affecting the scalability of the scheme.  This is
one of the features provided by MRCC.

To transmit data about a particular object using MRCC, a sender sends
the data concerning a single object to one or more multicast groups.
The data sent to different multicast groups may vary.  The set of groups
pertaining to a particular object emanating from a single server is
called a session. As described below in detail, receivers MUST join and
leave groups within a session to vary their reception rate in the face
of varying bandwidth capacity between the receivers and the sender.

The original ideas for MRCC are from [VIC98A], [VIC98B], [BYE00],
[HOR00] and [MCC96], for the use of multicast layers.  The sender places
congestion control information into the header of each packet sent to
the session.  The set of groups a receiver joins is determined by the
receiver based on signals placed into packets by the sender and by loss
measured along the path from the sender to that receiver.  Receivers
that can receive packets at a rate higher than their current rate are
allowed to periodically increase their reception rate, and receivers
that are receiving packets at a higher rate than they have the capacity
for (as evidenced by packet loss) MUST reduce their rate.

This document introduces two congestion control schemes, a static layer
scheme and a dynamic layer scheme.  The static layer scheme is applica-
ble to networks where joins to multicast groups and leaves from multi-
cast groups are processed quickly, e.g., on the same time scale as the
round trip time from the receiver to the sender.

The dynamic layer scheme is applicable to networks where joins to multi-
cast groups are process quickly but leaves from multicast groups may
take several seconds.  One of the current problems with many implementa-
tions of IP multicast routing protocols is that the IGMP protocol used
between receiver and access routers to join and leave groups has the
limitation that IGMP leave messages can take a significant amount of
time to take effect, leading to a large amount of data flowing to a
receiver on a given group long after the receiver has issued an IGMP
leave message for that group in reaction to packet loss.  This is a


Luby, Vicisano, Haken                                           [Page 3]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


problem because the network remains congested for the period of time it
takes to process the IGMP leave request, thus making it difficult to
rely on IGMP leave requests to provide a network friendly congestion
control scheme.  The dynamic layer scheme uses groups where the trans-
mission rates on the groups change over time in a way that makes it pos-
sible for receivers to quickly reduce their reception rate by taking no
action.  Instead, receivers that want to maintain their current recep-
tion rate must periodically issue IGMP join requests, and receivers that
want to increase their reception rate must issue an additional IGMP join
request.

One of the attractions of MRCC is that it is multicast routing indepen-
dent and that it does not require multicast reverse connectivity, i.e.
MRCC receivers do not send multicast traffic or any other traffic for
purposes of congestion control. In particular, MRCC works with the orig-
inal multicast model introduced in [DEE88], which we call Internet Stan-
dard Multicast (ISM) in this document, and with the Source Specific Mul-
ticast (SSM) model that is based on [HOL99].  The definition of a MRCC
group that is used throughout this document is slightly different with
ISM and with SSM.  When using ISM, packets of an MRCC group are sent to
a multicast group address G.  When using SSM, packets of an MRCC group
are sent to a channel address (S,G), where S is the IP address of the
sender and G is a multicast group address.

SSM is more attractive to MRCC than ISM for a few reasons.  First, a
session is made up of multiple MRCC groups, and MRCC can be used to
deliver a large number of objects over time that use different sets of
MRCC groups for the transmission.  With ISM, the multicast group address
G that corresponds to an MRCC group must be allocated so that it is
unique across the Internet.  With SSM, the multicast group address G can
be allocated locally by the sender with the only requirement that it is
unique to the sender, because it is the (S,G) channel that corresponds
to the MRCC group that a receiver joins.  Second, MRCC supports an
unlimited number of receivers that are dynamically joining and leaving
MRCC groups.  Changes in the multicast tree topology with SSM are light
weight operations (a new branch from the receiver towards S grows when a
receiver joins, and the branch is deleted when the receiver leaves), and
with ISM changes can be heavier weight (involving transitions from a
(*,G)-tree rooted at an RP to the tree rooted at S).  Third, MRCC is
scalable to an unlimited number of receivers that may span the global
Internet. Thus, the light weight mechanisms that SSM uses to cross ISP
boundaries (standard BGP+ routing tables) is a distinct advantage over
the heavier weight mechanisms used by ISM (the MSDP and BGMP protocols,
both of which are not needed by SSM).  Finally, a receiver joins an MRCC
group by joining a channel (S,G) with SSM, and thus the receiver will
only receive packets sent from the sender S. With ISM, the receiver
joins an MRCC group by joining a multicast group G, and all packets sent
to G, regardless of their origin sender, will be received by the


Luby, Vicisano, Haken                                           [Page 4]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


receiver.  Thus, if another sender application is inadverdently sending
packets to G at a high rate, these packets will be sent over portions of
the network and delivered to the receiver even though they were not
requested by the receiver, potentially resulting in a large amount of
unintended network congestion.  Thus, SSM has compelling advantages over
ISM for prevention of unintended and wasteful network congestion.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [R2119].

2. General Architecture

The Multirate Congestion Control (MRCC) schemes described in this docu-
ment describes a congestion control schemes that is to be applied to the
reception of a single object from a single sender.  To transmit data
about a particular object using MRCC, a sender sends the data concerning
a single object to one or more multicast groups.  The rate of transmis-
sion to different multicast groups may vary.  The data sent to a set of
groups pertaining to a particular object by a single server is called a
session.  Typically, the sender continues to send data to all groups in
a session until the transmission is complete.  The transmission may be
considered complete when some time has expired, a certain amount of data
has been sent, or some out of band signal (from a higher level protocol,
perhaps) has indicated completion by a sufficient number of receivers.
The sender(s) may then proceed to additional object transfers, or termi-
nate.

It is possible that a receiver may join multiple sessions to receive
data from multiple senders pertaining to the same object.  For example,
three different sessions pertaining to a particular object could be
transmitted from three different senders, each session consisting of
four groups to which data is being sent at different rates.  A receiver
may join and receive packets from all 12 groups concurrently until it
has enough packets in total to recover the object.  However, since the
senders may be located at different points in the network that experi-
ence varying network conditions, a receiver MUST perform congestion con-
trol independently on each session it is receiving.

All packets sent to a session MUST be the same size. The sender(s) may
determine the size arbitrarily, but must coordinate or use a default
size to ensure that all packets are of equal length. Larger packet sizes
are generally desirable so that the fraction of bandwidth lost to packet
headers is reduced.  On the other hand, if the packet size is larger
than the network's maximum transmission unit (MTU), packets would be
fragmented, and the loss of any fragment would require the entire packet
to be lost.  Therefore a packet size close to, but not exceeding, the
MTU is best, as this reduces packet header overhead and packet handling


Luby, Vicisano, Haken                                           [Page 5]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


overhead in routers.

A receiver must first obtain an object transmission session description
before starting to download an object.  This includes the information
about the groups associated with a session that is needed in order to
join those groups.  Once a receiver has obtained this information the
receiver may join one of the groups in the session to initiate the down-
load of an object. In order to be in compliance with MRCC, receivers
MUST join and leave groups within a session as described in detail in
this document in order to vary their reception rate in the face of vary-
ing bandwidth capacity between the receivers and the sender.

The object transmission session description is determined and agreed
upon by the senders and communicated to the receivers out of band, or,
in some cases, included or partially included in the header of each
packet. The session description could include the object name, the
object length, the packet format and length, and the multicast
address(es) of the groups in the session associated with the object.
The session description could be in a form such as SDP [HAN98]. We
assume that there exists an out of band mechanism for receivers to
obtain the session description. The session description might be carried
in a session announcement protocol such as SAP [HAN96], located on a Web
page with scheduling information, or conveyed via E-mail or other out of
band methods. Discussion of session description format, and distribution
of session descriptions is beyond the scope of this document.

All MRCC packets MUST must contain the MRCC congestion control informa-
tion in their header as described in this document.

MRCC packets are sent over the network encapsulated within UDP packets
sent with an IP multicast address.

3. Overview of congestion control schemes

MRCC performs congestion control by dedicating multiple MRCC groups to a
session.  Receivers joined to the session are subject to heterogeneous
reception rates, obtained by having the receiver selectively join a sub-
set of all the groups available from the sender.

The original ideas for both of the MRCC congestion control schemes are
from a combination of [VIC98A, VIC98B, BYE00, HOR00, MCC96].

When a sender is instructed to start a session for an object, the range
of possible reception rates is specified by rmin and rmax, where rmin is
the minimum available reception rate and rmax is the maximum available
reception rate.  It is recommended that rmin be small enough so that any
receiver can receive at rate rmin without incurring significant packet
loss.


Luby, Vicisano, Haken                                           [Page 6]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


If rmin = rmax is specified then there is only one reception rate speci-
fied for the session.  If rmax > rmin is specified, then multiple recep-
tion rates should be made available in the session. Let X = 1.3.  It is
recommended that the number of available reception rates be set to A+1,
where A is the largest value such that rmin * X^A <= rmax, and that for
all i=0,...,A the reception rate R(i) be set to rmin * X^i.

Let A+1 be the number of different reception rates available for a ses-
sion, and let R(0) < ... < R(A) be the set of available reception rates.
As an example, set R(0) = 24 Kbps, set X = 1.3, set A=30, and for all
i=1,...,30 set R(i) = R(0) * X^i.  Then, R(30) = 62.9 Mbps, and thus the
ratio of the maximum achievable rate to the minimum rate is a factor of
2,620 in this example. When a receiver is receiving packets at reception
rate R(i), we say the receiver is at layer i.  Let r(0) = R(0), and for
all i = 1,..., A, let r(i) = R(i) - R(i-1) be the incremental rate
needed to increase the rate from layer i to layer i+1.  Suppose the cur-
rent reception rate of a receiver is R(i), i.e., the receiver is at
layer i.  If the receiver is to increase its reception rate to layer
i+1, it does so increasing its reception rate by r(i+1) to R(i+1).  If
the receiver is to decrease its reception rate to layer i-1, it does so
decreasing its reception rate by r(i) to R(i-1).  If the receiver is to
maintain it reception rate at layer i, it does not change its reception
rate from R(i).

4. Static layer scheme

This section provides a detailed operational description of how to apply
the MRCC design principles to congestion control when IGMP leave latency
is short.  The next section described extends these principles to
describe a second way to achieve congestion control that overcomes long
IGMP leave latencies.

4.1. Sender operation

The sender uses A+1 groups numbered 0,..., A.  For all i = 0,...,A, the
ith group carries packets at rate r(i) at all times.  Group 0 is
referred to as the base group.

The sender partitions time into equal duration intervals called time
slots.  The time slot duration TSD determines the reaction time of
receivers to changing network congestion conditions.  It is recommended
that the time slot duration TSD be set to one of either 0.5, 1.0, or 2.0
seconds.  Associated with each time slot is the time slot index.  The
range of values for the time slot index is [0..G-1] for some value G.
The time slot index increments by one modulo G between each consecutive
time slot.  For example, if G = 32 then the time slot index is 0, 1, 2,
3, ... , 30, 31, 0, 1, ... in consecutive time slots.  It is recommended
that G be set to 128.  G must be set to at least 3.


Luby, Vicisano, Haken                                           [Page 7]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


In order to be able to measure loss within each group, the sender places
consecutive sequence numbers in the packets sent to a group. The
sequence numbers ignore time slot boundaries, i.e., the sequence numbers
within the same group across the time slot boundary are still consecu-
tive. Sequence numbers wrap around at 2^16, i.e., the consecutive
sequence numbers are 0, 1, 2, ..., 2^16-2, 2^16-1, 0, 1, 2, ... .

The sender places into each packet the MRCC group number of the packet.
Thus, all packets sent to the same group must have the same group num-
ber.

The sender places into each packet the time slot index.  Thus, all pack-
ets within the same time slot must have the same time slot index.

The sender places an increase signal trigger into each packet.  The
increase signal trigger is set to either 0 or 1 for each packet.  The
increase signal trigger must be the same for all packets sent to a group
within the same time slot.  An increase signal trigger of 0 indicates no
increase allowed to receivers, and an increase signal trigger of 1 indi-
cates increase is allowed to receivers.

The increase signal triggers are calculated as follows by the sender.
For all i = 0,...,A-1, let p(i) = min {1.0, 20*packet size*TSD/R(i)},
and set p(A) = 0.  Note that 1 >= p(0) >= p(1) >= ... >= p(A-1) >= p(A)
= 0.  Let B be an integer associated with each time slot that increases
by one for each consecutive time slot, and thus t = (B mod G) is the
time slot index.  For each time slot, for each i = 0,...,A, for all
packets sent to the group i that carries rate r(i) during the Bth time
slot, the increase signal trigger is set to 1 if BB <= p(i), and the
trigger is set to 0 otherwise.  Here, BB is derived from B by writing B
in reverse binary notation and considering it as a fraction that is
between 0 and 1.  For example, if B = 253 when written base ten, then
when written in binary B = 11111101, and then BB = 0.10111111 when writ-
ten as a binary fraction.  As a decimal fraction, BB = 0.7421875.  This
method of computing BB from B, where B increases by one at each time
slot, guarantees that increase signal triggers equal to 1 for a given
layer i are very well-spaced out over the time slots, and that on aver-
age the fraction of time slots with increase signal 1 for layer i is
p(i).  Note that increase signal trigger for the group carrying rate
r(A) during a time slot is always set to 0, since p(A) = 0.  This
ensures that there is never an attempt by a receiver to increase the
reception rate above R(A).

This method of setting the increase signal triggers implies the follow-
ing monotonicity property: if the trigger is set to 1 for layer i during
some time slot, then the trigger is also set to 1 for lower layers i' <
i in the same time slot.  This allows receivers at lower layers behind a
bottleneck link to increase to a higher layer when receivers at higher


Luby, Vicisano, Haken                                           [Page 8]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


layers increase to a higher layer.  During other time slot periods,
receivers at lower layers will be allowed to increase to a higher layer
when receivers at higher layers aren't allowed to increase to a higher
layer, thus giving receivers at lower layers a chance to catch up.

4.2. Receiver operation


When a receiver first joins a session, it must only join the base group
and remain joined only to the base group for at least one complete time
slot. The rate r(0) = R(0) of the base group must be small enough that
when a receiver is joined to just this base group there is no signifi-
cant packet loss.  If there is significant packet loss over any signifi-
cant period of time when the receiver is only joined to the base group
then the receiver must leave the session by leaving the base group.  If
there is only one reception rate offered by the session, i.e., there is
only a base group offered by the session and no other groups, then only
this first paragraph is relevant to the receiver congestion control
scheme.  If there are two or more reception rates offered by the ses-
sion, i.e. there is a base group and (at least three) other groups, then
the rest of this subsection is relevant to the receiver congestion con-
trol scheme.

The receiver must keep a timer that tracks the maximum interarrival time
between packets.  Whenever there is an interarrival time that exceeds
TSD, the receiver must leave the session by leaving all groups in the
session immediately.  The receiver may thereafter try to rejoin the ses-
sion.

During a generic time slot t a receiver is joined to some number i, 0 <=
i <= A, of the groups 0, ..., i that have rates r(0), r(1),
r(2),...,r(i), respectively, within time slot t.  Thus, during time slot
t, the receiver is at layer i and has a reception rate R(i). The
receiver does not join or leave any groups in the middle of time slot t.
To simplify the following discussion, let t+1 indicate t+1 mod G. The
receiver only makes changes in group membership at the beginning (indi-
cated by the first packet received) of the next time slot t+1.

If there is at least one packet loss measured in time slot t in any
group then the receiver must leave group i at the beginning of the next
time slot t+1. This will drop the reception rate for the receiver from
layer i to layer i-1, i.e., the reception rate will drop from R(i) to
R(i-1).

If there is no measured packet loss in time slot t then the action of
the receiver depends on the increase signal trigger in group i in time
slot t.  If the increase signal trigger is 0 for group i in time slot t,
indicating the receiver must not increase above layer i, the receiver


Luby, Vicisano, Haken                                           [Page 9]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


does not join or leave any groups at the beginning of time slot t+1. If
the increase signal trigger is 1 for group i in time slot t (in which
case i < a, since the increase signal trigger is always 0 for group A),
indicating the receiver can increase to layer i+1, then the receiver
joins group i+1 at the beginning of time slot t+1.

4.3. General considerations

Generally, the multicast group addresses associated with the MRCC groups
constitute a consecutive range of multicast address space.  For example,
the 21 MRCC groups [0..20] may be bound to the SSM channel addresses
(192.35.134.26, 232.153.220.0) through (192.35.134.26, 232.153.220.20).
However, it is not a requirement that these multicast group addresses be
consecutive.  There can be at most 256 MRCC groups associated with an
MRCC session using the static layer scheme, because there are 8 bits
available for specifying groups in the abstract MRCC packet header.

The number of MRCC groups associated with a session, and the addresses
of the multicast groups or channels bound to these groups, must be part
of the session description information communicated out of band.

5. Dynamic layer scheme

We use several ideas to circumvent the problems associated with long
leave latency.  As before, let A+1 be the number of different reception
rates available for a session, and let R(0) < ... < R(A) be the set of
available reception rates.  One idea is to allocate G+1 MRCC groups to
the session, numbered from 0 to G, where G > A.  MRCC group 0 is called
the base group, and this group always carries packets at rate r(0) =
R(0).  MRCC groups 1 thru G are called dynamic groups, because over time
they carry packets at different rates.  At each point t in time, for all
i = 1, ..., A, there is exactly one dynamic group that carries the rate
r(i), and these groups are called active groups at time t.  The remain-
ing Q = G-A dynamic groups carry zero rate at time t, and these groups
are called quiescent groups at time t.  Which dynamic groups are active
and what rate they carry and which groups are quiescent varies over time
in such a way that the problem of long leave latency is overcome.

5.1. Sender operation


As for the static layer scheme, the sender partitions time into equal
duration intervals called time slots.  The time slot duration TSD deter-
mines the reaction time of receivers to changing network congestion con-
ditions.  It is recommended that the time slot duration TSD be set to
one of either 0.5, 1.0, or 2.0 seconds.  Associated with each time slot
is the time slot index.  The range of values for the time slot index is
[0..G-1], i.e., the number of possible time slot indices is equal to the


Luby, Vicisano, Haken                                          [Page 10]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


number of dynamic groups. The time slot index increments by one modulo G
between each consecutive time slot.  For example, if G = 32 then the
time slot index is 0, 1, 2, 3, ... , 30, 31, 0, 1, ... in consecutive
time slots.

Given the definition of a time slot and time slot index, we can now
define how the rates on the dynamic groups vary over time.  For i =
A+1,...,G, define r(i) = 0.  The idea is that dynamic group j in time
slot t carries packets flowing at rate r(((j-t-1) mod G)+1).  Thus, in
the time slots with indices 0, 1, 2, ..., G-A-1, G-A, G-A+1, ..., G-1,
dynamic group G carries packets at rate r(G) = 0, r(G-1) = 0, r(G-2) =
0,...,r(A+1) = 0, r(A), r(A-1),...r(1), respectively.  Thus, dynamic
group G is quiescent for Q = G-A time slots, then carries rate r(A),
r(A-1), r(A-2), ..., r(1) over the subsequent A time slots.  This same
pattern is then cyclically repeated thereafter.  Each of the other G-1
dynamic groups goes through the same cycle, where the cycle for dynamic
group G-1 is shifted one time slot forward from dynamic group G, and in
general the cycle for dynamic group j is shifted one time slot forward
from dynamic group j+1.  Thus, during each time slot t, for each i =
1,...,G, dynamic group ((i+t-1) mod G) + 1 carries packets at rate r(i).
Thus, during each time slot A groups are active and Q are quiescent.

The reason for this organization of the dynamic groups is that it allows
receivers to quickly decrease their reception rate within one time slot
without requiring small leave latencies.  For 1 <= i <= A, suppose a
receiver is joined to dynamic group j = ((i+t-1) mod G)+1 at time t in
order to receive at rate r(i).  Then, over the i-1 subsequent time slots
t+1, t+2,..., t+i-1 the reception rate of the receiver from group j is
r(i-1), r(i-2),...,r(1).  Then, in the next time slot t+i, the rate of
group j drops to zero and remains there for a total of Q time slots.
The idea is that once a receiver is joined to a dynamic group, the
receiver remains joined to the group independent of all other factors
such as packet loss until the group becomes quiescent, and at this point
in time the receiver immediately leaves the group.  Let LL be the upper
bound on the leave latency, i.e., LL is an upper bound on the time
between when a receiver issues a leave to a group and the time when the
leave takes effect.  Then, in order for the leave to take effect before
the group becomes active again in Q time slots, it must be the case that
LL <= (Q-1) * TSD.  For example, if LL = 10 seconds and TSD is set to 1
second, then Q must be at least 11.   Once a receiver joins a dynamic
group it remains joined to the dynamic group until it becomes quiescent,
at which time the group is left and this takes effect before the group
becomes active again.  As we describe below in the receiver congestion
control scheme, with this property long leave latencies are no longer a
problem, as the reaction time to network congestion is at most one time
slot.

Suppose there is only one reception rate available, i.e., rmin = rmax.


Luby, Vicisano, Haken                                          [Page 11]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


Then the base group is used to transmit at rate rmin and no dynamic
groups are used in the session, and A = 0, Q = 0, and thus G = 0.  When
only the base group is used, the group number in each packet is set to
zero, the time slot index in each packet is set to zero and the increase
signal trigger in each packet is set to zero.  The sequence numbers are
used as normal within the base group, as described below.

The rest of this subsection concerns the case when rmax > rmin, i.e.,
when multiple reception rates are available in the session.  Since there
are at least two different reception rates, A >= 1.  Let LL be the maxi-
mum leave latency.  Based on TSD and LL, Q must be at least LL/TSD + 1.
For example, if LL is 9.3 seconds and TSD is 1.0 second then the minimum
possible value for Q is 11.  Based on this, Q >= 2, and thus the number
G = Q+A of dynamic groups must be at least 3.

In order to be able to measure loss within each MRCC group, the sender
places consecutive sequence numbers in the packets sent to a group. The
sequence numbers ignore time slot boundaries, e.g., even though the rate
of packets sent to a dynamic group changes when the time slot changes,
the sequence numbers within the group across the time slot boundary are
still consecutive. Sequence numbers wrap around at 2^16, i.e., the con-
secutive sequence numbers are 0, 1, 2, ..., 2^16-2, 2^16-1, 0, 1, 2, ...
.

The sender places into each packet the MRCC group number of the packet.
Thus, all packets sent to the same group must have the same group num-
ber.

The sender places into each packet the time slot index.  Thus, all pack-
ets within the same time slot must have the same time slot index.

The sender places an increase signal trigger into each packet.  The
increase signal trigger is set to either 0 or 1 for each packet.  The
increase signal trigger must be the same for all packets sent to a
dynamic group within the same time slot.  An increase signal trigger of
0 indicates no increase allowed to receivers, and an increase signal
trigger of 1 indicates increase is allowed to receivers, in a manner
described  in detail in the description of the receiver congestion con-
trol scheme.

The increase signal triggers that indicate to receivers when to increase
from a given layer to the next layer are calculated as follows by the
sender.  For all i = 0,...,A-1, let p(i) = min {1.0, 20*packet
size*TSD/R(i)}, and set p(A) = 0.  Note that 1 >= p(0) >= p(1) >= ... >=
p(A-1) >= p(A) = 0.  Let B be an integer associated with each time slot
that increases by one for each consecutive time slot, and thus t = (B
mod G) is the time slot index.  For the  time slot, for each i =
0,...,A, for all packets sent to group j = ((i+B-1) mod G)+1 that


Luby, Vicisano, Haken                                          [Page 12]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


carries rate r(i) during the Bth time slot, the increase signal trigger
is set to 1 if BB <= p(i), and the trigger is set to 0 otherwise.  Here,
BB is derived from B by writing B in reverse binary notation and consid-
ering it as a fraction that is between 0 and 1.  For example, if B = 253
when written base ten, then when written in binary B = 11111101, and
then BB = 0.10111111 when written as a binary fraction.  As a decimal
fraction, BB = 0.7421875.  This method of computing BB from B, where B
increases by one at each time slot, guarantees that increase signal
triggers equal to 1 for a given layer i are very well-spaced out over
the time slots, and that on average the fraction of time slots with
increase signal 1 for layer i is p(i).  Note that increase signal trig-
ger for the dynamic group carrying rate r(A) during a time slot is
always set to 0, since p(A) = 0.  This ensures that there is never an
attempt by a receiver to increase the reception rate above R(A).

This method of setting the increase signal triggers implies the follow-
ing monotonicity property: if the trigger is set to 1 for layer i during
some time slot, then the trigger is also set to 1 for lower layers i' <
i in the same time slot.  This allows receivers at lower layers behind a
bottleneck link to increase to a higher layer when receivers at higher
layers increase to a higher layer.  During other time slot periods,
receivers at lower layers will be allowed to increase to a higher layer
when receivers at higher layers aren't allowed to increase to a higher
layer,  thus giving receivers at lower layers a chance to catch up.

5.2. Receiver operation

When a receiver first joins a session, it must only join the base group
and remain joined only to the base group for at least one complete time
slot. The rate r(0) = R(0) of the base group must be small enough that
when a receiver is joined to just this base group there is no signifi-
cant packet loss.  If there is significant packet loss over any signifi-
cant period of time when the receiver is only joined to the base group
then the receiver must leave the session by leaving the base group.  If
there is only one reception rate offered by the session, i.e., there is
only a base group offered by the session and no dynamic groups, then
only this first paragraph is relevant to the receiver congestion control
scheme.  If there are two or more reception rates offered by the ses-
sion, i.e. there is a base layer and (at least three) dynamic layers,
then the rest of this subsection is relevant to the receiver congestion
control scheme.

The receiver must keep a timer that tracks the maximum interarrival time
between packets.  Whenever there is an interarrival time that exceeds
TSD, the receiver must leave the session by leaving all groups in the
session immediately.  The receiver may thereafter try to rejoin the ses-
sion.


Luby, Vicisano, Haken                                          [Page 13]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


During a generic time slot t a receiver is joined to the base group at
rate r(0) and  some number i, 0 <= i <= A, of dynamic groups that have
rates r(1), r(2),...,r(i), respectively, within time slot t.  Thus, dur-
ing time slot t, the receiver is at layer i and has a reception rate
R(i). The receiver does not join or leave any groups in the middle of
time slot t.  The receiver only makes changes in group membership at the
beginning (indicated by the first packet received) of the next time slot
after t.

If the receiver is joined to one or more dynamic groups in addition to
the base group (i >= 1) in time slot t then at the beginning of the next
time slot after t the receiver must leave the dynamic group that was
carrying packets at rate r(1) during time slot t, i.e. the receiver must
leave dynamic group j = (t mod G) + 1.  Dynamic group j will be quies-
cent for Q time slots after the end of time slot t, and thus by the time
group j becomes active again, the leave request will have been pro-
cessed.

If there is at least one packet loss measured in time slot t then the
receiver must not join any groups at the beginning of the next time slot
after t.  If the receiver is joined to at least one dynamic group in
addition to the base group (i >= 1) then the effect of this inaction
will be to drop from layer i to layer i-1, i.e., the reception rate will
drop from R(i) to R(i-1).  This is because, for all 2 <= i' <=i, the
dynamic group that carries rate r(i') in time slot t will carry rate
r(i'-1) in the time slot after t, and the dynamic layer that carries the
rate r(1) in time slot t drops to zero and remains at zero for Q time
slots at the end of time slot t.  The net effect is that the reception
rate drops by r(i) at the beginning of the time slot after t, and thus
the reception rate drops from R(i) to R(i-1).

If there is no measured packet loss in time slot t then how many groups
the receiver joins at the beginning of the time slot after t depends on
the increase signal trigger for layer i in time slot t.  The increase
signal trigger for layer i in time slot t is carried in packets in group
((i+t-1) mod G)+1, since this is the group that is carrying packets at
rate r(i) in time slot t.  If the increase signal trigger is 0 for layer
i in time slot t, indicating the receiver must not increase above layer
i, the receiver joins dynamic group j = ((i+t) mod G)+1 at the beginning
of the time slot after t.  This is because j is the group carrying pack-
ets at rate r(i) during the time slot after t and this will maintain the
receiver reception rate at R(i) = R(i-1) + r(i).  If the increase signal
trigger is 1 for layer i in time slot t (in which case i < a, since the
increase signal trigger is always 0 for the group carrying rate r(A) in
time slot t), indicating the receiver can increase to layer i+1, then
the receiver joins dynamic groups j = (i+t) mod G + 1 and j' = (i+1+t)
mod G + 1.  This is because j is the group carrying packets at rate r(i)
during the time slot after t, and j' is the group carrying packets at


Luby, Vicisano, Haken                                          [Page 14]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


rate r(i+1) during the time slot after t, and this will increase the
receiver reception rate to R(i+1) = R(i-1) + r(i) + r(i+1).

Thus, at the beginning of each time slot, the receiver does at most one
leave and at most two joins.

5.3. General considerations

Generally, the multicast group addresses associated with the MRCC groups
constitute a consecutive range of multicast address space.  For example,
the 21 MRCC groups [0..20] may be bound to the SSM channel addresses
(192.35.134.26, 232.153.220.0) through (192.35.134.26, 232.153.220.20).
However, it is not a requirement that these multicast group addresses be
consecutive.  Besides the MRCC base group 0, there can be at most 128
dynamic MRCC groups associated with an MRCC session using a dynamic
layer scheme, or 129 groups in total including the base group. This is
because there are 7 bits allocated to time slot indices in the abstract
MRCC packet header, and thus there are at most 128 time slot indices
possible, and there is a one-to-one correspondence between time slot
indices and dynamic groups.

The number of MRCC groups associated with a session, and the addresses
of the multicast groups or channels bound to these groups, must be part
of the session description information communicated out of band.

6. Abstract MRCC packet header

MRCC defines the congestion control information that must be carried in
each packet header.  This information is of the same format for both the
static layer scheme and for the dynamic layer scheme.  Other information
may be required in the packet header to support the scheme that is using
MRCC to implement congestion control, but this is outside the scope of
this document.  Thus, although we refer to packets as MRCC packets,
other protocols that embed MRCC header information into packets may con-
sider the packets to be of the type of the overall protocol.  For exam-
ple, the ALC protocol instantiation [ALC00] may use MRCC for congestion
control, and the packets that ALC uses are considered to be ALC packets,
even though these packets contain MRCC header information.

The MRCC packets that contain the required congestion control informa-
tion are sent by the sender(s) to a multicast IP destination address.
In the MRCC header, all integer fields are carried in "big-endian" or
"network order", that is, most significant byte (octet) first.  Unless
otherwise noted, numeric constants are in decimal (base 10).

A MRCC packet header contains the following congestion control informa-
tion that is placed into each packet by a sender:


Luby, Vicisano, Haken                                          [Page 15]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


o Increase signal trigger (T): 1 bit

o Time slot index (TSI): 7 bits

o MRCC group number (GN): 8 bits

o Sequence number (SEQNO): 16 bits


The required congestion control information required in a MRCC packet is
depicted in Figure 1 below.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|T|    TSI      |      GN       |             SEQNO             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 1 - MRCC packet header layout

7. Fairness

A primary goal for MRCC session is to be fair to other MRCC sessions as
well as to sessions using other congestion control schemes within other
protocols such as TCP.  In particular, if several sessions are flowing
through a bottleneck link, then it is desirable for the sessions to
share the bandwidth capacity of the link fairly, and it is also desir-
able that the link not be overly congested.  A crucial variable in
determining the fair share of multiple TCP sessions flowing through a
bottleneck link is the round trip time (RTT).  In general, the smaller
the RTT for TCP the more aggressive the session will be against other
sessions, including other TCP sessions with larger RTTs.  With MRCC it
is desirable that receivers behind a common bottleneck link are joined
to the same set of groups in the session at each point in time indepen-
dent of their distance from the sender.  Thus, the role that RTT plays
in TCP with respect to fairness does not make sense for MRCC.  In this
document, the MRCC parameters are set so that a MRCC session shares
approximately equally with other MRCC sessions, and a MRCC session
shares approximately equally with a TCP session with a RTT of 200 ms.
This is assuming all sessions use the same packet length.

8. Applications

MRCC is a good choice for a congestion control scheme to use with ALC
[ALC00].  With ALC, FEC codes [FEC00] are used to provide reliability.
When using MRCC, ALC sends encoded data about an object to each group
that is part of a session.  With appropriate use of FEC codes, the data
sent to the different groups in the session, or in some cases even


Luby, Vicisano, Haken                                          [Page 16]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


multiple sessions from different senders, can be effectively used by
receivers to recover the original object.  With the ALC approach to
reliable multicast, the reliability scheme based on FEC codes can be
made to be completely separate and independent of the congestion control
scheme based on MRCC.  Furthermore, both the FEC codes described in
[FEC00] and MRCC can be used so that in the overall ALC protocol
receivers do not send packets to the sender except perhaps to request
extension of an ongoing session or to confirm complete receipt of an
object.  Thus, ALC yields a massively scalable reliable content distri-
bution protocol using an IP multicast enabled network.

MRCC is potentially also applicable to other applications.  For example,
it is possible that MRCC could be used as the congestion control scheme
for a layered media streaming application.

9. MRCC and Generic Router Assist

Router filtering of packets to assist in congestion control is described
in [LUB99].  The addresses of the multicast groups can be communicated
to the routers and they can do filtering of groups based on congestion.
This is one of the reasons why it is good to have the congestion control
information portion of the packet header in a fixed place at the begin-
ning of the packet, so that the routers can probe this field if neces-
sary (although it may not be).  A full exploration of this approach is
outside the scope of this document.

10. Intellectual Property Issues


Digital Fountain has patents pending for congestion control schemes that
may be needed to use some of the congestion control schemes described in
this document in a commercial product or service.  Digital Fountain is
willing to provide a blanket royalty free license to the rights it holds
that are needed to use the congestion control schemes described in this
document if and when these congestion control schemes become part of the
IETF standards.

11. References

[AFZ95]   Acharya, S., Franklin, M., and Zdonik, S.,
"Dissemination-Based Data Delivery Using Broadcast Disks", IEEE
Personal Communications, pp.50-60, Dec 1995.

[ALC00] Luby, M., Gemmell, J., Vicisano, L., Rizzo, L., Crowcroft, J.,
Lueckenhoff, B., "RMT: Asynchronous Layered Coding protocol
instantiation", work in progress.

[BLA94]   Blahut, R.E., "Theory and Practice of Error Control Codes",


Luby, Vicisano, Haken                                          [Page 17]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


Addison Wesley, MA 1984.

[BYE98]   Byers, J.W., Luby, M., Mitzenmacher, M., and Rege, A., "A
Digital Fountain Approach to Reliable Distribution of Bulk Data",
Proceedings ACM SIGCOMM '98, Vancouver, Canada, Sept 1998.

[BYE00] Byers, J.W., Frumin, M., Horn, G., Luby, M., Mitzenmacher, M.,
Roetter, A., "Improved Congestion Control for IP Multicast Using
Dynamic Layers", Digital Fountain research paper, June 2000.

[CAI99] Cain, B., Speakman, T., and Towsley, D., "Generic Router
Assist (GRA) Building Block, Motivation and Architecture", Internet
Draft draft-ietf-rmt-gra-arch-00.txt, a work in progress.

[DEE88]   Deering, S., "Host Extensions for IP Multicasting", RFC 1058,
Stanford University, Stanford, CA, 1988.

[FEC00] Luby, M., Vicisano, L., Rizzo, L., Gemmell, J., "RMT: FEC
codes building block", work in progress.Luby, M.,

[GEM99]   Gemmell, J., Schooler, E., and Gray, J., "FCast Scalable
Multicast File Distribution: Caching and Parameters Optimizations",
Technical Report MSR-TR-99-14, Microsoft Research, Redmond, WA, April,
1999.

[HAN98]   Handley, M., and Jacobson, V., "SDP: Session Description
Protocol", RFC 2327, April 1998.

[HAN96]   Handley, M., "SAP: Session Announcement Protocol", Internet
Draft, IETF MMUSIC Working Group, Nov 1996.

[HOL99] Holbrook, H., Cheriton, D., "IP Multicast Channels: Experss
Support for Large-scale Single-source Applications", ACM SIGCOMM'99

[HOR00] Horn, G., Luby, M., Mitzenmacher, M., "Fair Layered
Increase/Decrease Congestion Control", Digital Fountain research
paper, June 2000.

[LUB99] Luby, M., Vicisano, L., Speakman, T. "Heterogeneous multicast
congestion control based on router packet filtering", presented at RMT
meeting in Pisa, March 1999.

[R2068]   Fielding, R., Gettys, J., Mogul, J. Frystyk, H., Berners-Lee,
T., Hypertext Transfer Protocol - HTTP /1.1 (IETF RFC2068)
http://www.rfc-editor.org/rfc/rfc2068.txt

[MCC96] McCanne, S., Jacobson, V., Vetterli., M., "Receiver-driven
Layered Multicast", proc. of Sigcomm'96, August 1996, Stanford, CA.


Luby, Vicisano, Haken                                          [Page 18]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


[R2119]   Bradner, S., Key words for use in RFCs to Indicate Requirement
Levels (IETF RFC 2119) http://www.rfc-editor.org/rfc/rfc2119.txt

[RIZ97a] Rizzo, L, and Vicisano, L., "Reliable Multicast Data
Distribution protocol based on software FEC techniques", Proceedings
of the Fourth IEEES Workshop on the Architecture and Implementation of
High Performance Communication Systems, HPCS'97, Chalkidiki, Greece,
June 1997.

[RIZ97b] Rizzo, L., and Vicisano, L., "Effective Erasure Codes for
Reliable Computer Communication Protocols", ACM SIGCOMM Computer
Communication Review, Vol.27, No.2, pp.24-36, Apr 1997.

[RIZ97c] Rizzo, L., "On the Feasibility of Software FEC", DEIT Tech
Report, http://www.iet.unipi.it/~luigi/softfec.ps, Jan 1997.

[RUB99]   Rubenstein, Dan, Kurose, Jim and Towsley, Don, "The Impact of
Multicast Layering on Network Fairness", Proceedings of ACM SIGCOMM'99.

[VIC98A] L.Vicisano, L.Rizzo, J.Crowcroft, "TCP-like Congestion
Control for Layered Multicast Data Transfer", IEEE Infocom '98, San
Francisco, CA, Mar.28-Apr.1 1998.

[VIC98B] Vicisano, L., "Notes On a Cumulative Layered Organization of
Data Packets Across Multiple Streams with Different Rates", University
College London Computer Science Research Note RN/98/25, Work in
Progress (May 1998).  .fi

Authors' Addresses

   Michael Luby
   luby@digitalfountain.com
   Digital Fountain
   600 Alabama Street
   San Francisco, CA, USA, 94110

   Lorenzo Vicisano
   lorenzo@cisco.com
   cisco Systems, Inc.
   170 West Tasman Dr.,
   San Jose, CA, USA, 95134

   Armin Haken
   armin@digitalfountain.com
   Digital Fountain
   600 Alabama Street
   San Francisco, CA, USA, 94110


Luby, Vicisano, Haken                                          [Page 19]

Internet Draft    RMT BB, Multirate Congestion Control         June 2000


Luby, Vicisano, Haken                                          [Page 20]