Internet Engineering Task Force                                RMT WG
INTERNET-DRAFT                                         Adamson/Macker
draft-macker-rmt-mdp-00.txt                               Newlink/NRL
                                                       22 October 1999
Expires: Apr 2000

             The Multicast Dissemination Protocol (MDP)

Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups.  Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Copyright Notice

Copyright (C) The Internet Society (1999).  All Rights Reserved.

Abstract

The Multicast Dissemination Protocol (MDP) is a protocol framework
designed to provide reliable multicast data and file delivery
services on top of the generic UDP/IP multicast transport [1].  MDP
is well suited for reliable multicast bulk transfer of data across a
heterogeneous internetwork.  Further enhancements make the protocol
suitable for a range of network environments, including wireless
internetwork environments.  At its core, MDP is an efficient negative
acknowledgement (NACK) based reliable multicast protocol that
leverages erasure-based coding to improve protocol efficiency and
robustness.  MDP also includes an optional adaptive end-to-end
rate-based congestion control mode that is designed to operate with
competing flows (e.g., TCP sessions or other congestion-aware
flows).  This document describes the protocol messages, building
blocks, general operation, and optional modes of the present MDP
instantiation and implementation.

1.0 Background

This document describes the Multicast Dissemination Protocol (MDP),
a protocol framework for reliable multicast data delivery that is
especially suitable for efficient bulk data transfer.  MDP is
expected to meet many of the criteria described in [2].  The core
MDP framework makes no design assumptions about network structure,
hierarchy, or reciprocal routing paths.  The techniques and building
blocks utilized in MDP are directly applicable to "flat" multicast
groups but could be applied to a given level of a hierarchical
(e.g., tree-based) multicast distribution system if so desired.
Working MDP applications have been demonstrated across a range of
network architectures and heterogeneous conditions, including the
worldwide Internet MBone, bandwidth and routing asymmetries,
satellite networks, and mobile wireless networks.

Previous work on an earlier MDP design was implemented as part of
the freely available Image Multicaster (IMM) reliable multicast
application used and tested over the Internet Multicast Backbone
(MBone) since 1993 [3].  This document describes a more recent
design and implementation of the Multicast Dissemination Protocol
(MDP).  The authors intend the present design to replace previous
MDP work, and this document only references and discusses the recent
design work.
2.0 Protocol Motivation and Design Overview

MDP provides end-to-end reliable transport of data over IP multicast
capable networks.  The primary design goal of MDP is to provide an
efficient, scalable, and robust bulk data (e.g., computer files,
transmission of persistent data) transfer capability adaptable
across heterogeneous networks and topologies.  MDP provides a number
of different reliable multicast services and modes of operation as
described in different parts of this document.  The goal of this
flexible approach is to provide a useful set of reliable multicast
building blocks.  In addition, while the current capabilities of MDP
focus on meeting specific bulk data and limited messaging transfer
requirements, the MDP framework is envisioned to be extended to meet
additional requirements in the future.  The following factors were
important considerations in the MDP design:

1) Heterogeneous, WAN-based networking operation
2) Minimal assumption of network structure for general operation
3) Operation over a wide range of topologies and network link rates
4) Efficient asymmetric operation
5) Low protocol overhead and minimal receiver feedback
6) Potential use in large group sizes
7) Loose timing constraints and minimal group coordination
8) Dynamic group session support (members may leave and join)

The current MDP protocol employs a form of parity-based repair using
packet-level forward error correction coding techniques similar to
the basic concept described in [4].  The use of parity-based repair
for multicast reliability offers significant performance advantages
in the case of uncorrelated packet loss among receivers (such as in
broadcast wireless environments or WAN distribution) [5, 6].  The
technique can also be leveraged to increase the effectiveness of
receiver-based feedback suppression.  These encoded parity packets
are generally sent only "on demand" in response to repair requests
from the receiver group.  However, the protocol can be optionally
configured to transmit some portion of repair packets proactively to
potentially increase protocol performance (throughput and/or delay)
in certain conditions (e.g., "a priori" expected group loss, long
delays, some asymmetric network conditions, etc.) [7].

Another aspect of the MDP protocol design is providing support for
distributed multicast session participation with minimal
coordination among sources and receivers.  The protocol allows
sources and receivers to dynamically join and leave multicast
sessions at will with minimal overhead for control information and
timing synchronization among participants.  To accommodate this
capability, MDP protocol message headers contain some common
information allowing receivers to easily synchronize to sources
throughout the lifetime of a defined session.  These common headers
also include support for collection of transmission timing
information (e.g., round trip delays) that allows MDP to adapt
itself to a wide range of dynamic network conditions with little or
no pre-configuration.  The protocol was purposely designed to be
tolerant of inaccurate timing estimation and lossy conditions, which
may occur in mobile and wireless networks.  The protocol is also
designed to converge even under heavy packet loss and large queueing
or transmission delays.
Scalability concerns in data multicasting have led to a general
increase in interest in, and adoption of, negative acknowledgement
(NACK) based protocol schemes [8].  MDP is a protocol centered on
the use of selective NACKs to request repair of missing data.  MDP
also uses NACK suppression methods and dynamic event timers to
reduce retransmission requests and avoid congestion within the
network.  When used in pure multicast session operation, both NACKs
and repair transmissions are multicast to the group to aid in
feedback and control message suppression.  This feature, along with
additional message aggregation functionality, reduces the likelihood
of multicast control message implosion.  MDP also dynamically
collects group timing information and uses it to further improve its
data delivery efficiency in terms of latency, overhead, and minimal
redundant transmissions.

In summary, the MDP design goals were to create a scalable, reliable
multicast transport protocol capable of operating in heterogeneous
and possibly mobile internetwork environments.  The capability for
fully distributed operation with minimal pre-coordination among the
group, including the ability for participants to join and leave at
any time, was also an important consideration.  MDP is intended to
be suitable primarily for bulk data and file transfer, with eventual
support for streaming and other group data transport paradigms.
While the various features of MDP are designed to provide some
measure of general-purpose utility, we wish here to reemphasize the
importance of understanding that "no one size fits all" in the
reliable multicast transport arena.  There are numerous engineering
tradeoffs involved in reliable multicast transport design, and these
require increased consideration of application and network
architecture requirements.  Some performance requirements affecting
design include: group size, heterogeneity (e.g., capacity and/or
delay), asymmetric delivery, data ordering, delivery delay, group
dynamics, mobility, congestion control, and transport across low
capacity connections.

MDP contains various options to accommodate many of these differing
requirements.  However, MDP is intended to work most efficiently as
a reliable multicast bulk data transfer protocol in environments
where protocol overhead and heterogeneity are primary concerns.
Likely application areas include mobile wireless, asymmetric
satellite, and heterogeneous WAN conditions.  MDP's most general
mode of operation assumes little or no structure in the network
architecture and works in an end-to-end fashion.  This does not
preclude the adaptation of the protocol to more structured
applications (e.g., reliable multicast hierarchy, addition of local
repair mechanisms, sessions spanning multiple groups, etc.).

3.0 MDP Protocol Definition

3.1 Assumptions

An MDP protocol "session" instantiation (MdpSession) is defined by
participants communicating User Datagram Protocol (UDP) packets over
an Internet Protocol (IP) network on a common, pre-determined
network address and host port number.  Generally, the participants
exchange packets on an IP multicast group address, but unicast
transport may also be established or applied as an adjunct to
multicast delivery.  Currently the protocol uses a single multicast
address for transmissions associated with a given MDP session.
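For illustration only, the following sketch shows how a session
participant might join such a multicast group and bind the common
port using the standard BSD sockets API.  The helper name, the
IPv4-only handling, and the minimal error handling are assumptions
of this sketch and are not part of the protocol specification:

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Hypothetical helper: join an MdpSession's multicast group on
     * the given address/port.  Returns a bound UDP socket, or -1 on
     * error. */
    int JoinMdpSession(const char *groupAddr, unsigned short port)
    {
        int sock = socket(AF_INET, SOCK_DGRAM, 0);
        if (sock < 0) return -1;

        struct sockaddr_in local;
        memset(&local, 0, sizeof(local));
        local.sin_family = AF_INET;
        local.sin_addr.s_addr = htonl(INADDR_ANY);
        local.sin_port = htons(port);
        if (bind(sock, (struct sockaddr *)&local, sizeof(local)) < 0)
        {
            close(sock);
            return -1;
        }

        /* Join the session's pre-determined multicast group. */
        struct ip_mreq mreq;
        mreq.imr_multiaddr.s_addr = inet_addr(groupAddr);
        mreq.imr_interface.s_addr = htonl(INADDR_ANY);
        if (setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP,
                       &mreq, sizeof(mreq)) < 0)
        {
            close(sock);
            return -1;
        }
        return sock;
    }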
However, in the future, multiple multicast addresses might be
employed to segregate different degrees of repair information to
different groups of receivers experiencing different packet loss
characteristics with respect to a given source.  This capability is
under ongoing investigation.  The protocol also supports asymmetry:
receiver participants may transmit back to source participants via
unicast routing instead of transmitting to the session multicast
address.

Each participant (MdpNode) within an MdpSession is assumed to have a
preselected, unique 32-bit identifier (MdpNodeId).  Source MdpNodes
MUST have uniquely assigned identifiers within a single MdpSession
to distinguish multiple sources.  Receivers SHOULD have unique
identifiers to avoid certain protocol inefficiencies that may occur,
particularly when operating with congestion control modes enabled or
when using MDP's optional positive acknowledgement feature.  The
protocol does not preclude multiple source nodes actively
transmitting within the context of a single MDP session (i.e.,
many-to-many operation), but any type of interactive coordination
among these sources is assumed to be controlled at a higher protocol
layer.

Unique data content transmitted within an MdpSession uses
source-specific identifiers (MdpObjectTransportId) which are valid
and applicable only during the actual _transport_ of the particular
portion of data content.  Any globally unique identification of
transported data content must be assigned and processed by the
higher level application using the MDP transport service.

[ASIDE:  It is anticipated that if, in the future, MDP is extended
to support local repair mechanisms, the application will interact
with the MDP transport service to process any _global_ data
identifiers which may be required to support such operation.  The
same is true of possible extensions to MDP to support operation as
part of a more highly structured (e.g., tree-based) data
dissemination system.  There are also numerous other supporting
capabilities that may be required to implement certain types of
multicast applications.  While it does not currently support them
specifically, MDP could play a role in the provision of these
services in addition to its role of data transport.  Examples of
such capabilities include security (key distribution, group
authentication) or multicast session management.]

3.2 General MDP Source and Receiver Messaging and Interaction

An MDP source primarily generates messages of type MDP_DATA and
MDP_PARITY, which carry the data content and related parity-based
repair information for the bulk data (or file) objects being
transferred.  The MDP_PARITY information is by default sent only on
demand, thus normally requiring no additional protocol overhead.
The transport of an object can be optionally configured to
proactively transmit some amount of MDP_PARITY messages with the
original MDP data blocks to potentially enhance performance (e.g.,
improved delay).  This configuration MAY be sensible for certain
network conditions and can also allow for robust, asymmetric
multicast (e.g., unidirectional routing, satellite, cable).  A
source message of type MDP_INFO is also defined and is used to carry
any optional "out-of-band" context information for a given transport
object.
The content of MDP_INFO messages is repaired with a lower-delay
process than the general encoded data and thus may serve special
purposes in a reliable multicast application.  The source also
generates messages of type MDP_CMD to perform certain protocol
operations such as congestion control probing, end-of-transmission
flushing, round trip time estimation, optional positive
acknowledgement requests, and "squelch" commands to indicate to
requesting receivers the non-availability of previously-available or
obsolete data.

An MDP receiver generates messages of type MDP_NACK or MDP_ACK in
response to transmissions of data and commands from a source.  The
MDP_NACK messages are generated to request repair of detected data
transmission losses.  Receivers generally detect losses by tracking
the sequence of transmission from a source.  Sequencing information
is embedded in the transmitted data packets and end-of-transmission
commands from the source.  MDP_ACK messages are generated in
response to certain commands transmitted by the source.  In the
general (and most scalable) protocol mode, receivers do not transmit
any MDP_ACK messages.  However, in order to meet potential user
requirements for positive data acknowledgement, and to collect more
detailed information for potential multicast congestion control
algorithms, MDP_ACK messages are defined and potentially used.
MDP_ACK messages are also generated by a small subset of receivers
when MDP dynamic end-to-end congestion control is in operation.

In addition to the messages described above, the protocol defines an
optional MDP_REPORT message that is periodically transmitted by all
source and receiver nodes.  The MDP_REPORT message contains some
additional session level information, such as a string identifying
the node's "name", and provides a mechanism for collecting group
statistics on protocol operation.  The MDP_REPORT messages are not
critical for operation except during use of a current experimental
feature which allows for automated formation of subset receiver
groups participating in positive acknowledgement of data
transmissions.  In the future, the content of MDP_REPORT messages
may be determined by the application, but currently it is fixed to
contain performance statistics reporting.

The current definition of MDP allows for reliable transfer of two
different types of data content.  These include the type
MDP_OBJECT_DATA, which consists of static, persistent blocks of data
content maintained in the source's application memory storage, and
the type MDP_OBJECT_FILE, which corresponds to data stored in the
source's non-volatile file system.  Both of these current types
represent "MdpObjects" of finite size which are encapsulated for
transmission and are temporarily yet uniquely identified by the
given source's MdpNodeId and a temporarily unique
MdpObjectTransportId.

All transmissions by individual sources and receivers are subject to
rate control governed by a peak transmission rate set for each
participant by the application.  This can be used to limit the
quantity of multicast data transmitted by the group.  When MDP's
congestion control algorithm is enabled, the rate for sources is
automatically adjusted.  Even when congestion control is enabled, it
may be desirable in some cases to establish minimum and maximum
bounds for the rate adjustment, depending upon the application.
[ASIDE:  The protocol has been designed with future support
envisioned for data content of type MDP_OBJECT_STREAM which will
correspond to an unbounded "stream" of messages (small and/or large
in size).  Although this behavior can be emulated with transmission
of a series of MdpObjects of type MDP_OBJECT_DATA, there are
additional protocol efficiencies which can be realized with true
"stream" support.  In the long term, the use of the
MDP_OBJECT_STREAM type may supplant the other object types for most
applications.  This document will be updated when that design is
complete.]

3.3 Message Type and Header Definitions

This section describes the message formats used in MDP.  Note that
these messages do not currently adhere to any particular machine
alignment methodology.  During development of this protocol design,
message field alignment has not been explicitly addressed, so it is
likely that some optimization of the protocol message alignment,
resulting in changes to the message formats, will occur in the
future.  Therefore, please note that the message formats presented
here represent the current experimental implementation and are
documented here for purposes of describing the fields'
functionality.  The field values are presented in standard network
byte order (Big Endian) for those fields greater than one byte (8
bits) in length.

3.3.1 MDP Common Message Header

All MDP protocol messages begin with a common header with
information fields as follows:

   +---------+---------------+--------------------------------+
   | Field   | Length (bits) | Purpose                        |
   +---------+---------------+--------------------------------+
   | type    | 8             | MDP message type               |
   +---------+---------------+--------------------------------+
   | version | 8             | Protocol version number        |
   +---------+---------------+--------------------------------+
   | node_id | 32            | Message originator's MdpNodeId |
   +---------+---------------+--------------------------------+

The message "type" field is an 8-bit value indicating the MDP
protocol message type.  These types are defined as follows:

        Message Type    Value
        MDP_REPORT        1
        MDP_INFO          2
        MDP_DATA          3
        MDP_PARITY        4
        MDP_CMD           5
        MDP_NACK          6
        MDP_ACK           7

The "version" field is an 8-bit value indicating the protocol
version number.  Currently, MDP implementations SHOULD ignore
received messages with a different protocol version number.  This
number is intended to indicate and distinguish upgrades of the
protocol that may be non-interoperable.

The "node_id" is a 32-bit value uniquely identifying the source of
the message.  A participant's MDP node identifier (MdpNodeId) can be
set according to the application's needs, but unique identifiers
must be assigned within a single MdpSession.  In many cases, use of
the host IP address can suffice, but in some cases alternative
methodologies for assignment of unique node identifiers within a
multicast session may need to be considered.  For example, the
"source identifier" mechanism defined in the RTPv2 specification [9]
may be applicable for use as MDP node identifiers.  At this point in
time, the protocol makes no assumptions about how these unique
identifiers are actually assigned.
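For illustration, the common header layout above might be
represented and serialized as follows in C.  The type names are
assumptions of this sketch (the document specifies only the wire
fields); because the wire format carries the fields back-to-back
without alignment padding, the sketch serializes field-by-field in
network byte order rather than copying a struct directly:

    #include <arpa/inet.h>
    #include <stdint.h>
    #include <string.h>

    /* MDP message types, per the table above. */
    typedef enum {
        MDP_REPORT = 1,
        MDP_INFO   = 2,
        MDP_DATA   = 3,
        MDP_PARITY = 4,
        MDP_CMD    = 5,
        MDP_NACK   = 6,
        MDP_ACK    = 7
    } MdpMessageType;

    /* Common header: 8-bit type, 8-bit version, 32-bit node_id. */
    typedef struct {
        uint8_t  type;     /* MdpMessageType                  */
        uint8_t  version;  /* protocol version number         */
        uint32_t node_id;  /* originator's MdpNodeId          */
    } MdpCommonHeader;

    /* Illustrative serializer: 6 bytes, network byte order. */
    size_t PackCommonHeader(const MdpCommonHeader *h, uint8_t *buf)
    {
        buf[0] = h->type;
        buf[1] = h->version;
        uint32_t id = htonl(h->node_id);
        memcpy(&buf[2], &id, 4);
        return 6;
    }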
3.3.2 MDP_REPORT Message

The MDP_REPORT message is used to report status information to other
session participants.  This report currently includes node name
information for the purposes of potential bookkeeping, the reporting
address identifier, and a number of statistics relating to the
present source session.  A report includes the following information
and estimates based upon the reporting node's local activity:
duration of session participation, successful transfers, pending
transfers, failed transfers, source re-syncs, block loss statistics,
transmit rate, transmitted NACKs, suppressed NACKs, buffer
utilization, average goodput, and receiver rate.

The MDP_REPORT message is not required for protocol operation, but
provides useful periodic feedback for protocol debugging,
performance monitoring, and statistical estimation.  In addition to
the common header, the MDP_REPORT message contains the following
fields:

   +---------+---------------+-------------------------------+
   | Field   | Length (bits) | Purpose                       |
   +---------+---------------+-------------------------------+
   | status  | 8             | Reporting node's status flags |
   +---------+---------------+-------------------------------+
   | flavor  | 8             | Type of MDP_REPORT message    |
   +---------+---------------+-------------------------------+
   | content | --            | Flavor-dependent content      |
   +---------+---------------+-------------------------------+

The "status" field contains flags indicating the sending node's
current operating mode.  The flags currently defined include:

   +------------+-------+------------------------------------------+
   | Flag       | Value | Purpose                                  |
   +------------+-------+------------------------------------------+
   | MDP_CLIENT | 0x01  | Node is participating as a client        |
   |            |       | (receiver)                               |
   +------------+-------+------------------------------------------+
   | MDP_SERVER | 0x02  | Node is participating as a server        |
   |            |       | (source)                                 |
   +------------+-------+------------------------------------------+
   | MDP_ACKING | 0x04  | Node wishes to provide positive          |
   |            |       | acknowledgements                         |
   +------------+-------+------------------------------------------+

The MDP_CLIENT and MDP_SERVER flags indicate the reporting node's
levels of participation in the corresponding MdpSession.  The
MDP_ACKING flag is set by reporting nodes wishing to participate in
positive acknowledgement cycles.  This is not a robust mechanism for
forming positive acknowledgement receiver subsets and is provided
for experimental purposes.

The "flavor" field indicates the type of MDP_REPORT message.
Currently, only one type of MDP_REPORT is defined:
MDP_REPORT_HELLO (value = 1).  This report contains a string with
the "name" of the reporting MdpNode and a detailed periodic
reception statistics report.

(TBD) The contents of the "client_stats" field will be described in
the future.  (In the present implementation, this field includes
transmission/reception data rates, object transfer
successes/failures, goodput measurement, buffer utilization/overrun
reports, loss rate histograms, etc.)

3.3.3 MDP_INFO Message

The object information message is used by sources to announce a
small amount of "out-of-band" information regarding an object in
transport.  The information content must fit within the source's
current maximum "segment_size" setting.  Since the MDP_INFO content
is entirely contained within a single MDP message, it allows for a
shorter turn-around time for receivers to "NACK" for the information
and for the source to subsequently provide a repair retransmission
(NACK aggregation is greatly simplified under this condition).
There are several uses envisioned for the MDP_INFO content in
general multicast applications.  For example, MDP_INFO packets may
be useful for carrying _global_ object identifiers used as part of
an SRM-like local repair protocol [10] embedded within MDP.
MDP_INFO may also be useful for reliable multicast session
management purposes.  Additionally, when MDP_OBJECT_STREAM objects
are introduced, the attached MDP_INFO may be useful for providing
context information (MIME-type info, etc.) for the corresponding
stream.  Note that the availability of MDP_INFO for a given object
is optional.  A flag in the header of MDP_DATA and MDP_PARITY
packets indicates the availability of MDP_INFO for a given transport
object.  In addition to the MDP common message header, MDP_INFO
messages contain the following fields:

   +--------------+---------------+---------------------------------+
   | Field        | Length (bits) | Purpose                         |
   +--------------+---------------+---------------------------------+
   | sequence     | 16            | Packet loss detection sequence  |
   |              |               | number                          |
   +--------------+---------------+---------------------------------+
   | object_id    | 32            | MdpObjectTransportId identifier |
   +--------------+---------------+---------------------------------+
   | object_size  | 32            | Size of object (in bytes)       |
   +--------------+---------------+---------------------------------+
   | ndata        | 8             | Source's FEC data block size    |
   +--------------+---------------+---------------------------------+
   | nparity      | 8             | Maximum available parity per    |
   |              |               | block                           |
   +--------------+---------------+---------------------------------+
   | flags        | 8             | Object transmission flags       |
   +--------------+---------------+---------------------------------+
   | grtt         | 8             | Quantized current source GRTT   |
   |              |               | estimate                        |
   +--------------+---------------+---------------------------------+
   | segment_size | 16            | Source maximum segment payload  |
   |              |               | (bytes)                         |
   +--------------+---------------+---------------------------------+
   | data         | --            | Info content (up to             |
   |              |               | "segment_size" bytes)           |
   +--------------+---------------+---------------------------------+

The "sequence" field is used by MDP receivers for calculating a
running estimate of packet loss for feedback to the MDP source in
support of MDP's automatic congestion control technique.  The 16-bit
sequence number increases monotonically with each packet transmitted
by an MDP source and rolls over when the maximum value is reached.
This sequence number increases independently of specific MdpObject
transmission or repair.

The "object_id" field is a monotonically and incrementally
increasing value assigned by a source to the object being
transmitted.  Transmissions and repair requests related to that
object use the same "object_id" value.  For sessions of very long
duration, the "object_id" field may wrap, but it is presumed that
the 32-bit field size provides an adequate sequence space to prevent
temporary object confusion amongst receivers and sources (i.e.,
receivers SHOULD re-synchronize with a server upon receiving object
sequence identifiers sufficiently out-of-range with respect to the
current state kept for a given source).  During the course of
transmission within an MDP session, an object is uniquely identified
by the concatenation of the source "node_id" and the given
"object_id".
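This document does not mandate a particular out-of-range test.  As
one possible approach, a receiver could use two's-complement "serial
number" arithmetic (in the spirit of RFC 1982) to compare 32-bit
object identifiers across wrap; the window size below is purely an
illustrative assumption:

    #include <stdint.h>

    /* Signed distance from object id 'b' to 'a' in the circular
     * 32-bit sequence space.  Positive => 'a' is "newer". */
    static int32_t ObjectIdDelta(uint32_t a, uint32_t b)
    {
        return (int32_t)(a - b);
    }

    /* A receiver might re-synchronize when a received id falls far
     * outside the window of state kept for the source.  The window
     * size is an illustrative assumption, not from this document. */
    #define MDP_RESYNC_WINDOW 256

    int ShouldResync(uint32_t rxObjectId, uint32_t lastKnownId)
    {
        int32_t delta = ObjectIdDelta(rxObjectId, lastKnownId);
        return (delta < -MDP_RESYNC_WINDOW) ||
               (delta >  MDP_RESYNC_WINDOW);
    }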
The "object_size" field indicates the size of the given transport object in bytes. Note that as MDP is extended to include "stream" objects of indeterminate length, a corresponding flag in the flags field will indicate the non-validity of the object_size field (or it may assume another use, e.g. additional sequencing informa- tion). The "ndata" and "nparity" fields are used by the source to adver- tise its current FEC encoding parameters, the number of MDP_DATA segments per coding block and number of available MDP_PARITY seg- ments for repair per block, respectively. The "flags" field is used to advertise information about current object transmission status. Defined flags currently include: Adamson, Macker Expires April 2000 [Page 12] Internet Draft The Multicast Dissemination Protocol October 1999 +--------------------+-------+------------------------------------------+ | Flag | Value | Purpose | +--------------------+-------+------------------------------------------+ |MDP_FLAG_REPAIR | 0x01 | Indicates message is a repair transmis- | | | | sion | +--------------------+-------+------------------------------------------+ |MDP_FLAG_BLOCK_END | 0x02 | Indicates end of coding block transmis- | | | | sion | +--------------------+-------+------------------------------------------+ |MDP_FLAG_RUNT | 0x04 | Indicates message size is less than seg- | | | | ment_size (applies to MDP_DATA messages | | | | only) | +--------------------+-------+------------------------------------------+ |MDP_FLAG_INFO | 0x10 | Indicates availability of MDP_INFO for | | | | object | +--------------------+-------+------------------------------------------+ |MDP_FLAG_UNRELIABLE | 0x20 | Indicates that repair transmissions for | | | | the specified object will be unavail- | | | | able. (One-shot, best effort transmis- | | | | sion) | +--------------------+-------+------------------------------------------+ |MDP_FLAG_FILE | 0x80 | Indicates object is "file-based" data | | | | (hint to use disk storage for reception) | +--------------------+-------+------------------------------------------+ The "grtt" field contains a quantized representation of the source- based current estimate of greatest round trip transmission delay time for the group. The value is in units of microseconds and is quantized using the following C function: unsigned char QuantizeGrtt(double grtt) { if (grtt > 1.0e03) grtt = 1.0e03; else if (grtt < 1.0e-06) grtt = 1.0e-06; if (grtt < 3.3e-05) return ((unsigned char)(grtt * 1.0e06) - 1); else return ((unsigned char)(ceil(255.0.- (13.0 * log(1.0e03/grtt))))); } Note that this function is useful for quantizing GRTT times in the range of 1 microsecond to 1000 seconds. MDP implementations may wish to further constrain GRTT estimates for practical reasons. The "segment_size" field indicates the source's current setting for Adamson, Macker Expires April 2000 [Page 13] Internet Draft The Multicast Dissemination Protocol October 1999 maximum message payload content (in bytes). Knowledge of this value allows an MDP receiver to allocate appropriate buffering resources. The "data" field of the MDP_INFO packet contains the information set for this object by the source MDP application. MdpObjects of type MDP_OBJECT_FILE use this field for file name information. An application may use this field for its own purposes for MdpObjects of type MDP_OBJECT_DATA. Furthermore, it is possible that data content of one "segment_size" or less may be entirely represented by a single MDP_INFO packet. 
The "segment_size" field indicates the source's current setting for
maximum message payload content (in bytes).  Knowledge of this value
allows an MDP receiver to allocate appropriate buffering resources.

The "data" field of the MDP_INFO packet contains the information set
for this object by the source MDP application.  MdpObjects of type
MDP_OBJECT_FILE use this field for file name information.  An
application may use this field for its own purposes for MdpObjects
of type MDP_OBJECT_DATA.  Furthermore, it is possible that data
content of one "segment_size" or less may be entirely represented by
a single MDP_INFO packet.  The advantage of this approach for small
messaging purposes is a more rapid repair retransmission cycle.  The
disadvantage is that FEC-based repair is not available for MDP_INFO
messages.  A number of uses for this special message type are
anticipated in potential applications of MDP.  The definition and
use of MDP_INFO as applied to objects of type MDP_OBJECT_FILE,
MDP_OBJECT_DATA, and MDP_OBJECT_STREAM will be further refined in
the future.

3.3.4 MDP_DATA

The MDP_DATA message is used for carrying multicast user object data
content within a session.  In addition to the common header, it
includes the following fields:

   +---------------+---------------+---------------------------------+
   | Field         | Length (bits) | Purpose                         |
   +---------------+---------------+---------------------------------+
   | sequence      | 16            | Packet loss detection sequence  |
   |               |               | number                          |
   +---------------+---------------+---------------------------------+
   | object_id     | 32            | MdpObjectTransportId identifier |
   +---------------+---------------+---------------------------------+
   | object_size   | 32            | Size of object (in bytes)       |
   +---------------+---------------+---------------------------------+
   | ndata         | 8             | Source's FEC data block size    |
   +---------------+---------------+---------------------------------+
   | nparity       | 8             | Maximum available parity per    |
   |               |               | block                           |
   +---------------+---------------+---------------------------------+
   | flags         | 8             | Object transmission flags       |
   +---------------+---------------+---------------------------------+
   | grtt          | 8             | Quantized current source GRTT   |
   |               |               | estimate                        |
   +---------------+---------------+---------------------------------+
   | offset        | 32            | Data content's "offset" within  |
   |               |               | object                          |
   +---------------+---------------+---------------------------------+
   | segment_size* | 16            | Source maximum segment payload  |
   |               |               | (bytes)                         |
   +---------------+---------------+---------------------------------+
   | data          | --            | Data content (up to             |
   |               |               | "segment_size" bytes)           |
   +---------------+---------------+---------------------------------+

   *The "segment_size" field is only present in MDP_DATA packets
    which are less than the source's current "segment_size" setting
    (i.e., for the last ordinal segment of an object).

Note that many of the fields and their use are the same as for the
MDP_INFO message type.  Receivers can synchronize to sources and
begin receiving reliable multicast content upon the reception of
MDP_INFO, MDP_DATA, or MDP_PARITY.  Some provision is made in the
present protocol implementation to prevent dynamically joining
receivers from significantly slowing the forward progress of an
ongoing source session.  This will be discussed in further detail
later.  There are only three fields in the MDP_DATA message which
differ from the MDP_INFO message.

The "offset" field is provided to indicate the position (in bytes)
of the MDP_DATA message's data content with respect to the beginning
of the object (offset zero).  For example, for a file object this
corresponds to the "seek" offset, and for static data objects it
corresponds to the offset from the base pointer to memory storage.

For MDP_DATA messages, the "segment_size" field is present (and
required) only for payloads shorter than the segment size.
The MDP_FLAG_RUNT flag in the "flags" field indicates the presence
of a "segment_size" indicator.  For other (non-runt) packets, the
source's "segment_size" setting can be implicitly determined from
the size of the received message.

The MDP_DATA "data" field simply contains the data content for the
indicated portion of the associated transport object.

3.3.5 MDP_PARITY

The MDP_PARITY message is used for parity-based repair messages.  It
is similar to the MDP_DATA message.  In addition to the common
header, it includes the following fields:

   +-------------+---------------+----------------------------------+
   | Field       | Length (bits) | Purpose                          |
   +-------------+---------------+----------------------------------+
   | sequence    | 16            | Packet loss detection sequence   |
   |             |               | number                           |
   +-------------+---------------+----------------------------------+
   | object_id   | 32            | Object transport identifier      |
   +-------------+---------------+----------------------------------+
   | object_size | 32            | Size of object (in bytes)        |
   +-------------+---------------+----------------------------------+
   | ndata       | 8             | Source FEC block size            |
   +-------------+---------------+----------------------------------+
   | nparity     | 8             | Maximum available parity         |
   +-------------+---------------+----------------------------------+
   | flags       | 8             | Object transmission flags        |
   +-------------+---------------+----------------------------------+
   | grtt        | 8             | Quantized current source GRTT    |
   |             |               | estimate                         |
   +-------------+---------------+----------------------------------+
   | offset      | 32            | "Offset" of applicable coding    |
   |             |               | block                            |
   +-------------+---------------+----------------------------------+
   | parity_id   | 8             | Parity segment id                |
   +-------------+---------------+----------------------------------+
   | data        | --            | Parity content ("segment_size"   |
   |             |               | bytes)                           |
   +-------------+---------------+----------------------------------+

With the exception of the occasional need for the "segment_size"
field, a slightly different use of the "offset" field, and the
presence of the "parity_id" field, all of the fields in the
MDP_PARITY message are the same and have the same use as the
corresponding fields described for the MDP_DATA message type.  The
source's "segment_size" setting can always be implicitly determined
from MDP_PARITY messages since the parity payload is always
"segment_size" bytes.

The "offset" field is used to indicate the offset of the first data
segment of the FEC coding block for which the parity repair message
has been calculated.

The "parity_id" field is used to indicate the position within the
source's FEC coding block of the parity segment content contained in
the message.  Note that it will always be a non-zero value, since
any valid coding block will always have at least one segment of data
content.  MDP could make use of other erasure-based coding schemes,
but presently the implementation we are describing uses Reed-Solomon
coding [11], and the description will be limited to application of
that coding approach.

The "data" field contains the Reed-Solomon parity information for
the coding block position indicated by the "parity_id" field.  These
parity packets are calculated using an 8-bit word size Reed-Solomon
forward error correction code, with each byte of the message
corresponding to the same byte position of the associated coding
block data content (i.e., code blocks are "striped" over the payload
content portion of MDP_DATA and MDP_PARITY messages).
code blocks are "striped" over the payload content portion of MDP_DATA and MDP_PAR- ITY messages). For MdpObjects whose last block is shortened because the object size is not an even multiple of the coding block size ("ndata") and source "segment_size", zero-value padding is assumed for short (runt) data messages and the Reed-Solomon encod- ing is further shortened for the last data coding block according to the object's size. 3.3.6 MDP_CMD MDP_CMD messages are generated by sources within a session to ini- tiate or respond to various protocol actions. Different MDP_CMD types are identified by an 8-bit "flavor" field in each MDP_CMD messages. The size and content of MDP_CMD messages vary depending upon their type. The following source command types are currently defined: Adamson, Macker Expires April 2000 [Page 17] Internet Draft The Multicast Dissemination Protocol October 1999 +-----------------+--------+----------------------------------+ | Command | Flavor | Purpose | +-----------------+--------+----------------------------------+ |MDP_CMD_FLUSH | 1 | Indicates source temporary or | | | | permanent end-of-transmission | | | | cycle. (Can assist in robustly | | | | initiating NACK repair requests | | | | from receivers). | +-----------------+--------+----------------------------------+ |MDP_CMD_SQUELCH | 2 | Indicates obsolete object for | | | | which a NACK has been received. | +-----------------+--------+----------------------------------+ |MDP_CMD_ACK_REQ | 3 | Requests positive acknowledge- | | | | ment of a specific object from a | | | | specific list of receivers. | +-----------------+--------+----------------------------------+ |MDP_CMD_GRTT_REQ | 4 | Probe used in collection of | | | | source's group GRTT estimate and | | | | congestion control feedback. | +-----------------+--------+----------------------------------+ 3.3.6.1 MDP_CMD_FLUSH The MDP_CMD_FLUSH command type is used when an MDP source has com- pleted transmission of all data it has pending to send and contains the following fields in addition to the MDP common message header: +------------+---------------+----------------------------------+ | Field | Length (bits) | Purpose | +------------+---------------+----------------------------------+ | sequence | 16 | Packet loss detection sequence | | | | number | +------------+---------------+----------------------------------+ | flavor | 8 | MDP_CMD type (value = 1) | +------------+---------------+----------------------------------+ | object_id | 32 | MdpObjectTransportId of most | | | | recent MdpObject for which the | | | | source has completed transmis- | | | | sion | +------------+---------------+----------------------------------+ The "sequence" and "flavor" fields serve the purposes previously described. The "object_id" field indicates the last MdpObject for which the source completed transmission. This allows this message to initiate repair requests from any receivers with missing data content or completely missing MdpObjects. The process by which the source uses this message to "flush" the receiver set for repairs is Adamson, Macker Expires April 2000 [Page 18] Internet Draft The Multicast Dissemination Protocol October 1999 described later in this document. 3.3.6.2 MDP_CMD_SQUELCH The MDP_CMD_SQUELCH command type is used by the source in response to a repair request for invalid data. A receiver might make such a request after a severe network outage and it allows source applica- tions to "dequeue" data which the application no longer wishes to provide. 
3.3.6 MDP_CMD

MDP_CMD messages are generated by sources within a session to
initiate or respond to various protocol actions.  Different MDP_CMD
types are identified by an 8-bit "flavor" field in each MDP_CMD
message.  The size and content of MDP_CMD messages vary depending
upon their type.  The following source command types are currently
defined:

   +------------------+--------+----------------------------------+
   | Command          | Flavor | Purpose                          |
   +------------------+--------+----------------------------------+
   | MDP_CMD_FLUSH    | 1      | Indicates source temporary or    |
   |                  |        | permanent end-of-transmission    |
   |                  |        | cycle.  (Can assist in robustly  |
   |                  |        | initiating NACK repair requests  |
   |                  |        | from receivers.)                 |
   +------------------+--------+----------------------------------+
   | MDP_CMD_SQUELCH  | 2      | Indicates obsolete object for    |
   |                  |        | which a NACK has been received.  |
   +------------------+--------+----------------------------------+
   | MDP_CMD_ACK_REQ  | 3      | Requests positive acknowledge-   |
   |                  |        | ment of a specific object from a |
   |                  |        | specific list of receivers.      |
   +------------------+--------+----------------------------------+
   | MDP_CMD_GRTT_REQ | 4      | Probe used in collection of      |
   |                  |        | source's group GRTT estimate and |
   |                  |        | congestion control feedback.     |
   +------------------+--------+----------------------------------+

3.3.6.1 MDP_CMD_FLUSH

The MDP_CMD_FLUSH command type is used when an MDP source has
completed transmission of all data it has pending to send.  It
contains the following fields in addition to the MDP common message
header:

   +-----------+---------------+----------------------------------+
   | Field     | Length (bits) | Purpose                          |
   +-----------+---------------+----------------------------------+
   | sequence  | 16            | Packet loss detection sequence   |
   |           |               | number                           |
   +-----------+---------------+----------------------------------+
   | flavor    | 8             | MDP_CMD type (value = 1)         |
   +-----------+---------------+----------------------------------+
   | object_id | 32            | MdpObjectTransportId of most     |
   |           |               | recent MdpObject for which the   |
   |           |               | source has completed             |
   |           |               | transmission                     |
   +-----------+---------------+----------------------------------+

The "sequence" and "flavor" fields serve the purposes previously
described.  The "object_id" field indicates the last MdpObject for
which the source completed transmission.  This allows the message to
initiate repair requests from any receivers with missing data
content or entirely missing MdpObjects.  The process by which the
source uses this message to "flush" the receiver set for repairs is
described later in this document.

3.3.6.2 MDP_CMD_SQUELCH

The MDP_CMD_SQUELCH command type is used by the source in response
to a repair request for invalid data.  A receiver might make such a
request after a severe network outage; the command also allows
source applications to "dequeue" data which the application no
longer wishes to provide.  In either case, receivers should stop
requesting repair of MdpObjects for which MDP_CMD_SQUELCH commands
are received.  The MDP_CMD_SQUELCH contains the following fields in
addition to the MDP common message header:

   +-----------+---------------+----------------------------------+
   | Field     | Length (bits) | Purpose                          |
   +-----------+---------------+----------------------------------+
   | sequence  | 16            | Packet loss detection sequence   |
   |           |               | number                           |
   +-----------+---------------+----------------------------------+
   | flavor    | 8             | MDP_CMD type (value = 2)         |
   +-----------+---------------+----------------------------------+
   | object_id | 32            | MdpObjectTransportId of the      |
   |           |               | MdpObject for which repair       |
   |           |               | requests should be terminated.   |
   +-----------+---------------+----------------------------------+

The "sequence" and "flavor" fields serve the purposes previously
described.  The "object_id" field indicates the MdpObject for which
the receiver should stop requesting repair.

3.3.6.3 MDP_CMD_ACK_REQ

The MDP_CMD_ACK_REQ command can optionally be used by an MDP source
to request explicit positive object receipts from a subset of
receivers.  This message contains the following fields in addition
to the MDP common message header:

   +-----------+---------------+----------------------------------+
   | Field     | Length (bits) | Purpose                          |
   +-----------+---------------+----------------------------------+
   | sequence  | 16            | Packet loss detection sequence   |
   |           |               | number                           |
   +-----------+---------------+----------------------------------+
   | flavor    | 8             | MDP_CMD type (value = 3)         |
   +-----------+---------------+----------------------------------+
   | object_id | 32            | MdpObjectTransportId of          |
   |           |               | MdpObject for which positive     |
   |           |               | acknowledgement of complete      |
   |           |               | reception is requested.          |
   +-----------+---------------+----------------------------------+
   | data      | --            | List of receiver MdpNodeIds from |
   |           |               | which positive acknowledgement   |
   |           |               | is requested (source             |
   |           |               | "segment_size" is maximum        |
   |           |               | "data" size)                     |
   +-----------+---------------+----------------------------------+

The "sequence" and "flavor" fields serve the purposes previously
described.

The "object_id" field indicates the MdpObject for which the
indicated receivers should provide positive acknowledgement (via an
MDP_ACK message) of reception.

The "data" field contains a list of MdpNodeIds from which the source
expects to receive positive acknowledgement of reception.  The
maximum size of the "data" field is limited by the source's
"segment_size" setting.  Thus, for large group sizes, it is possible
that the positive acknowledgement process may take multiple
"rounds".  This process is described in detail in a later section of
this document.

3.3.6.4 MDP_CMD_GRTT_REQ

The MDP_CMD_GRTT_REQ command is periodically transmitted by an
active source in order to collect responses from receivers to attain
a running estimate of round trip packet transmission delays and
other statistics for protocol operation.  The process by which
MDP_CMD_GRTT_REQ messages are sent, and how responses are obtained
from receivers for modes of operation with and without dynamic
congestion control enabled, is described in detail later.
This message type contains the following fields in addition to the
MDP common message header:

   +-----------+---------------+----------------------------------+
   | Field     | Length (bits) | Purpose                          |
   +-----------+---------------+----------------------------------+
   | sequence  | 16            | Packet loss detection sequence   |
   |           |               | number                           |
   +-----------+---------------+----------------------------------+
   | flavor    | 8             | MDP_CMD type (value = 4)         |
   +-----------+---------------+----------------------------------+
   | flags     | 8             | GRTT request flags               |
   +-----------+---------------+----------------------------------+
   | grtt_seq  | 8             | GRTT_REQ sequence identifier     |
   +-----------+---------------+----------------------------------+
   | send_time | 64            | Timestamp reference of when this |
   |           |               | message was sent by the source.  |
   +-----------+---------------+----------------------------------+
   | hold_time | 64            | Receiver response window (time   |
   |           |               | window over which receivers      |
   |           |               | should spread their responses)   |
   +-----------+---------------+----------------------------------+
   | tx_rate   | 32            | Current source transmit rate     |
   |           |               | (bytes/sec)                      |
   +-----------+---------------+----------------------------------+
   | rtt       | 8             | Bottleneck node round trip time  |
   |           |               | estimate                         |
   +-----------+---------------+----------------------------------+
   | loss      | 16            | Bottleneck node packet loss      |
   |           |               | estimate                         |
   +-----------+---------------+----------------------------------+
   | data      | --            | List of representative           |
   |           |               | MdpNodeIds from which explicit   |
   |           |               | (non-wildcard) acknowledgement   |
   |           |               | is requested (source             |
   |           |               | "segment_size" is maximum        |
   |           |               | "data" size)                     |
   +-----------+---------------+----------------------------------+

The "sequence" and "flavor" fields serve the purposes previously
described.

The "flags" field currently has one possible flag value defined:
MDP_CMD_GRTT_FLAG_WILDCARD (value = 0x01), which is used during MDP
congestion control operation to mark MDP_CMD_GRTT_REQ messages to
which _all_ MDP receivers (regardless of their "representative"
status) should explicitly respond via an MDP_ACK message.

The "grtt_seq" field is a sequence number which is incremented each
time the source transmits an MDP_CMD_GRTT_REQ command.  This field
is used in responses from receivers to identify the specific
MDP_CMD_GRTT_REQ message to which the receiver response applies.

The "send_time" field is a precision timestamp indicating the time
that the MDP_CMD_GRTT_REQ message was transmitted.  This consists of
a 64-bit field containing 32 bits with the time in seconds and 32
bits with the time in microseconds since some reference time the
source maintains (usually 00:00:00, 1 January 1970).

The "hold_time" field is in the same format as the "send_time"
field.  (Note: It is likely that this will be quantized to an 8-bit
value in a future revision using the same algorithm previously
described for source GRTT advertisements in other messages.)  The
"hold_time" instructs receivers over what window of time they should
distribute any explicit responses to the MDP_CMD_GRTT_REQ command.

The "tx_rate" field indicates the source's current transmission rate
in units of bytes per second.
This information is used by receivers as part of MDP's rate-based
congestion control algorithm, which is described in detail later in
this document.

The "rtt" field indicates the round trip delay time measured for the
current "bottleneck" congestion control representative node.  This
information is used by receivers as part of MDP's congestion control
algorithm, which is described in detail later.  This 8-bit value is
a quantized representation of the delay using the same quantization
algorithm described for the GRTT estimate advertised in MDP_INFO,
MDP_DATA, and MDP_PARITY messages.

The "loss" field indicates the loss fraction measured for the
current "bottleneck" congestion control representative node.  This
16-bit value represents the loss fraction on a scale of 0.0 to 1.0,
where the decimal loss fraction can be obtained from the formula:

    loss_fraction = "loss" / 65535.0

This information is also used by receivers as part of MDP's
congestion control algorithm, which is described in detail later.

The "data" field of the MDP_CMD_GRTT_REQ message contains a list of
MdpNodeIds indicating the receiver nodes which are currently
selected by the source to serve as congestion control
representatives.  These listed nodes should explicitly respond to
the MDP_CMD_GRTT_REQ with an MDP_ACK message, randomly within the
"hold_time" indicated.  More details on the congestion control
approach are described later.

3.3.7 MDP_NACK

MDP_NACK messages are transmitted by MDP receivers in response to
the detection of missing data in the sequence of transmissions
received from a particular source.  The specific times and
conditions under which receivers will generate and transmit these
MDP_NACK messages are governed by the processes described in detail
later in this document.  The payload of MDP_NACK messages contains a
list of "ObjectNACKs" for different objects and portions of those
objects.  In addition to the common message header, MDP_NACK
messages contain the following fields:

   +-------------------+---------------+---------------------------------+
   | Field             | Length (bits) | Purpose                         |
   +-------------------+---------------+---------------------------------+
   | server_id         | 32            | MdpNodeId of source for which   |
   |                   |               | NACK is intended                |
   +-------------------+---------------+---------------------------------+
   | grtt_response     | 64            | Response to source's            |
   |                   |               | MDP_CMD_GRTT_REQ, if any (zero  |
   |                   |               | value if none)                  |
   +-------------------+---------------+---------------------------------+
   | loss_estimate     | 16            | Current packet loss estimate    |
   |                   |               | for the indicated source.       |
   +-------------------+---------------+---------------------------------+
   | grtt_req_sequence | 8             | Sequence number identifier of   |
   |                   |               | applicable MDP_CMD_GRTT_REQ     |
   +-------------------+---------------+---------------------------------+
   | data              | --            | ObjectNACK list                 |
   +-------------------+---------------+---------------------------------+

The "server_id" field identifies the source to which the MDP_NACK
message is destined.  Other sources should ignore this message.
(Note that this is another reason why multiple potential sources
within an MDP session MUST have unique MdpNodeIds.)

The "grtt_response" field contains a timestamp indicating the time
at which the MDP_NACK was transmitted.  The format of this timestamp
is the same as the "send_time" field of the MDP_CMD_GRTT_REQ.
However, note that the "grtt_response" timestamp is _relative_ to
the "send_time" the source provided with the corresponding
MDP_CMD_GRTT_REQ command.  The receiver adjusts the source's
MDP_CMD_GRTT_REQ "send_time" timestamp by the time differential from
when the receiver received the MDP_CMD_GRTT_REQ to when the MDP_NACK
is transmitted, in order to calculate the value placed in the
"grtt_response" field.  The following formula applies:

    "grtt_response" = request "send_time" +
                      request_to_response_differential

If the "grtt_response" has a ZERO value, this indicates that the
receiver has not yet received an MDP_CMD_GRTT_REQ command from the
source, and the source should ignore this portion of the response.

The "loss_estimate" field is the receiver's current packet loss
fraction estimate for the indicated source.  The loss fraction is a
value from 0.0 to 1.0 corresponding to a range of zero to 100
percent packet loss.  The 16-bit "loss_estimate" value is calculated
by the following formula:

    "loss_estimate" = decimal_loss_fraction * 65535.0

The "grtt_req_sequence" field contains the sequence number
identifier of the received MDP_CMD_GRTT_REQ to which the response
information in this MDP_NACK applies.
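The calculations above can be summarized in code.  The following
sketch uses the BSD timeradd()/timersub() macros for the timestamp
adjustment; the function names are illustrative, as this document
specifies only the formulas:

    #include <stdint.h>
    #include <sys/time.h>

    /* grtt_response = request send_time + (time from receiving the
     * MDP_CMD_GRTT_REQ to transmitting this NACK or ACK). */
    struct timeval GrttResponse(struct timeval send_time,
                                struct timeval req_rx_time,
                                struct timeval resp_tx_time)
    {
        struct timeval diff, out;
        timersub(&resp_tx_time, &req_rx_time, &diff);
        timeradd(&send_time, &diff, &out);
        return out;
    }

    /* 16-bit loss encoding: loss_estimate = fraction * 65535.0 */
    uint16_t EncodeLoss(double fraction)  /* 0.0 <= fraction <= 1.0 */
    {
        return (uint16_t)(fraction * 65535.0);
    }

    double DecodeLoss(uint16_t loss_estimate)
    {
        return (double)loss_estimate / 65535.0;
    }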
The "data" field of the MDP_NACK contains the list of ObjectNACKs
for different source MdpObjects.  Note that ObjectNACKs for multiple
objects may be contained in one MDP_NACK message and that each
ObjectNACK consists of a hierarchical set of indicators and bit
masks, depending upon what data the receiver has detected is
missing.  Each ObjectNACK in the list contained in the MDP_NACK
"data" field is made up of the following fields:

   +-----------+---------------+----------------------------------+
   | Field     | Length (bits) | Purpose                          |
   +-----------+---------------+----------------------------------+
   | object_id | 32            | MdpObjectTransportId of object   |
   |           |               | for enclosed RepairRequests      |
   +-----------+---------------+----------------------------------+
   | nack_len  | 16            | Total length (in bytes) of       |
   |           |               | RepairRequests for indicated     |
   |           |               | object.                          |
   +-----------+---------------+----------------------------------+
   | data      | --            | RepairRequest list               |
   +-----------+---------------+----------------------------------+

The content in the data field of an ObjectNACK consists of a list of
individual "RepairRequests" for the indicated MdpObject.  There are
multiple types of RepairRequests, each of which begins with an 8-bit
type field with one of the following values:

   +-----------------+------+------------------------------------+
   | RepairRequest   | Type | Purpose                            |
   +-----------------+------+------------------------------------+
   | REPAIR_SEGMENTS | 1    | Indicates receiver is missing      |
   |                 |      | portions of an encoding block.     |
   +-----------------+------+------------------------------------+
   | REPAIR_BLOCKS   | 2    | Indicates receiver is missing      |
   |                 |      | some blocks in entirety.           |
   +-----------------+------+------------------------------------+
   | REPAIR_INFO     | 3    | Indicates receiver requires        |
   |                 |      | retransmission of MDP_INFO for     |
   |                 |      | object.                            |
   +-----------------+------+------------------------------------+
   | REPAIR_OBJECT   | 4    | Indicates receiver requires        |
   |                 |      | retransmission of entire object.   |
   +-----------------+------+------------------------------------+

A REPAIR_SEGMENTS RepairRequest identifies the beginning of the
coding block (by its offset) and then provides a bit mask indicating
which segments within that block require retransmission.  A count of
the total number of missing segments (erasures) is also provided.
Thus, the following fields comprise a REPAIR_SEGMENTS RepairRequest:

   +----------+---------------+------------------------------------+
   | Field    | Length (bits) | Purpose                            |
   +----------+---------------+------------------------------------+
   | type     | 8             | value = 1 (REPAIR_SEGMENTS)        |
   +----------+---------------+------------------------------------+
   | nerasure | 8             | Count of missing segments in the   |
   |          |               | block.                             |
   +----------+---------------+------------------------------------+
   | offset   | 32            | Offset of applicable coding block. |
   +----------+---------------+------------------------------------+
   | mask_len | 16            | Length of attached bit mask (in    |
   |          |               | bytes)                             |
   +----------+---------------+------------------------------------+
   | mask     | --            | Bit mask content                   |
   +----------+---------------+------------------------------------+

The REPAIR_BLOCKS RepairRequest identifies the beginning of a set of
FEC coding blocks (by the initial offset) and then provides a bit
mask indicating which coding blocks require retransmission in
entirety.  The following fields make up a REPAIR_BLOCKS
RepairRequest:

   +----------+---------------+------------------------------------+
   | Field    | Length (bits) | Purpose                            |
   +----------+---------------+------------------------------------+
   | type     | 8             | value = 2 (REPAIR_BLOCKS)          |
   +----------+---------------+------------------------------------+
   | offset   | 32            | Offset of initial coding block     |
   +----------+---------------+------------------------------------+
   | mask_len | 16            | Length of attached bit mask (in    |
   |          |               | bytes)                             |
   +----------+---------------+------------------------------------+
   | mask     | --            | Bit mask content                   |
   +----------+---------------+------------------------------------+

The REPAIR_INFO RepairRequest implicitly identifies by its type that
the receiver requires retransmission of the MDP_INFO associated with
an object and thus consists of a single byte:

   +-------+---------------+----------------------------------+
   | Field | Length (bits) | Purpose                          |
   +-------+---------------+----------------------------------+
   | type  | 8             | value = 3 (REPAIR_INFO)          |
   +-------+---------------+----------------------------------+

The REPAIR_OBJECT RepairRequest is also very simple and likewise
consists of a single byte to request retransmission of an entire MDP
transport object:

   +-------+---------------+----------------------------------+
   | Field | Length (bits) | Purpose                          |
   +-------+---------------+----------------------------------+
   | type  | 8             | value = 4 (REPAIR_OBJECT)        |
   +-------+---------------+----------------------------------+
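As an illustration of how a receiver might populate a
REPAIR_SEGMENTS request, the following sketch builds the "mask",
"mask_len", and "nerasure" values from a per-segment record of
missing segments.  The MSB-first bit ordering within each mask byte
is an assumption of this sketch; this document does not specify it:

    #include <stdint.h>
    #include <string.h>

    /* Build a REPAIR_SEGMENTS bit mask: one bit per segment of the
     * coding block, set when that segment is missing.  Returns the
     * mask length in bytes (the "mask_len" field value). */
    size_t BuildSegmentMask(const int missing[], size_t ndata,
                            uint8_t *mask, uint8_t *nerasure)
    {
        size_t maskLen = (ndata + 7) / 8;
        memset(mask, 0, maskLen);
        *nerasure = 0;
        for (size_t i = 0; i < ndata; i++) {
            if (missing[i]) {
                mask[i >> 3] |= (uint8_t)(0x80 >> (i & 7)); /* MSB-first */
                (*nerasure)++;
            }
        }
        return maskLen;
    }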
In addition to the MDP common message header, the following fields make up MDP_ACK messages:

+--------------------+---------------+----------------------------------+
| Field              | Length (bits) | Purpose                          |
+--------------------+---------------+----------------------------------+
| server_id          | 32            | MdpNodeId of source for which    |
|                    |               | ACK is intended                  |
+--------------------+---------------+----------------------------------+
| grtt_response      | 64            | Response to source's             |
|                    |               | MDP_CMD_GRTT_REQ, if any (zero   |
|                    |               | value if none)                   |
+--------------------+---------------+----------------------------------+
| loss_estimate      | 16            | Current packet loss estimate for |
|                    |               | the indicated source.            |
+--------------------+---------------+----------------------------------+
| grtt_req_sequence  | 8             | Sequence number identifier of    |
|                    |               | applicable MDP_CMD_GRTT_REQ      |
+--------------------+---------------+----------------------------------+
| type               | 8             | Type of MDP_ACK message.         |
+--------------------+---------------+----------------------------------+
| object_id          | 32            | Applicable MdpObjectTransportId, |
|                    |               | if any                           |
+--------------------+---------------+----------------------------------+

The "server_id", "grtt_response", "loss_estimate", and "grtt_req_sequence" fields serve the same purpose as the corresponding fields in MDP_NACK messages. The "type" field identifies the type of MDP_ACK and is one of the following values:

+-------------------+------+----------------------------------+
| MDP_ACK Variation | Type | Purpose                          |
+-------------------+------+----------------------------------+
| MDP_ACK_OBJECT    | 1    | Positive acknowledgement of      |
|                   |      | receipt of a particular          |
|                   |      | transport object.                |
+-------------------+------+----------------------------------+
| MDP_ACK_GRTT      | 2    | Indicates that the MDP_ACK is    |
|                   |      | simply a response to a GRTT_REQ  |
|                   |      | command.                         |
+-------------------+------+----------------------------------+

The MDP_ACK_OBJECT acknowledgement type is used to indicate that the receiver has successfully received all transport objects up to and including the object sequence number identified in the "object_id" field. Like MDP_NACK messages, all MDP_ACK responses from receivers contain an embedded response to GRTT_REQ commands from MDP sources. However, the MDP_ACK_GRTT acknowledgement type is also provided to support explicit collection of a GRTT estimate from the group (or potentially a subset of the group). This is used in MDP's congestion control algorithm, described in detail later.

4.0 Detailed Protocol Operation

The following sequence of events roughly describes the general, steady-state operation of the MDP protocol from the perspective of a source transmitting data to a group of receivers. This sequence of events can be used as a guide when reading the subsequent detailed descriptions of the individual portions of the protocol.

1) The source periodically sends out MDP_CMD_GRTT_REQ probes and transmits messages of type MDP_INFO, MDP_DATA, and optionally some amount of MDP_PARITY. The "object_id" and "offset" fields of these messages monotonically increase in sequence.

2) Receivers "synchronize" to the source upon receipt of MDP_INFO, MDP_DATA, or MDP_PARITY.
Receivers will not request repair for objects in sequence prior to the point of synchronization.

3) Receivers monitor the sequence of transmission for any "missing" data and initiate NACK repair cycles using the algorithms described below.

4) The source aggregates the content of received repair requests and transmits appropriate repair, normally using messages of type MDP_PARITY.

5) Transmission of new data is completely interleaved with repair transmissions so the source has no "dead" time.

6) When congestion control is enabled, a dynamically changing subset of the receivers is instructed to quickly, explicitly ACK the MDP_CMD_GRTT_REQ probes. Feedback can also be requested from the entire group over a longer period of time.

7) The source may also initiate optional positive acknowledgement from a subset (or possibly the entire) group of receivers.

4.1 Source Transmission

In the current MDP implementation, protocol activity within a session is initiated by the transmission of data by a source node. The data is comprised of serialized segments of objects enqueued for transmission by the source application. An object is currently defined as static data of fixed, pre-determined size stored in a file or in memory at the source node. In the future, stream objects of indefinite size will likely be supported in the MDP toolkit, but we reserve that discussion for another time.

The rate and format of transmission of the data content of an MDP object is determined by a number of source protocol parameters set by the application. These parameters include the "transmit_rate", "segment_size", availability of out-of-band information (MDP_INFO) for an object, "block_size", "max_parity", and "auto_parity".

The "transmit_rate" parameter governs the peak rate at which data is transmitted by a source MDP node in units of bits per second. If the MDP application has data enqueued for transmission, the source will transmit packets in aggregate at or below this application-defined "transmit_rate". The total transmissions by the source, including the data, repairs, and commands, are governed by this parameter.

The "segment_size" parameter determines the maximum MDP message payload size the source uses for transmissions. (Note that UDP packet payload sizes will be slightly larger than the "segment_size" setting since there is additional MDP protocol overhead in the message formats previously described.) MDP transport objects are fragmented into MDP_DATA messages of "segment_size" bytes. Note that where the transmitted object does not fragment into an exact number of "segment_size" messages, a short ("runt") MDP_DATA message will be transmitted at the end of the object.

The MDP source application has the option of setting and advertising a small amount of out-of-band information ("info") for each object enqueued for transmission. For example, in a file transfer application, MIME-type information and/or name identification for file content might be embedded in the "info" portion of an MDP transport object. The amount of "info" may be up to "segment_size" bytes according to the source's settings. Thus the "info" associated with an object can be transmitted in a single MDP message. As will be discussed later, this allows for more responsive repair than that of the bulk data content.
If the "info" is set, the transmission of an object is initiated by sending an MDP_INFO mes- sage. This is followed by the object data content as follows. The "block_size" and "max_parity" parameters affect how the source calculates, maintains, and transmits parity-based repair messages. The present MDP design uses shortened, 8-bit symbol-based Reed Adamson, Macker Expires April 2000 [Page 29] Internet Draft The Multicast Dissemination Protocol October 1999 Solomon encoding methods to construct repair vectors (packets) based on a block of data vectors (packets). The "block_size" parameter corresponds to the number of MDP_DATA messages per Reed- Solomon encoding block while the "max_parity" parameter corresponds to the number of repair (MDP_PARITY) messages the source calculates and maintains per block. So, for standard (N,k) nomenclature to describe the resulting shortened Reed-Solomon code, N = (block_size + max_parity) and k = block_size. Note that, in addition to the ending "runt" MDP_DATA message, it is likely that transport objects will not often fragment to an exact number of encoding blocks. Thus, for the "short" ending block (containing less than "block_size" MDP_DATA segments), the MDP Reed-Solomon encoder assumes zero-padding of the "runt" message and the calculation of the parity vectors is truncated to a further shortened code for that block. Note that the truncation of parity calculation does not impact the erasure repairing capabilities of the resulting code. In a current MDP implementation, the source incrementally calcu- lates and buffers parity information for the sequence of MDP_DATA messages it transmits. At the end of each encoding block, the source MAY optionally transmit a number of MDP_PARITY repair mes- sages according to the value of the "auto_parity" parameter. While this parameter is by default ZERO for pure reactive retransmission repairing, some network topologies, scenarios, and applications may benefit from implicit transmission or hybrid proactive/reactive repairing of lost packets. The ability of any multicast receiver to fill any erasure within an encoding block with any one MDP_PAR- ITY packet allows for performance gains in some environments and the potential for robust data delivery in cases of uni-directional or asymmetric network connectivity (e.g. broadcast satellite com- munication system). Transmitted coding blocks are identified by using the "offset" field contained within MDP_DATA and MDP_PARITY messages. The fol- lowing integer calculation can be used to identify the coding block with which an MDP_DATA or MDP_PARITY message is associated: block_id = "offset" / ("block_size" * "segment_size"); Recall that MDP_PARITY messages have the "parity_id" field to uniquely identify to which portion of the Reed-Solomon parity con- tent the given message corresponds. This methodology allows receivers to efficiently maintain state for decoding of a block when a sufficient quantity of MDP_DATA and MDP_PARITY messages are received (i.e. a total of "block_size" unique MDP_DATA and/or MDP_PARITY messages for a given coding block). Adamson, Macker Expires April 2000 [Page 30] Internet Draft The Multicast Dissemination Protocol October 1999 The MDP source sequentially transmits transport objects with incre- mentally increasing transport "object_id" values. The content position of MDP_DATA and MDP_PARITY messages is contained and implied in the value of the offset fields in those messages. 
This offset sequencing information is used by MDP receivers to trigger repair requests at the end of each source encoding block and at the transition of transmission from one object to the next. Additionally, when the source reaches the end of the application-enqueued transmission objects, it begins periodically transmitting MDP_CMD_FLUSH command messages to notify receivers of the end of an active transmission period and to additionally prompt them for repair requests. In the absence of subsequent enqueued object transmission, the MDP_CMD_FLUSH messages are sent once every 2*GRTT seconds until a repair request is received from a receiver or until a maximum message count according to a preset robustness factor (FLUSH_ROBUSTNESS_COUNT) is reached (currently a default of 20 flush messages).

4.2 Receiver Synchronization

Upon reception of MDP_INFO or MDP_DATA messages from a new source, an MDP receiver will "synchronize" with the MDP source by beginning to maintain state on the source with the object segmentation and encoding parameters and current transmission sequencing information embedded in the received messages. For this reason, this information is embedded in all MDP_INFO, MDP_DATA, and MDP_PARITY messages transmitted by the source. In the current MDP implementation, if new data is received from a source indicating a change in the value of these parameters, the receiver drops its current state on the source and re-synchronizes to that source, so it is generally expected that these parameters will not change during the lifetime of an MdpSession. Note that it is possible for multiple sources to co-exist within the context of an MDP session and that each source may maintain its own independent set of transmission parameters.

MDP receivers also use their own "transmit_rate" parameter to govern their peak rate of protocol transmissions. However, the quantity of transmissions normally required from an MDP receiver is very low.

The MDP implementation limits the conditions under which receivers will synchronize to a source to prevent the source from being restricted in forward transmission progress in environments with very large group sizes and active group join/leave dynamics. For example, at present, receivers will not synchronize to a source upon receipt of MDP_DATA messages which are part of a repair transmission or upon receipt of MDP_PARITY messages, and receivers are currently only allowed to synchronize during data transmission of the first encoding block of a transport object. Also, receivers will not request repairs for objects earlier in sequence than the object at which they established synchronization to the source. MDP implementations may desire further refinement of these synchronization policy features for different applications and requirements.

The MDP receiver maintains a "synchronization window"; if the current transmission sequence from the source exceeds the bounds of this window, the receiver will re-synchronize with the source and not attempt further repair of earlier objects. The constraints on this window can be established according to application policy and/or the amount of buffering space the application is willing to allocate for a given MdpSession and specific source. Note that for efficiency, this window should well exceed the expected worst-case delay-bandwidth product for the network topology the group is utilizing.
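A loudly-hedged sketch of the synchronization window check follows; the window depth, the state names, and the comparison itself are our illustrative assumptions and not the implementation's actual policy code:

    /* Assumed window depth, in objects; real deployments would size this
     * from buffer policy and the delay-bandwidth product noted above. */
    #define SYNC_WINDOW_DEPTH 64

    /* Returns nonzero if the receiver should drop repair state and
     * re-synchronize (object ids are assumed monotonically increasing). */
    static int needs_resync(unsigned long current_object_id,
                            unsigned long sync_object_id)
    {
        return (current_object_id - sync_object_id) > SYNC_WINDOW_DEPTH;
    }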
4.3 Receiver NACK Process

Once a receiver has "synchronized" with a source, it begins tracking the sequence of transmission using the "object_id" and "offset" fields contained in the data and commands sent by the source. If the receiver detects missing data from the source at the end of an encoding block, at the end of an object transmission, or upon receipt of an MDP_CMD_FLUSH command, it initiates a process to request repair transmissions from the source. Note that the end-of-block or end-of-object boundaries are detected either explicitly by the MDP_FLAG_BLOCK_END indicator in a received message or implicitly by the receipt of data beyond the last incompletely received block. The repair cycle for requesting retransmission of missing MDP_INFO for an object can begin immediately since it is "out-of-band" to the MDP parity encoding process. MDP receivers should consider the MDP_INFO content to be the first "virtual" block of the corresponding MdpObject.

The receiver-initiated repair process will also begin upon a longer-term timeout based on a lack of received packets from a previously-active source. This longer-term timeout should be set to (2.0 * GRTT * FLUSH_ROBUSTNESS_COUNT), which corresponds to the period of MDP_CMD_FLUSH message transmission conducted before a source goes inactive. Receiver implementations SHOULD set reasonable bounds on minimum and maximum values for this source "inactivity" timeout. Receivers SHOULD also limit the number of "inactivity" timeout refreshes so as not to go into a mode of infinite NACKing in the case where the source or network connectivity has completely failed.

To initiate the repair request process and to facilitate the suppression of redundant NACK responses, the receiver begins a random hold-off timeout to delay immediate response to a source node upon detecting loss. Thus, if another NACK for the same (or more) repair information arrives at the receiver (or the repair information itself) before the timeout ends, the receiver will suppress its transmission of an MDP_NACK message. Note that after transmission or suppression of the NACK occurs, another timeout is used to allow some amount of time for the source to respond to the repair request before again initiating an additional repair request process. The initial hold-off timeout is randomly picked from an exponential distribution from ZERO to GRTT seconds. For large multicast group sizes, this generally allows for a significant level of NACK suppression while maintaining reasonably small delays in the repair of data transmissions. The extension of the potential hold-off window to the order of (1 * GRTT) allows for general worst-case receiver-to-receiver transmission delays assuming symmetric unicast routing among nodes in the multicast group. The secondary hold-off timeout after NACK transmission/suppression is fixed at (4 * GRTT) to allow reasonable time for the source to receive a NACK, possibly aggregate multiple NACKs, and begin providing repair messages back to the receiver.
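The timer behavior described above can be summarized with the following C sketch. This document does not specify the exact shape of the exponential distribution, so the truncated-exponential draw and its rate constant below are illustrative assumptions:

    #include <math.h>
    #include <stdlib.h>

    /* Initial NACK hold-off: a random draw from an exponential
     * distribution truncated to [0, grtt] seconds (inverse-CDF method).
     * The rate constant 'lambda' is an assumed value. */
    static double nack_holdoff(double grtt)
    {
        double u = (double)rand() / ((double)RAND_MAX + 1.0); /* [0,1) */
        double lambda = 4.0 / grtt;
        return -log(1.0 - u * (1.0 - exp(-lambda * grtt))) / lambda;
    }

    /* Secondary hold-off after NACK transmission or suppression. */
    static double repair_holdoff(double grtt)
    {
        return 4.0 * grtt;
    }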
As the receiver receives transmissions of MDP_INFO, MDP_DATA, and MDP_PARITY from a source, it attempts to maintain the state of completion for all received objects. The receiver may be buffer-limited, so priority is given to the earliest objects within the "synchronization window" previously described. Additionally, the receiver keeps track of the most recently detected "object_id" and "offset" sequencing index information received from the source. For each incomplete transport object up to this current transmission index, the receiver constructs an ObjectNACK which is included in the payload of the MDP_NACK message. It is critical to the efficiency and convergence of the protocol that the NACK content consist only of repair requests for transmission sequences earlier than the most recently detected source transmission position. This prevents receivers from redundantly requesting repair for data the source may already be intending to transmit (e.g. based on repair requests from other receivers that the receiver in question did not receive). The effect of this controlled NACK process is to "rewind" the source to the earliest required repair point without redundant requests for repair being serviced. This sequencing keeps the new data transmission and repair process within a minimal bound of source and receiver buffer space given the current delay-bandwidth product of the network topology. (The GRTT measurement process and subsequent timers based on GRTT work to maximize the efficiency of transmission and NACK suppression while maintaining minimal repair latency.)

The content of receiver MDP_NACK messages depends upon the repair needs of the requesting receiver. The ObjectNACK consists of a list of RepairRequests for retransmission of repair messages for missing data segments within individual encoding blocks, retransmission of coding blocks in entirety, retransmission of the info content, or possibly retransmission of the entire object. Entire object retransmission is requested with the MDP_REPAIR_OBJECT RepairRequest previously described. Otherwise, partial repairs of the object are requested using a combination of the MDP_REPAIR_SEGMENTS, MDP_REPAIR_BLOCKS, and MDP_REPAIR_INFO RepairRequests. Note that these different types of repair requests can be viewed as a hierarchy to elicit different types of repair transmission behavior. These RepairRequests are constructed as follows:

The MDP_REPAIR_SEGMENTS RepairRequest identifies the encoding block to which the RepairRequest applies, the total number of erasures (missing data segments) in the block, and provides a bit mask indicating which specific segments the receiver requires to repair the encoding block. The MDP protocol leverages the use of parity-based repair by requesting transmission of parity repair messages whenever possible. For example, if the source uses a coding setting of 20 data segments per coding block (block_size = 20) and calculates 20 parity segments per block (max_parity = 20), the receiver will always request transmission of parity for repair. Only when the receiver is missing a greater number of data segments than available parity will the receiver request explicit retransmission of data segments. And then, the receiver only requests retransmission of the minimal number of data segments (those with the highest offset values) to repair what parity alone cannot cover. This methodology allows the source to transmit a minimal number of redundant data packets and leverages the use of parity packet erasure repairing since any one parity segment can repair any one missing data segment at any receiver.
Thus, even if some requested parity packets are lost during the source's transmission of repair, some receivers (those who observed less than the group maximum loss) will likely be able to completely repair a block from the combination of received repair messages. This is also why all MDP_REPAIR_SEGMENTS RepairRequests contain the explicit bit mask marking specific missing segments in addition to the erasure (missing segment) total count for the applicable coding block.

Another feature of MDP is that the source attempts to send previously untransmitted parity segments whenever possible. This also provides efficiency gains since receivers are not required to request and receive explicit segments (data or parity) which they are missing on subsequent iterative repair cycles. This approach provides benefit when parity packets get lost or dropped and the source does not know which receivers missed which parity packets during subsequent repairs. The source makes use of the erasure count provided in the receiver MDP_REPAIR_SEGMENTS RepairRequests to efficiently perform this function, which is why the count is provided in addition to the bit mask content. The length of the bit mask content of the RepairRequest is equal to (block_size + max_parity) bits padded out to an integral number of bytes. A value of one in the bit mask marks the requested segment. The "mask_len" field is provided so that other nodes can easily and safely parse the content of the MDP_NACK.

The MDP_REPAIR_BLOCKS RepairRequest is used by receivers to request retransmission of coding blocks missing in entirety. In typical applications of the protocol, this should occur infrequently, such as in cases of intermittent network outages, during the "short" coding blocks at the end of object transmissions (or for very small objects), and/or in cases of very severe packet loss. The format of the MDP_REPAIR_BLOCKS RepairRequest is similar to that of the MDP_REPAIR_SEGMENTS. An "offset" field is used to indicate the first coding block (as computed using the "block_id" formula presented earlier) and a bit mask is provided with values of one indicating which blocks require retransmission. The source retransmits the entire set of data segments for the encoding blocks requested, including any configured quantity of "auto_parity".

The MDP_REPAIR_INFO RepairRequest is used by the receiver to request retransmission of the available info the source has attached to the transport object. This retransmission is only requested if the source has advertised the availability of info for the object via the MDP_FLAG_INFO flag in other messages transmitted for the given transport object.

The MDP_REPAIR_OBJECT RepairRequest indicates that the receiver requires retransmission of an entire transport object. As with the MDP_REPAIR_BLOCKS request, this will typically be a rare occurrence, except in the case of very small objects (a few "segment_size" or less in length) and/or intermittent network outages or heavy packet loss. MDP receivers maintain a check on the integrity of the sequencing of transport object ids from a source in order to make these requests. This allows MDP to treat the sequence of MdpObjects as a "pseudo-stream" of transmission for which integrity must be maintained.
For some applications in large scale, loosely-controlled data distribution environments, it may be beneficial to have an option to disable this degree of reception integrity checking. The current MDP implementation maintains a window depth for this integrity check before resynchronizing to a source, and the source maintains a finite history of data available for retransmission. These parameters are, or will be, settable in MDP implementations.

Once an MDP_NACK message has been transmitted (or a decision to suppress transmission has been made), the receiver inhibits itself from initiating another repair request cycle for the given source for a period of (4*GRTT) seconds based on the GRTT estimate advertised by the source. This allows time for the repair request to propagate to the source, for the source to aggregate possible MDP_NACKs received from multiple receivers, and for the source responses to the repair request(s) to begin being received.

4.4 Source NACK Aggregation and Repair

Upon receipt of an MDP_NACK for a specific object from a receiver, the source parses and records the RepairRequests and begins a hold-off timeout for a period of (2*GRTT) seconds before it responds to the repair requests. This allows sufficient time to receive and aggregate possible additional RepairRequests from other receivers. The reason for the (2*GRTT) hold-off time is to consider the worst-case condition where a receiver very near the source immediately (random NACK backoff timeout of ZERO seconds) sends an MDP_NACK to begin a repair cycle while a receiver very far from the source (up to possibly one GRTT away in worst-case asymmetry) has a random NACK backoff timeout of GRTT seconds. Note that during this repair response hold-off time, the source will still continue to transmit data for new or other objects pending repair.

The exception to the hold-off timeout is that retransmission of MDP_INFO messages occurs almost immediately upon receipt of the MDP_NACK message. Note, however, that repeat retransmission of duplicate MDP_INFO is restricted to once per (2*GRTT). The source can retransmit the MDP_INFO repair quickly because there is no need to aggregate multiple repair requests to make a determination of what to transmit for maximum efficiency. This added responsiveness of the repair cycle for MDP_INFO messages makes this out-of-band control information potentially useful in the context of reliable multicast session control or for certain types of multicast application data.

Once the repair aggregation hold-off timeout has ended, the source MUST transmit repair information beginning with the lowest ordinal sequence transport object and coding block. It is critical to the convergence of the protocol that the repair transmissions be conducted in this order. Strict adherence to the ordering of repair allows repairs to be conducted within the constraints of source/receiver state buffering. Note that having a good approximation of GRTT lets repairs be conducted in the most efficient and timely manner possible. The NACK process is designed to force the source to "rewind" to the earliest possible repair position in the sequence of transmission so it does not move too far "ahead" of receivers suffering loss.
It is possible that, due to packet loss patterns, processing delays or another anomalies, MDP_NACK messages may arrive "late" from some receivers after repair transmission of a transport object has already begun. The MDP source MUST immediately incorporate these late-arriving repair requests into the actively transmitting object as appropriate and continue with the repair transmissions. The appropriate incorporation of late-arriving requests is to _only_ mark segments or blocks greater than the source's current position in the sequence (segment offset, object_id) of transmission. Then, as needed, receivers will initiate new repair cycles to recover information lost during repair transmissions. An important feature of the MDP protocol is that the source maxi- mizes the use of the parity segments it has calculated for repair transmissions. For example, if during an initial repair cycle for an object, receivers have requested only a portion of the available parity segments, the source will use parity segments from the remaining unused portion for repair transmissions during subsequent repair cycles for the same encoding block. There is a significant gain to this approach since the receiver parity decoding process can fill a certain number of missing data segments (erasures) with any combination of the same number of parity segments. Thus, when multiple repair cycles are required to complete reliable transmis- sion of an encoding block, receivers are freed from the difficulty of requesting an explicit set of parity segments due to lost parity transmission. With a sufficient number of parity segments calcu- lated by the source and nominal packet loss, the source may never need to send the same segments twice, thus maximizing the use of the parity information and minimizing the reception of redundant data among receivers. 4.5 General GRTT Collection Process (without congestion control) To facilitate more efficient protocol operation over different net- work topologies with varying end-to-end delay characteristics, the MDP protocol dynamically collects information and estimates the greatest round trip time (GRTT) packet propagation delay from the source to the other receiver nodes participating in the reliable multicast session. This information is collected and the GRTT is estimated in the following manner. The source periodically transmits an MDP_CMD_GRTT_REQ containing a timestamp (relative to an internal clock at the source). Receivers record the timestamp ("send_time") of the latest MDP_CMD_GRTT_REQ received from a source and the time at which the request was received (recv_time). These times are used by receivers to Adamson, Macker Expires April 2000 [Page 37] Internet Draft The Multicast Dissemination Protocol October 1999 construct a response. When the receiver responds to a MDP_CMD_GRTT_REQ, it embeds a timestamp in the response message calculated with the following formula: "grtt_response" = "send_time" + (current_time - recv_time) where the "send_time" is the timestamp from the last MDP_CMD_GRTT_REQ received from the source and the (current_time - recv_time) is the amount of time differential since that request was received until the receiver generated this response. In the current MDP implementation this "grtt_response" field is contained within MDP_NACK and MDP_ACK messages so that in the general NACK- based operation of the protocol, only receivers sending MDP_NACK messages for repair requests contribute to the estimation of source-to-group GRTT. 
If a receiver is experiencing perfect reception and never NACKs, it will not participate in the GRTT-driven repair process anyway. This allows for relatively efficient and scalable collection of round trip estimates from the pertinent members of the group (those with the worst packet loss). The protocol message formats and code base do support an option to have all or a designated subset of receivers explicitly acknowledge the MDP_CMD_GRTT_REQ messages so that a more accurate estimate of total group GRTT can be collected. This collection method option is utilized when the MDP rate-based congestion control algorithm is enabled.

The source processes the GRTT response by calculating a current round trip estimate for the receiver from whom the response was received using the following formula:

receiver_rtt = current_time - "grtt_response"

During the current GRTT probing interval, the source keeps the peak round trip estimate from the responses it has received. The GRTT estimate is presently filtered to remain conservatively biased towards the greatest receiver RTT measurements received. A conservative estimate of GRTT maximizes the efficiency of redundant NACK suppression and aggregation. The update to the source's estimate of GRTT is done observing the following rules:

1) If a receiver's response round trip calculation is greater than the current GRTT estimate AND the current peak, the response value is immediately fed into the GRTT update filter given below. In any case, the source records the "peak" receiver RTT measurement for the current probe interval.

2) At the end of the response collection period (i.e. the GRTT probe interval), if the recorded "peak" response is less than the current GRTT estimate AND this is the third consecutive collection period with a peak less than the current GRTT estimate, the recorded peak is fed into the GRTT update filter. (Otherwise, Rule #1 was implicitly applied and no new update is required.)

3) At the end of the response collection period, the peak tracking value is set to ZERO if the "peak" is greater than or equal to the current GRTT estimate (i.e. it was already incorporated into the filter under Rule #1), or kept the same if its value is less than the current GRTT estimate AND was not yet incorporated into the GRTT update filter according to Rule #2. Thus, for decreases in the source's estimate of GRTT, the "peak" is tracked across three consecutive probe intervals.

The current MDP implementation uses the following GRTT update filter to incorporate new peak responses into the GRTT estimate:

    if (peak > current_estimate)
        current_estimate = 0.25 * current_estimate + 0.75 * peak;
    else
        current_estimate = 0.75 * current_estimate + 0.25 * peak;

This update method is biased towards maintaining an estimate of the worst-case round trip delay. The GRTT estimate is reduced only after 3 consecutive collection periods with smaller response peaks in order to be conservative where packet loss may have resulted in lost response messages. The reduction is then additionally weighted conservatively using the averaging filter above.
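The timestamp exchange and update filter above can be restated as a pair of C helpers. These are sketches of the formulas given in the text; clock handling and units are simplified assumptions:

    /* Receiver side: build the "grtt_response" timestamp from the last
     * request "send_time" and the local holding time. */
    static double grtt_response(double send_time, double recv_time,
                                double current_time)
    {
        return send_time + (current_time - recv_time);
    }

    /* Source side: fold a new "peak" RTT measurement into the current
     * GRTT estimate with the biased filter shown above. */
    static double grtt_update(double current_estimate, double peak)
    {
        if (peak > current_estimate)
            return 0.25 * current_estimate + 0.75 * peak;  /* grow fast */
        else
            return 0.75 * current_estimate + 0.25 * peak;  /* shrink slow */
    }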
The GRTT collection period (and the period of MDP_CMD_GRTT_REQ transmission) is currently fixed at once per 40 seconds after the source startup phase. During initial source startup, the GRTT collection period is varied from a short period of 5 seconds up to the steady-state collection period of 40 seconds. An algorithm may be developed to adjust the GRTT collection period dynamically in response to the current GRTT estimate (or variations in it) and to an estimation of packet loss. Thus, if the GRTT estimate is stable and unchanging, the overhead of probing with MDP_CMD_GRTT_REQ messages can be reduced, while during periods of variation the GRTT estimate might track more accurately with correspondingly shorter GRTT collection periods.

In summary, although the MDP repair cycle timeouts are based on GRTT, it should be noted that convergent operation of the protocol does not strictly depend on accurate GRTT estimation. The current mechanism has proved sufficient in simulations and in the environments in which MDP has been deployed to date. The estimate provided by the algorithm appears to track the peak envelope of actual GRTT (including operating system effects as well as network delays) even in relatively high loss connectivity. The steady-state probing/update interval may potentially be varied to accommodate different levels of expected network dynamics in different environments.

4.6 Automatic Congestion Control

The MDP design presently has an option for automatic rate-based congestion control of source nodes. The theory of operation is loosely based on concepts presented in [12], [13], and [14]. A major goal of the congestion control approach is to fairly share available network capacity with other ongoing MDP and TCP sessions. MDP transmissions are subject to rate control in its fixed-rate mode of operation. The approach taken to congestion control automatically adjusts an MDP source transmission rate according to feedback it receives from receiver nodes.

MDP uses the model of TCP throughput as described in [15] to establish a goal for its transmission rate. This model estimates the rate at which a TCP source would transmit given estimates of round trip delay, delay variation, and packet loss. This TCP model can be described by the following equation:

                                PacketSize
    B = --------------------------------------------------------------
        RTT*sqrt(2*b*p/3) + T0*min(1, 3*sqrt(3*b*p/8))*p*(1 + 32*p^2)

where:

    B          = Resulting predicted rate in units of bytes per second,
    PacketSize = Nominal transmitted packet size,
    RTT        = Estimate of round trip packet delay in seconds,
    p          = Estimate of packet loss fraction,
    T0         = Applicable TCP retransmission timeout,
    b          = Number of packets acknowledged by a TCP ACK.

The current MDP implementation uses the source "segment_size" plus the overhead of an MDP_DATA message for the "PacketSize" parameter. Measurements of round trip packet delay ("RTT") and delay variation (TCP's "T0" is a function of "RTT" and delay variation) and estimates of packet loss ("p") are obtained from receivers within the group. A fixed value of one is assumed for "b" in the current implementation. A goal rate is established by determining the lowest rate among the different receivers using the equation above with the metrics obtained.
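Expressed as a single C function, the model above (after [15]) might look like the following sketch; the function name and the guard against zero measured loss are our own additions, not part of the MDP code base:

    #include <math.h>

    /* Predicted TCP-fair rate in bytes per second; returns a sentinel
     * when p is zero since the model is undefined for lossless paths. */
    static double tcp_model_rate(double packet_size, double rtt,
                                 double t0, double p, double b)
    {
        if (p <= 0.0)
            return -1.0;  /* caller must special-case no measured loss */
        double f = 3.0 * sqrt(3.0 * b * p / 8.0);
        double denom = rtt * sqrt(2.0 * b * p / 3.0)
                     + t0 * (f < 1.0 ? f : 1.0) * p * (1.0 + 32.0 * p * p);
        return packet_size / denom;
    }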
By using this goal rate and a "linear increase"/"exponential decrease" rate adjustment algorithm (described in detail below), the MDP congestion control algorithm can determine available network capacity and fairly share it with TCP or other transport flows (e.g. other MDP flows) with similar behavior. Simulations and limited empirical tests over networks to date have been used to validate this approach [16].

MDP attempts to maintain "worst path" fairness as described in [17], even under dynamic conditions, by rapidly probing an appropriate subset of the receiver set to determine the current worst-path receiver (according to the above model's predicted rate). The rapid probing allows MDP to quickly take advantage of newly available network capacity and rapidly reduce its transmission rate in the face of congestion. Group size scalability makes it prohibitive to elicit rapid response from the entire receiver set, so the source selects a subset of receivers (a default of 5 in the current implementation) spanning a dynamic estimate of the most significant multicast topology "bottlenecks".

The subset of rapidly probed receivers are termed the congestion control "representatives". The composition of the "representative" set dynamically changes during the course of source transmission based on feedback from the group at large. The current algorithm for selecting and maintaining the congestion control representative set is described below. It is important to note that the MDP probing and rate adjustment algorithms have features to work through periods of intermittent source transmissions and feedback (or lack thereof) from the representative set and group at large (e.g. MDP rapidly reduces its rate when its current "bottleneck" representative is unresponsive to avoid congestion collapse).

At the present time, encouraging results from simulations and limited empirical tests have been obtained using the MDP congestion control approach described in this section. However, multicast congestion control is a complex and still emerging area which will greatly benefit from further study and investigation.

4.6.1 Source Congestion Control Probing

This section describes the technique by which the current MDP implementation transmits congestion control "probes" in the form of MDP_CMD_GRTT_REQ messages. The receivers respond to these probes with information embedded in MDP_NACK and MDP_ACK messages as described previously. At the present time, this probing is conducted in separate MDP_CMD_GRTT_REQ messages, but it is possible that the probing may be aggregated into selected MDP_DATA or MDP_PARITY messages if a corresponding significant reduction in protocol overhead can be realized.

4.6.1.1 Congestion Control Startup

As is often the case, assimilating group state at startup is difficult for multicast protocols due to the desire to minimize feedback and the corresponding cost of state collection and reaction. The startup procedure described here suffers some limitations which could be resolved by some "a priori" configuration (e.g. preloading the representative list with some known "ringers") or another protocol initialization phase. Startup techniques for multicast congestion control (and multicast group communications in general) merit much further study. However, the approach taken in the current MDP implementation (which makes no assumptions about group membership) is presented here.
When the MDP congestion control algorithm is enabled, MDP begins by transmitting MDP_CMD_GRTT_REQ messages at intervals of 1.0, 2.0, 4.0, etc. seconds, as described for fixed-rate operation startup. These initial messages have the MDP_CMD_GRTT_FLAG_WILDCARD flag set, indicating that all receivers in the group should explicitly respond to the MDP_CMD_GRTT_REQ with an MDP_ACK. The "hold_time" in the MDP_CMD_GRTT_REQ is set to 1.0, 2.0, 4.0, etc. seconds respectively, and the receivers should respond to the command within the indicated "hold_time" window, with response times picked from a uniform random distribution. It is understood that these relatively short "hold_time" values at startup _could_ be problematic for cases of large initial group sizes with limited network feedback capacity. The intention of the initial rapid wildcard probing with rapid response is to collect information so MDP can quickly estimate an appropriate transmission rate. The feedback problem created by this current, interim startup algorithm could potentially be solved with side information at startup of an appropriate rate or set of congestion control representatives. Alternatively, an initial, slow startup phase could be added to collect this side information prior to actual data transmission. It is also possible that some form of ACK "suppression" or router-assisted response aggregation might be realized to make "wildcard" probing more scalable. These issues are under investigation.

4.6.1.2 Steady-state Probing

Once the MDP source receives any response from the group, it forms and subsequently maintains a list of congestion control representatives. At this point, the MDP source begins periodically transmitting MDP_CMD_GRTT_REQ messages at a rate of once per its estimate of GRTT. Note that "wildcard" probes are transmitted at the same rate at which MDP_CMD_GRTT_REQ messages are transmitted for fixed-rate operation (currently converging to a steady-state rate of once per 40 seconds) with the corresponding "hold_time" value. (At this time, _all_ receivers explicitly respond to "wildcard" probes. This may limit scalability for extreme group sizes or network conditions. More scalable approaches to "group at large" information collection, e.g. response suppression techniques or router assistance, are under investigation.) The representative list is populated and maintained as described in Section 4.6.3. When the representative list is emptied, the MDP source reduces its transmission rate and reverts to wildcard-only probing until new congestion control representatives are identified.

4.6.2 Receiver Response

MDP receivers respond to congestion control probing with round trip delay timestamps as described in Section 4.5 and an estimate of packet loss fraction obtained from monitoring the "sequence" field in messages received from the corresponding MDP source. It is important to note that the packet loss measurement technique plays an important role in the congestion control algorithm's ability to respond to dynamics in network congestion. MDP receivers currently use a form of a filtered, exponential sliding average to estimate the current packet loss fraction. The source's advertisement in the MDP_CMD_GRTT_REQ message of its current "transmit_rate", bottleneck "rtt", and "loss" is used in the packet loss estimator. (TBD) This algorithm will be described in detail in a future version of this document.
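Since the estimator itself is marked TBD above, the fragment below shows only the general shape of a filtered exponential sliding average of per-interval loss samples; the smoothing weight and structure are placeholders of ours, not the MDP algorithm:

    /* Placeholder sketch only: EWMA of per-interval loss fraction
     * samples derived from gaps in the source "sequence" field. */
    static double loss_update(double smoothed,
                              unsigned lost, unsigned expected)
    {
        const double alpha = 0.1;  /* assumed smoothing weight */
        double sample = expected ? (double)lost / (double)expected : 0.0;
        return (1.0 - alpha) * smoothed + alpha * sample;
    }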
These responses to probing are implicit during usual protocol operation, such as when MDP_NACKs are generated to request repairs or MDP_ACK messages are sent in response to requests for positive acknowledgment. When the source sets the MDP_CMD_GRTT_FLAG_WILDCARD flag, or the local receiver is a member of the advertised representative list, a response is explicitly generated within the "hold_time" specified. It is interesting to note that general congestion control operation is completely the responsibility of the source. The receivers only participate in congestion control as instructed by the source. Thus, it would be easy to design a distributed application which dynamically enables and disables congestion control operation. Also, it might be possible to program the receiver implementation such that only designated receivers respond to congestion control probing. This may allow increased scalability of the current protocol design through intelligent application configuration.

The use of "wildcard" probing is being further examined. It is possible that implicit representative "nomination" through normal MDP_NACKs alone may be sufficient for successful protocol operation in many general cases. Further simulation and study will be conducted in this area.

4.6.3 Source Response Processing and Representative Selection

The TCP analytical model is used to process the responses from receivers containing round trip time (RTT) timestamps and packet loss measurements. First, a current measurement of RTT for the receiver in question is calculated by:

rtt_current = current_time - "grtt_response"

If the source has no state for the receiver in question (i.e. it is not currently a "representative"), this value is used to initialize the estimate of RTT and RTT deviation kept for the receiver. If the receiver _is_ a representative, this value is used to update a smoothed estimate of RTT and RTT deviation with the following algorithm:

    err = rtt_current - rtt_smoothed;
    if (err < 0.0)
    {
        rtt_smoothed  += (1.0/64.0) * err;
        rtt_deviation += (1.0/8.0) * (fabs(err) - rtt_deviation);
    }
    else
    {
        rtt_smoothed  += (1.0/8.0) * err;
        rtt_deviation += (1.0/4.0) * (fabs(err) - rtt_deviation);
    }

Note that this algorithm is similar to the algorithm recommended for the similar state the TCP protocol maintains, except that MDP is more conservative in reducing the RTT and RTT deviation estimates. Examination of statistics from initial simulations of MDP congestion control has indicated that this biased result produces more desirable congestion control behavior. This bias reduces MDP transmit rate fluctuation while maintaining responsiveness to congestion indicated by sudden increases in RTT to bottleneck receivers. The "rtt_smoothed" value is used for the "RTT" parameter in the TCP model and the retransmission timeout value is calculated as:

"T0" = rtt_smoothed + 4.0 * rtt_deviation

Currently, the congestion control representative list maintained by the source is populated with the five receivers with the worst-case rates predicted by the TCP model. The representative with the lowest predicted rate is identified as the current "bottleneck" (worst-path) representative, and that rate establishes a goal rate for the MDP rate adjustment algorithm.
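Tying the pieces together, the per-representative rate prediction can be sketched as below, reusing the illustrative tcp_model_rate() function from the Section 4.6 sketch; only the "T0" formula and the fixed b = 1 come from the text:

    /* Score one representative with the TCP model using its smoothed
     * RTT state and current loss estimate ("b" is fixed at one). */
    static double representative_rate(double rtt_smoothed,
                                      double rtt_deviation,
                                      double loss, double packet_size)
    {
        double t0 = rtt_smoothed + 4.0 * rtt_deviation;
        return tcp_model_rate(packet_size, rtt_smoothed, t0, loss, 1.0);
    }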
Representative election as currently implemented is thus a simple algorithm, and more complex approaches are being considered. For example, it may be possible to use the loss measurements and round-trip estimates collected as individual metrics to correlate receiver "clusters" sharing a common bottleneck and limit the number of representatives per correlation bin. Then, the representative list could be populated with receivers "spread" across multiple candidate bottlenecks in the group's multicast topology. This would allow the MDP source to more quickly identify the worst-path rate as congestion conditions change. Simulations and studies are being conducted in this area. It should also be noted that MDP receivers' suppression of MDP_NACK messages will naturally tend to reduce multiple responses from receiver "clusters" due to their likely correlated loss and close delay proximity. (Ironically, MDP's FEC repair technique, which greatly improves protocol efficiency, tends to reduce suppression based solely on correlated loss patterns, which might otherwise be a reasonably effective identifier of receiver "clusters".)

Representative receivers are removed from the list when they fail to respond to "N" consecutive probes. "N" is a robustness factor to account for probe and/or response loss. The current implementation sets "N" to a value of 5. Removal of receivers from this list makes room for other candidates, and, of course, previous candidates can be quickly restored to the list if responses (implicit or explicit) are later received.

4.6.4 Source Rate Adjustment

The MDP congestion control algorithm can make representative membership adjustments on the same interval at which probing MDP_CMD_GRTT_REQ messages are generated. At each of these intervals, the MDP source evaluates its current representative list and selects the receiver with the lowest rate as the worst-path "bottleneck". This bottleneck rate is used as the goal rate for an adjustment of the source's current transmission rate. The source also checks that the last response recorded for the bottleneck representative is "current" (i.e., the "grtt_req_sequence" echoed in the response has a delta of no more than one from the sequence number of the last MDP_CMD_GRTT_REQ transmitted by the source).

If the goal bottleneck rate is greater than the current source transmission rate and the bottleneck representative's response is "current", the source linearly increases its transmit rate with the following formula:

new_rate = old_rate + (1.0 * "segment_size")

where "new_rate" and "old_rate" are in units of bytes per second and "segment_size" is the MDP source's current "segment_size" setting. Thus the rate is increased by one "segment_size" in bytes per second every GRTT during the representative probing. Note that the "new_rate" is also limited not to exceed the goal rate predicted by the TCP analytical model.

If the goal bottleneck rate is less than the MDP source's current transmission rate, or the bottleneck representative's response is _not_ "current", the source reduces its rate with the following formula:

new_rate = old_rate * 0.75

In this fashion, the rate is exponentially reduced over the course of multiple GRTT intervals. Note that if the bottleneck representative's response _is_ current, the rate reduction is limited to not fall below the goal rate established with the TCP analytical model.
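The adjustment rules above reduce to the following C sketch; the control flow is our paraphrase of the text, not code from the implementation:

    /* One rate adjustment step, performed once per probe interval.
     * Rates are in bytes per second; response_current is nonzero when
     * the bottleneck representative's last response is "current". */
    static double adjust_rate(double old_rate, double goal_rate,
                              double segment_size, int response_current)
    {
        double new_rate;
        if (response_current && goal_rate > old_rate)
        {
            new_rate = old_rate + 1.0 * segment_size; /* linear increase */
            if (new_rate > goal_rate)
                new_rate = goal_rate;   /* capped at the TCP-model rate */
        }
        else
        {
            new_rate = old_rate * 0.75; /* exponential decrease */
            if (response_current && new_rate < goal_rate)
                new_rate = goal_rate;   /* floored at the TCP-model rate */
        }
        return new_rate;
    }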
No rate adjustment is performed when the source is not actively transmitting data. In the current implementation, the MDP source (if kept open by the application) continues to probe during this time and the representative list continues to be updated. However, it may be appropriate for MDP to gradually reduce its probe rate during these periods of data transmission inactivity. Also, it may be desirable for MDP to use a reduced rate, weighted by the period of inactivity, when resuming transmission. The main point here is that the validity of rate prediction from probing without active transmission is questionable, particularly when probing over uncongested connectivity.

4.7 Optional Positive Acknowledgement Process

MDP provides an option for the source application to request positive acknowledgment (ACK) of individual transport objects from a subset of receivers in the group. The list of receivers providing acknowledgement is determined by the source application with "a priori" knowledge of participating nodes and/or is determined during protocol operation from receivers who indicate their ACKing status with a flag in the MDP_REPORT messages each node generates. (Note that this second methodology is only applicable when MDP status reporting is enabled.) Positive acknowledgment can be requested for all transport objects sent by the source or may be applied at certain "watermark" points in the course of transmission of a series (stream) of transport objects.

The ACK process is initiated by the source, which generates MDP_CMD_ACK_REQ messages in periodic "rounds" containing the MdpObjectTransportId identifier of the object to be acknowledged and a list of MdpNodeId identifiers of receiver nodes from which acknowledgement is being requested. The ACK process is self-limiting and avoids ACK implosion in that:

1) Only a single MDP_CMD_ACK_REQ message is generated once per (2*GRTT), and

2) The size of the list of MdpNodeIds from which ACK is requested is limited to a maximum of the source "segment_size" setting per round of the positive acknowledgement process.

The indicated receivers randomly spread their MDP_ACK responses uniformly in time over a window of (1*GRTT). As the source receives responses from receivers, it eliminates them from the message payload list and adds in pending receiver MdpNodeIds, keeping within the "segment_size" limitation of the list size. Each receiver is only queried for a maximum number of repeats (20, by default). Any receivers not responding within this maximum robustness factor are removed from the payload list to make potential room for other receivers pending acknowledgement. The transmission of the MDP_CMD_ACK_REQ is repeated until no further responses are required or until the robustness threshold is exceeded for all pending receivers. The positive acknowledgment process is interrupted in response to negative acknowledgement repair requests (NACKs) received from receivers during the acknowledgment period. The process is resumed once the repairs have been transmitted. Note that receivers will not ACK until they have received complete transmission of all transport objects up to and including the transport object indicated in the MDP_CMD_ACK_REQ message from the source. Receivers will respond to the request with a NACK message if repairs are required.
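The round-based bookkeeping described above might be sketched as follows. The data structure and names are illustrative; only the constants (one request per 2*GRTT, a default of 20 attempts, and the "segment_size" bound on the payload list, omitted here for brevity) come from the text:

    #define ACK_ROBUSTNESS 20  /* maximum queries per receiver (default) */

    struct ack_entry {
        unsigned long node_id;  /* MdpNodeId pending acknowledgement */
        int attempts;           /* MDP_CMD_ACK_REQ rounds sent so far */
        int acked;              /* nonzero once the MDP_ACK is received */
    };

    /* One round (run once per 2*GRTT): returns nonzero if another
     * MDP_CMD_ACK_REQ is still needed for any pending receiver. */
    static int ack_round_pending(struct ack_entry *list, int count)
    {
        int pending = 0;
        for (int i = 0; i < count; i++)
        {
            if (list[i].acked || list[i].attempts >= ACK_ROBUSTNESS)
                continue;       /* done, or dropped after max attempts */
            list[i].attempts++; /* included in this round's payload */
            pending = 1;
        }
        return pending;
    }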
The optional positive acknowledgement process may be further refined in future revisions of the MDP protocol and has undergone only limited operational use to date.

4.8 Buffer Size Considerations

(TBD) MDP is designed to allow sources and receivers to operate within the constraints of limited buffering resources. A complete discussion of issues related to buffering resources will be provided in the future.

4.9 Silent Receiver Operation

MDP supports an option for "silent" receiver, or emission controlled (EMCON), receiver operation. This mode of operation is useful when it is not possible (or desired) for the receiver nodes to transmit, by any means, messages back to the source node. The "auto_parity" feature of the source transmission sequence can be leveraged to provide robust but efficient delivery of data with the robustness tuned to the expected packet loss characteristics of the network media. Additionally, receivers can combine the information from multiple repeat transmissions of transport object data into a complete object.

The primary issue with implementation of this mode of operation is how state and memory are managed at the receiver nodes. In particular, the receivers must have a policy for when to "give up" on reception of an object and free resources for reception of subsequent transmissions. A simple policy was implemented in earlier versions of the MDP protocol. This is being refined in the current MDP implementation to allow support for different, user-defined concepts of silent receiver operation. (TBD) A complete discussion of different concepts of silent receiver operation will be provided in the future.

4.10 Statistics Reporting

(TBD) A complete discussion of the session performance statistics reports generated by participating nodes will be provided in the future.

5.0 Security Considerations

At the time of writing, broad multicast security issues are the subject of research within the IRTF Secure Multicast Group (SMuG). Solutions for these requirements will be standardized within the IETF when ready. However, the current protocol does not preclude the use of application security mechanisms or the use of underlying network security features. For example, the current protocol implementation has an option to use sockets secured with an underlying IPSec implementation on operating systems supporting that feature.

6.0 Future Work and Design Issues

While the present design has been through limited MBone and other operational network use and has been shown to work effectively, there remain design issues which the authors envision will continue to evolve. These include multicast session startup and the dynamic congestion control algorithm. We feel these are general problems and not unique to MDP. While effective reactive congestion control in a multicast environment remains a complex technical design issue, the basic rate control feature in the present design can be activated in a number of environments within the limitations described in this document. We envision potential advantage in applying this protocol framework in combination with a reservation protocol (e.g., RSVP [18]) and future integrated or differentiated services capabilities.
4.8 Buffer Size Considerations (TBD)

MDP is designed to allow sources and receivers to operate within the constraints of limited buffering resources. A complete discussion of issues related to buffering resources will be provided in the future.

4.9 Silent Receiver Operation

MDP supports an option for "silent", or emission-controlled (EMCON), receiver operation. This mode of operation is useful when it is not possible (or desired) for the receiver nodes to transmit, by any means, messages back to the source node. The "auto_parity" feature of the source transmission sequence can be leveraged to provide robust but efficient delivery of data, with the robustness tuned to the expected packet loss characteristics of the network media. Additionally, receivers can combine the information from multiple repeat transmissions of transport object data into a complete object.

The primary implementation issue for this mode of operation is how state and memory are managed at the receiver nodes. In particular, the receivers must have a policy for when to "give up" on reception of an object and free resources for reception of subsequent transmissions; one such policy is sketched below. A simple policy was implemented in earlier versions of the MDP protocol and is being refined in the current MDP implementation to allow support for different, user-defined concepts of silent receiver operation.

(TBD) A complete discussion of different concepts of silent receiver operation will be provided in the future.
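Since the "give up" policy is left user-defined, the sketch below illustrates one simple inactivity-timeout approach of the kind discussed above. The class name and timeout parameter are illustrative assumptions, not behavior defined by the protocol.

   import time

   class SilentReceiverCache:
       # One possible "give up" policy for silent (EMCON) operation:
       # an object whose packets have not been heard for a configured
       # period is abandoned so that its buffers can be reused for
       # subsequent transmissions.

       def __init__(self, inactivity_timeout):
           self.inactivity_timeout = inactivity_timeout  # seconds
           self.last_heard = {}  # object_id -> time of last packet

       def on_object_packet(self, object_id):
           # Any data or parity packet for an object refreshes its
           # timer; repeat transmissions let the receiver fill gaps.
           self.last_heard[object_id] = time.monotonic()

       def reap(self):
           # Give up on objects idle beyond the threshold, freeing
           # their state, and report which objects were abandoned.
           now = time.monotonic()
           expired = [obj for obj, t in self.last_heard.items()
                      if now - t > self.inactivity_timeout]
           for obj in expired:
               del self.last_heard[obj]
           return expired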
4.10 Statistics Reporting (TBD)

A complete discussion of the session performance statistics reports generated by participating nodes will be provided in the future.

5.0 Security Considerations

At the time of writing, broad multicast security issues are the subject of research within the IRTF Secure Multicast Group (SMuG). Solutions for these requirements will be standardized within the IETF when ready. However, the current protocol does not preclude the use of application security mechanisms or the use of underlying network security features. For example, the current protocol implementation has an option to use sockets secured by an underlying IPSec implementation on operating systems supporting that feature.

6.0 Future Work and Design Issues

While the present design has seen limited MBone and other operational network use and has been shown to work effectively, design issues remain which the authors envision will continue to evolve. These include multicast session startup and the dynamic congestion control algorithm. We feel these are general problems and not unique to MDP. While effective reactive congestion control in a multicast environment remains a complex technical design issue, the basic rate control feature in the present design can be activated in a number of environments within the limitations described in this document.

We envision potential advantage in applying this protocol framework in combination with a reservation protocol (e.g., RSVP [18]) and future integrated or differentiated services capabilities. The source rate control setting can then reflect the reserved bandwidth, and protocol timers can be better tuned to operate within average or upper-bound delay expectations. Early simulation results also show that the MDP congestion control approach is effective in allowing multiple multicast flows to dynamically share available capacity. Unicast and multicast traffic isolation methods may then be applied in some scenarios in combination with end-to-end congestion control. Additionally, early study of intermediate-system random early detection and explicit congestion notification (RED/ECN) [19, 20] or other network congestion indicators has shown them to be beneficial to the overall performance of the end-to-end multicast congestion control approach used by MDP.

7.0 Suggested Usage

The present MDP instantiation is seen as a useful tool for reliable data transfer over generic IP multicast services. It is not the intention of the authors to suggest it is suitable for supporting all envisioned multicast reliability requirements. MDP provides a simple and flexible framework for multicast applications with a degree of concern for network traffic implosion and protocol overhead efficiency. As previously described, MDP has been successfully demonstrated within the MBone for bulk data dissemination applications, including weather satellite compressed imagery updates servicing a large group of receivers and a generic reliable web content "push" application.

In addition, this framework approach has some design features making it attractive for bulk transfer in asymmetric and wireless internetwork applications. MDP is capable of operating successfully independent of network structure and in environments with high packet loss, delay, and misordering. Hybrid proactive/reactive FEC-based repair improves protocol performance in some multicast scenarios. A source-only repair approach often makes additional engineering sense in asymmetric networks. MDP's optional unicast feedback mode may be suitable for use in asymmetric networks or in networks where only unidirectional multicast routing/delivery service exists. Asymmetric architectures supporting multicast delivery are likely to make up an important portion of the future Internet structure (e.g., DBS/cable/PSTN hybrids), and efficient, reliable bulk data transfer will be an important capability for servicing large groups of subscribed receivers.

8.0 References

[1] S. Deering, "Host Extensions for IP Multicasting", RFC 1112, August 1989.

[2] A. Mankin, A. Romanow, S. Bradner, V. Paxson, "IETF Criteria for Evaluating Reliable Multicast Transport and Application Protocols", RFC 2357, IETF, June 1998.

[3] J. Macker, W. Dang, "The Reliable Dissemination Protocol", Internet Draft, October 1996, work in progress.

[4] J. Metzner, "An Improved Broadcast Retransmission Protocol", IEEE Transactions on Communications, Vol. COM-32, No. 6, June 1984.

[5] J. Macker, "Integrated Erasure-based Coding for Reliable Multicast Transmission", IRTF RMRG Meeting presentation, March 1997.

[6] J. Macker, "Reliable Multicast Transport and Integrated Erasure-based Forward Error Correction", Proc. IEEE MILCOM 97, October 1997.

[7] D. Grossink, J. Macker, "Reliable Multicast and Integrated Parity Retransmission with Channel Estimation", IEEE GLOBECOM 98, 1998.

[8] B.N. Levine, J.J. Garcia-Luna-Aceves, "A Comparison of Known Classes of Reliable Multicast Protocols", Proc. International Conference on Network Protocols (ICNP-96), Columbus, Ohio, October 29 - November 1, 1996.

[9] H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", RFC 1889, IETF, January 1996.

[10] S. Floyd, V. Jacobson, S. McCanne, C. Liu, and L. Zhang, "A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing", Proc. ACM SIGCOMM, pp. 342-356, August 1995.

[11] S. Lin, D.J. Costello, "Error Control Coding", Prentice Hall, 1983.

[12] M. Handley and S. Floyd, "Strawman Specification for TCP Friendly (Reliable) Multicast Congestion Control (TFMCC)", ISI/LBNL Technical Report for the IRTF RMRG working group, November 1999.

[13] D. DeLucia and K. Obraczka, "Multicast Feedback Suppression Using Representatives", USC/ISI Technical Report, June 1996.

[14] D. DeLucia and K. Obraczka, "A Congestion Control Mechanism for Reliable Multicast", IRTF RMRG Meeting presentation, September 1997.

[15] J. Padhye, V. Firoiu, D. Towsley and J. Kurose, "Modeling TCP Throughput: A Simple Model and its Empirical Validation", Proc. ACM SIGCOMM, 1998, Vancouver, Canada.

[16] B. Adamson, "MDP Congestion Control Update", IRTF RMRG Meeting presentation, June 1999.

[17] B. Whetten and J. Conlan, "A Rate Based Congestion Control Scheme for Reliable Multicast", Technical Report, GlobalCast Communications, October 1998.

[18] R. Braden, Ed., L. Zhang, S. Berson, S. Herzog, and S. Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1 Functional Specification", RFC 2205, IETF, September 1997.

[19] S. Floyd and V. Jacobson, "Random Early Detection Gateways for Congestion Avoidance", IEEE/ACM Transactions on Networking, V.1 N.4, August 1993, pp. 397-413.

[20] S. Floyd and K. Fall, "Router Mechanisms to Support End-to-End Congestion Control", LBL Technical Report, February 1997.

Authors' Addresses

R. Brian Adamson
Newlink Global Engineering Corporation
6506 Loisdale Road, Suite 209
Springfield, VA 22150
(202) 404-1194
adamson@itd.nrl.navy.mil

Joseph Macker
Information Technology Division
Naval Research Laboratory
Washington, DC 20375
(202) 767-2001
macker@itd.nrl.navy.mil