RMT Working Group M.Luby/Digital Fountain Internet Engineering Task Force J.Gemmell/Microsoft INTERNET-DRAFT L.Vicisano/Cisco draft-ietf-rmt-pi-alc-01.txt L.Rizzo/U. of Pisa 13 July 2000 J. Crowcroft/UCL B. Lueckenhoff/Cadence Expires 13 January 2000 Asynchronous Layered Coding A massively scalable reliable multicast protocol Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as a "work in progress". The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. Abstract This document describes Asynchronous Layered Coding, a massively scalable reliable multicast protocol, hereafter referred to as ALC. ALC can be used to reliably deliver content to multiple receivers. The content can be any well-defined object. Examples include any type of file such as a group of pictures in an MPEG stream, a MP3 music file, a JPEG image, and a collection of files that are archived into one file. In addition, the delivery service model that can be provided with ALC is fairly flexible, e.g., an on demand service model and a push service model. A session is initiated by a single sender to transmit packets that contain data about a particular object. Receivers may join or Draft RMT PI, Asynchronous Layered Coding 13 July 2000 leave an existing session in an asynchronous manner independent of other receivers. Reliability is achieved via FEC coding only, i.e. all packets in a session contain FEC coded information about the object to be delivered. The crucial point is that there is no request for retransmission of individual packets from receivers that miss packets in order to assure reliable reception of the object, and the packets and their rate of transmission out of the sender are independent of the number and the individual reception experiences of the receivers. In some delivery service models, e.g, a push delivery model, it is appropriate that receivers send messages to the sender (either in or out of band) to either continue the session if receivers have not yet received enough packets or to terminate the session when receivers have received enough packets. To be able to track usage of the system, receivers can also send messages out of band to the sender that contain statistics on their reception experience. ALC, however, does not require nor specify such messages, with the aim of avoiding unnecessary limitation to the scalability of the protocol. Congestion control is achieved by sending packets in the session to several ALC groups. Receivers increase and decrease their session reception rate in reaction to network conditions by joining and leaving ALC groups associated with the session in a manner that is network friendly, similar to how TCP is network friendly. The congestion control algorithm is receiver driven, i.e., signals are placed into packets by the sender to indicate to receivers how to react to changing network conditions, and receivers adjust their reception rate in response to these signals and packet loss. Thus, each receiver experiences a reception rate appropriate to that receiver independent of other receivers. ALC has the following properties: o To each receiver, it appears as if though there is a dedicated unicast session from the sender to the receiver, where the reception rate adjusts to congestion along the path from sender to receiver in a manner similar to TCP. o To the sender, there is no difference in load or outgoing rate if one receiver is joined to the session or a million (or any number of) receivers are joined to the session, independent of when the receivers join and leave. Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 2] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 o On each link in the network, the packet traffic from the session and its reaction to competing traffic is the same whether there is one receiver or a million receivers beyond the link, and this reaction is similar to how a TCP session reacts. Thus, ALC provides a massively scalable content delivery system that is network friendly. 1. Introduction This document describes a massively scalable reliable protocol for content delivery using IP multicast. IP multicast, while powerful and efficient, is a "best effort" service [DEE88]. It does not guarantee packet reception, or reception order. Many reliable multicast protocols have been built on top of multicast. However, scalability is not a design goal for many reliable multicast protocols. Even among those reliable multicast protocols that target improved scalability, many still fall far short of the scalability of IP multicast itself. Others, while as scalable as IP multicast, require changes to routers or other infrastructure, making their deployment unlikely in the near term. One of the key difficulties in scaling reliable multicast is dealing with the amount of data that flows from receivers back to the sender. Protocols that avoid any such feedback can be massively scalable. The data carousel [AFZ95] approach is a simple protocol that avoids any feedback from the receivers. The sender simply loops repeatedly through the symbols of the object to be delivered, places the symbols into packets, and transmits the packets (a symbol is a small fixed size portion of data). If the receiver misses any symbol, it simply waits for the symbol to be sent again in the next loop. However, a receiver has to wait for the full loop to repeat to receive a symbol that is missed if the packet carrying it is lost. Forward error correction (FEC) coding can be used to improve the data carousel approach [RIZ97a], [GEM99], [VIC98A], [BYE98], [LUB00]. Using a FEC codec, i.e., a FEC encoder at the sender to generate packets from an object and the corresponding FEC decoder at the receivers to process packets in order to recover the object, can dramatically reduce the number of packets a given receiver needs to receive in order to recover the object. ALC relies exclusively on a FEC codec to achieve reliability, thereby avoiding non-scalable reliability mechanisms that rely on receiver feedback to the sender to trigger retransmission of missing packets. A description of FEC codes with packet content specification and considerations for their use in reliable IP multicast can be found in [FEC00]. This document utilizes the terminology and concepts introduced in it. Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 3] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 An attractive feature of ALC is the ability for different receivers to join and leave a session and ALC groups within the session asynchronously without adversely affecting the reception experience of other receivers and without affecting the scalability of the protocol. ALC congestion control is achieved by sending packets associated with a given session to several ALC groups and individual receivers only join to a subset of these groups. The set of ALC groups a receiver joins is determined by the receiver based on signals placed into packets by the sender and by loss measured by the receiver along the path from the sender to that receiver. Receivers that can receive packets at a rate higher than their current rate are allowed to increase their reception rate, and receivers that are receiving packets at a higher rate than they have the capacity for (as evidenced by packet loss) MUST reduce their rate. A detailed description of this type of multi-rate congestion control can be found in [MRCC00]. Another possibility to implement congestion control is through a router- assisted scheme where the selection of which ALC groups to forward is performed by routers. Having routers select which groups to forward allows finer grain congestion control and a faster reaction to network congestion. A limitation of this approach is that it potentially requires changes to the hardware router infrastructure. See [CAI99, LUB99] for a preliminary design description. We do not discuss this approach further in this document. One of the attractions of ALC is that it is multicast routing independent and that it does not require multicast reverse connectivity, i.e. ALC receivers do not send multicast traffic. In particular, ALC works with the original multicast model introduced in [DEE88], which we call Internet Standard Multicast (ISM) in this document, and with the Source Specific Multicast (SSM) model that is based on [HOL99]. The definition of an ALC group used throughout this document is slightly different with ISM and with SSM. When using ISM, packets of an ALC group are sent to a multicast group address G. When using SSM, packets of an ALC group are sent to a channel address (S,G), where S is the IP address of the sender and G is a multicast group address. SSM is more attractive to ALC than ISM for a few reasons. First, ALC may use more than one ALC group in a session, and ALC may be used to deliver a large number of objects in different sessions over time that use different sets of ALC groups for the transmission. With ISM, the multicast group address G that corresponds to an ALC group must be allocated so that it is unique across the Internet. With SSM, the multicast group address G can be allocated locally by the sender with Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 4] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 the only requirement that it is unique to the sender, because it is the (S,G) channel that corresponds to the ALC group that a receiver joins. Second, ALC supports an unlimited number of dynamically changing receivers. Changes in the multicast tree topology with SSM are light weight operations (a new branch from the receiver towards S grows when a receiver joins, and the branch is deleted when the receiver leaves), and with ISM changes can be heavier weight (involving transitions from a (*,G)-tree rooted at an RP to the tree rooted at S). Third, ALC is scalable to an unlimited number of receivers that may span the global Internet. Thus, the light weight mechanisms that SSM uses to cross ISP boundaries (standard BGP+ routing tables) is distinct advantage over the heavier weight mechanisms used by ISM (the MSDP and BGMP protocols, both of which are not needed by SSM). Finally, a receiver joins an ALC group by joining a channel (S,G) with SSM, and thus the receiver will only receive packets sent from the sender S. With ISM, the receiver joins an ALC group by joining a multicast group G, and all packets sent to G, regardless of their origin sender, will be received by the receiver. Thus, SSM has compelling security advantages over ISM for prevention of denial of service attacks. The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [R2119]. 2. Related Documents This specification is based on the "Forward Error Correction" building block specified in [FEC00] and on the "Multi-Rate Congestion Control" building block ([MRCC00], yet to be specified). ALC can use any FEC coder that complies to the specifications in [FEC00] and that is either specified in [FEC00] or in one of its companion documents. ALC refers to specifications and general description provided in [FEC00]. ALC uses FEC without feedback from receivers, in the 'proactive' form described in [FEC00]. The present document complements [FEC00] by providing an actual instantiation header-fields that are described in abstract terms in [FEC00]. ALC does not specify congestion control directly, but relies in the specification given in [MRCC00]. This specification of ALC reserves opaque header fields that can be used by [MRCC00] to transport information related to congestion control. Implementors of ALC MUST also implement [MRCC00] or an equivalent that provide congestion control in accordance to RFC2357 ([MRBP98]). Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 5] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 3. Applicability ALC is intended for reliable delivery of objects. ALC is most applicable for delivery of objects of substantial length, i.e., objects that range in length from hundreds of kilobytes to many gigabytes. ALC is directly applicable to all multicast enabled networks, including asymmetric networks, wireless networks, and satellite networks. Thus, the inherent raw scalability of ALC is unlimited. However, when other specific applications are built on top of ALC, then these applications by their very nature may limit scalability. For example, if an application requires receivers to request out of band information in order to join a session, or an application allows receivers to send requests back to the sender in order to extend an ongoing session, then the scalability of the application is limited by the ability to send, receive, and process this additional data. The particular FEC coder and congestion control protocols used by ALC to provide a complete protocol have an impact on the applicability of ALC. For example, some FEC coders are inherently limited in the size of the object they can encode, and for objects larger than this size the reception overhead on the receivers can grow substantially. As another example, some networks are not amenable to some congestion control protocols that could be used with ALC. In particular, for a satellite or wireless network, there may be no mechanism for receivers to effectively reduce their reception rate since there may be a fixed transmission rate allocated to the session. 4. General Architecture An object transmission session comprises all packets sent to ALC groups from a single sender that pertain to the transmission of a particular object that could potentially be received by a receiver to recover the object. For example, packets pertaining to a particular object could be transmitted from a sender to four ALC groups at different rates. A receiver may join and receive packets from subsets of these four groups concurrently until it has enough packets in total to recover the object. ALC allows for the more general possibility that several senders are each concurrently transmitting packets to a session for the same object. In this case, a receiver may concurrently join several of these sessions in order to receive the object. Since the senders for these sessions may be located in varying parts of the network, the receiver must perform congestion control on each session independently, although the receiver can take advantage of the aggregate set of packets that arrive Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 6] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 through all sessions to reconstruct an object. For example, there could be four senders, each using three ALC groups for a session pertaining to the same object. A receiver could join all four sessions and collect enough packets to reconstruct the object, but the receiver must perform congestion control independently for each of the four sessions. This document does not specify this mode of operation, assuming that it can be implemented by multiple concurrent ALC sessions. Note however that additional coordination among senders is recommended in order to achieve good reception overhead (see [FEC00]). The receivers need to obtain an object transmission session description before starting the download of an object. The session description must include the relevant session parameters needed by a receiver to enact a download of an object from the senders participating in the session. The object transmission session description is determined and agreed upon by the senders and communicated to the receivers out of band, or, in some cases, included or partially included in the header of each packet. The session description could include the object name, the object length, the packet format and length, the FEC codec type, the sender IP address, the multicast address(es) of the multicast groups, their port number(s), and transmission rates. The session description could be in a form such as SDP [HAN98]. We assume that there exists an out of band mechanism for receivers to obtain the session description. The session description might be carried in a session announcement protocol such as SAP [HAN96], located on a Web page with scheduling information, or conveyed via E-mail or other out of band methods. Discussion of session description format, and distribution of session descriptions is beyond the scope of this document. An FEC encoder is used to generate encoding symbols that are placed in the payload of ALC packets. A description of FEC codecs can be found in [FEC00], and the current document relies upon and uses the terminology introduced in it. The FEC coding information is information that needs to be passed to a receiver in order to decode objects. FEC coding information includes the FEC codec type, the source block length, the symbol length, the object length, the encoding block number, the encoding symbol ID, and an encoding flag indicating whether the encoding symbol is a source symbol or a redundant symbol for block codes. The FEC protocol ID specifies the FEC codec type and the way it is to be used for the transmission (see delivery service models below). The object length, source block length and the symbol length are part of FEC object transmission information. The encoding block number (if used) and the encoding symbol ID that identify the encoding symbol in the payload of an ALC packet constitute the FEC payload ID. Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 7] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 Congestion control information must be included in the ALC packet header. The content of this information depends on the type of congestion control used, and [MRCC00] or an equivalent must be used. The type of congestion control to use is also encoded in the ALC header. Some specific congestion control algorithm(s) are defined in [MRCC00]. All ALC packets pertaining to a particular object transmission session MUST have the same format. An ALC packet consists of an ALC header and a payload. Together with other information, the ALC header MUST contain congestion control information, an FEC protocol ID, and an FEC payload ID. Receivers use congestion control information to coordinate the rate at which they receive packets and to measure congestion as indicated by packet loss in order to determine this rate. Other OPTIONAL information that an ALC header may contain is an object transmission label, FEC session information, FEC object transmission information, and authentication information. The object transmission label can be used by receivers to verify that received packets are associated with a particular object transmission session. Therefore, object transmission labels for sessions pertaining to different objects should be different. As an example, the object transmission label may be the MD5 hash of the object name, object length and FEC object transmission information described below. As another example, the object transmission label may be the MD5 hash of the object itself. The ALC packet format described in this document is intended to be used in conjunction to the UDP transport protocol [POST80]. Future version of this document or companion documents MAY specify a different encapsulation for ALC. 5. Minimizing reception overhead The primary purpose of using FEC codes is to ensure that minimal number of packets need be received in order for a receiver to reconstruct an object. Reception overhead is used to measure this. Reception overhead is the aggregate length of packets needed to recover the object beyond the object length, measured as a percentage of the object length. For example, if it takes 15 MB of packets in order to recover a 10 MB object, then the reception overhead is (15-10)/10 times 100, or 50%. The minimal reception overhead possible is 0%. Under ideal conditions, reception overhead is 0% using even the simplest of FEC codes. For example, a simple carousel achieves 0% reception overhead when transmission is over a network with no packet loss using a Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 8] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 single ALC group with receivers that join and stay attached to the group until they have enough encoding symbols to recover the object. However, under more realistic conditions when transmission uses multiple ALC groups, when there is packet loss, and when receivers join and leave groups during the download process, more sophisticated FEC codes are clearly useful to minimize reception overhead. There are three primary contributors to reception overhead, and these guide the design of how to use FEC codes for ALC. These contributors are: (1) redundant encoding symbol reception due to division of the object into blocks; (2) duplicate encoding symbol reception due to fixed length FEC codes; (3) inherent reception overhead of the FEC code. ALC based on small block codes is prone to (1) and (2), ALC based on large block codes is most prone to (2) and (3), and ALC based on large expandable codes is most prone to (3). 5.1. Division into blocks One concern is the order of encoding symbol transmission from different blocks when the object is partitioned into blocks. Suppose the object is partitioned into m blocks of k source symbols each, and any k encoding symbols from a block is sufficient to recover the k source symbols from that block. Ideally, reception of any km encoding symbols is sufficient to recover the entire object. Actually, reception of k encoding symbols for each of the m blocks is necessary to recover the entire object. Thus, it is important to schedule the transmission of symbols so that in the face of most patterns of packet loss receivers can recover the object from reception of as close to km encoding symbols as possible. A RECOMMENDED ordering is to interleave encoding symbols from different blocks. This interleaving can be described in terms of rounds, where in a round m encoding symbols are transmitted, one from each block. A particular sending order in each round is not specified by ALC. However, it is RECOMMENDED that some randomization be performed to eliminate correlation with periodic losses. For example, sending could be performed as follows: In round i, randomly choose a permutation p(i) of the m blocks, and then send the i-th encoding symbol from each of the m blocks in the order determined by p(i). In the following example, the object is split into 3 blocks of 4 source symbols each: Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 9] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ + s00 + s01 + s02 + s03 + s10 + s11 + s12 + s13 + s20 + s21 + s22 + s23 + +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ Source block 0 | Source block 1 | Source block 2 Then the first 4 permutations chosen are p(0) = (1,0,2), p(1) = (0,2,1), p(2) = (2,0,1), p(3) = (0,1,2), and the send order for the first 4 rounds is: +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ + e10 + e00 + e20 + e01 + e21 + e11 + e22 + e02 + e12 + e03 + e13 + e23 + +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ Encoding symbol send order Object division into blocks, and thus application of this scheme, is applicable to all the codes, as the value of k is ultimately limited for all the codes. For example, small block codes use small values of k because of computational limits, and large block codes and large expandable codes use much larger yet still limited values of k because of memory and packet length limitations. This interleaving scheme minimizes the induced reception overhead due to division into blocks for any predefined loss pattern for a given value of k. However, it is likely that the induced reception overhead will be larger and more variable when k is small and when losses are moderate or more. For example, when k = 20 and m = 50, the induced reception overhead with respect to a 10% random packet loss is on average 18%, and will reach over 40% for some receiver among 1,000 receivers. See [BYE98] for a more detailed analysis of this. Thus, the reception overhead will tend to be smaller and less variable when using either large block codes or large expandable codes than when using small block codes. For small block codes and large block codes the limit on the number of encoding symbols n for the codes will limit the number of distinct rounds to n. For large expandable codes, since there is no limit to the number of encoding symbols there can be an unlimited number of rounds. A simple way to choose the permutation in each round is the following, and this is RECOMMENDED when protecting against arbitrary packet loss patterns. At each round, the permutation p(i) is determined by selecting a first block randomly, and the rest of the ordering is the consecutive blocks following this first block, modulo the number of Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 10] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 blocks. For example, if there are m blocks, and block j is selected as the first, the permutation p(i) consists of block j through block m-1, followed by block 0 through block j-1. For more detailed discussion, see [VIC98B,GEM99]. 5.2. Block FEC codes For some FEC codes for a given object consisting of s source symbols there is a limited supply t of encoding symbols that are produced. If the number of encoding symbols to be sent exceeds t then some encoding symbols need to be sent more than once. Thus, it is important to schedule the order of sending encoding symbols in such a way as to avoid as much as possible reception of the same encoding symbol more than once by the same receiver. This applies to ALC based on block codes as these codes produce a limited number of encoding symbols. However, this does not apply to ALC based on large expandable codes, as these codes can produce an essentially unlimited supply of encoding symbols. 5.2.1. A single group In some simple scenarios, receivers will join a single ongoing ALC group, collect enough encoding symbols to decode the object, and then leave the object transmission session. In this situation, in order to avoid as much as possible duplicate reception of packets by such receivers, it is important to send the encoding symbols the second and subsequent times in the same order that they were sent the first time. Thus, for example, a receiver that joins the session towards the end of the transmission of the encoding symbols for the first time can continue receiving packets from the transmission of the encoding symbols for the second time and be sure to receive distinct encoding symbols. However, it is inevitable that the reception overhead due to reception of duplicate encoding symbols can be large for receivers that join and leave the transmission over intervals of time that span the transmission of more than the total supply of encoding symbols. For example, a receiver may join the session at the beginning of the first transmission of encoding symbols and receive e0, e1, ... , e98, e99, leave the session and then rejoin the session again at the beginning of the second transmission of encoding symbols and receive again e0, e1, ... , e98, e99, thus receiving 100 duplicate encoding symbols that provide no benefit to recovering the object. Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 11] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 +-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+ + e0 + e1 + ... + e98 + e99 + ... + e0 + e1 + ... + e98 + e99 + +-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+ first transmission | second transmission 5.2.2. Multiple groups In more general scenarios, in order to recover an object, receivers will join multiple ongoing ALC groups dynamically. These groups may emanate from more than one server and may be running at different rates. Furthermore, receivers may join a given group and rejoin the same group an arbitrary amount of time later (for example, the next day) to complete the recovery. Finally, for congestion control purposes, receivers dynamically change the set of ALC groups they are receiving from based on dynamically changing loss conditions. How to schedule the encoding symbols from a block code among the various ALC groups in order to minimize reception overhead under all of these different conditions is a challenge. Let r = t/s be the ratio of the number of encoding symbols to the number of source symbols. The larger the value of r the easier it is to avoid receiver reception of duplicate encoding symbols, and in particular as r goes to infinity then reception overhead due to this problem can go to zero. There are ordering schemes that keep reception overhead minimal for small value of r under certain restrictions. For example, in the previous subsection a scheme is described that minimizes reception overhead under the restriction that there is only one group and the receiver is able to complete recovery within a time period that encompasses the transmission of t encoding symbols. However, under general conditions the best and simplest scheme for minimizing reception overhead for objects that aren't blocked is the following. For each group, organize the sending of encoding symbols into rounds. In each round, choose independently a random permutation of all t encoding symbols and send the encoding symbols in this order. Using this ordering, it is not hard to show that no matter in which order and at what time a receiver may join and leave groups the induced reception overhead due to reception of duplicate encoding symbols of a receiver is at most on average r*ln(r/r-1)-1. Furthermore, there are examples of receiver behavior that achieves these maximum averages. As examples, when r is two then the induced reception overhead is at most 38.6%, when r is five then the reception overhead is at most 11.5%, and when r is ten then the reception overhead is at most 5.4%. Thus, to achieve a Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 12] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 reasonable overhead the total length of the encoding must be substantially longer than the length of the object. 5.3. Inherent overhead A (n, k) linear block code has no inherent reception overhead, as any k encoding symbols are sufficient to recover all k source symbols. On the other hand, both the large block codes and large expandable codes described in [FEC00] do have a small amount of inherent reception overhead. 6. Delivery service models ALC can support several different delivery service models. Some examples are briefly enumerated here. On demand delivery model. Receivers may join the ongoing object transmission session at their discretion, obtain the necessary encoding symbols to reproduce the object, and then leave the session. For an on demand service model senders typically transmit for some given time period selected to be long enough to allow all the intended receivers to join the session and recover the object. The time period would typically be much larger than the download time for the object to make it convenient for receivers to enact the download at their discretion. For example a popular software update might be sent using ALC for several days, even though a receiver may be able to complete the download in one hour total of connection time, perhaps spread over several intervals of time. Push service model. This is a variant of the on demand delivery model, with the difference that the transmission is intended for a set of loosely synchronized receivers. Receivers join the session at approximatively the same time the sender start the transmission (one way to to this is to have the sender send the required out of band information about the session to the designated list of receivers, and this triggers the receivers to join the session to start the download). Then receivers provide feedback about the status of reception in order to allow the sender to keep the session alive until the last receiver has finished. This specification describes an OPTIONAL lightweight scalable mechanism for receivers to provide in-band feedback in ALC. Other out of band mechanisms can be used, including those that rely on explicit knowledge of the session participants and demand that all of them send explicit notification of the successful reception. Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 13] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 Streaming service model. Typically, receivers join and remain joined to a particular set of ALC groups to receive multiple related objects in consecutive object transmission sessions. For a streaming service model senders typically transmit a fixed number of encoding symbols for each object. This number may depend on network conditions that are obtained using out of band methods. For example, if network conditions are such that at most 33% of the packets are lost over any 15 MB of transmitted packets, then 15 MB of encoding symbols may be transmitted for a 10 MB object to ensure that receivers are able to reassemble the object under the worst loss conditions. A typical application is a streaming real- time MPEG video, where each object consists of a segment of the stream. For each object, the receiver obtains enough ALC packets to recover the object and then discards all subsequent packets associated with the object until packets start arriving for the next object. There are many other delivery service models that ALC can be used for that are not covered above. The description of the many potential applications and the appropriate delivery service model is beyond the scope of this document. With many of these delivery service models substantial additional mechanisms beyond what is described in this document will be needed to support the delivery service model, i.e. the out of band mechanism for delivering object transmission session information to the receivers. This document only attempts to describe the minimal common scalable elements to these diverse applications using ALC as the delivery mechanism. 7. Congestion Control A congestion control algorithm for ALC is specified in a companion document [MRCC00]. 8. Packet Content ALC defines two types of packets: a Data Packet and Request Packet. In this specification of ALC both these packets are transmitted as UDP payload [POST80]. Future versions of this specification or companion documents may remove this limitation. Data packets are sent by the sender(s) to a multicast IP destination address. Request packets are sent by receivers to the unicast IP destination address of the sender (one of them, if there is more than one) or to the unicast address of a session-control node. ALC does not strictly require the presence of Request packets, the only purpose of these is to implement the OPTIONAL "transmission extension" mechanism. Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 14] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 When present, the ALC payload immediately follows the ALC header. In the ALC header, all integer fields are carried in "big-endian" or "network order", that is, most significant byte (octet) first. Bits designated as "padding" or "reserved" MUST by set to 0 by senders and ignored by receivers. Unless otherwise noted, numeric constants are in decimal (base 10). 8.1. Data-Packet content The ALC data-packet header contains the following types of information provided by the sender(s): o Congestion Control Information (mandatory) Used by receivers (or intermediate nodes) to implement congestion control algorithms. o FEC Information This is the instantiation of the abstract fields defined in [FEC00]. This class is further decomposable as follows: o FEC Encoding Identifier (mandatory) Identifies the type of FEC coding scheme being used. o FEC Object Transmission Information (optional) Identifies the parameters associated with the encoding of an object through the FEC coding scheme in use. o FEC Payload Identifier (mandatory) Identifies the content of the data packet for the purpose of decoding. o FEC Encoding Flag (mandatory) Identifies packets containing only source symbols o Object Transmission Label (optional) Used by receivers to de-multiplex different objects being transmitted in a single session and possibly to associate them to Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 15] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 their description provided out of band. o Source Authentication Information (optional) Used by receivers to authenticate the content of the packet. o Object Transmission Status (optional) Used by receivers to implement the OPTIONAL "transmission extension" mechanism. The actual packet format is depicted in Figure 1 below. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Congestion Control Information | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | V | HDR_LEN |FEC Encoding ID| CCID|E| resvd |F|TLL|TIL|R|A|X| +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ | FEC Payload ID (1-:-2 32-bit words) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Object Transmission Label (0,1,2 or 4 32-bit words) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | FEC Object Transmission Information (0-:-3 32-bit words) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |C|r| Residual Object Transmission Time | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Source Authentication Information (variable length) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 1 - ALC Data-packet header layout The ALC data-packet header is made of a fixed part, 64 bits long, followed by a variable part. The fixed part contains congestion control information, general protocol settings and information describing the variable part of the header. The variable part is composed by one or more variable-length fields. The presence of each of those field is determined by the description provided in the fixed part of the header. The length of each field is either constant or determined in the fixed part of the header or defined in the field itself. Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 16] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 Optionally an ALC data-packet header can contain additional header-extension fields intended both for protocol extensions and as a mechanism to extend the length of some of the fields already present in the header. Header extensions fields are described in a separate section. The explanation each field in the packet header is the following. Congestion Control Information (CCI): 32 bits Used for Congestion Control purposes. This field is opaque for the purpose of this specification. [MRCC00] defines its format and usage according to the value of the CCID field. ALC version number (V): 3 bits The ALC version number for this specification is 0. Non-fixed ALC header Length (HDR_LEN): 5 bits Length of the non-fixed portion of the ALC header in units of 32-bit words, starting form the 3rd 32-bits word in the ALC header up to the ALC payload (including header extensions, see below). I.e. this is the total header length excluding the first two 32-bit words. This field can be used for direct access to the beginning of the ALC payload. FEC encoding ID (FEC_ENC_ID): 8 bits Specifies the type of FEC encoding being used and, implicitly, the specification of the decoder [FEC00]. FEC_ENC_ID also determines the internal format of the FTI and FPI fields described below. This field is the instantiation of the "FEC encoding ID" abstract field defined in [FEC00]. Congestion Control ID (CCID): 3 bits Specifies the congestion control algorithm being used and, implicitly, the meaning of the CCI field. This field is the instantiation of the "Congestion Control ID" field defined in [MRCC00]. FEC Encoding flag (E): 1 bit Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 17] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 E = 1 indicates that the packet payload contains only source symbols. E = 0 indicates redundant symbols in the payload. The meaning of this field is not defined for some values of FEC_ENC_ID, when this happens E MUST be set to 0 by the sender. This field is the instantiation of the "encoding flag" abstract field defined in [FEC00]. reserved: 4 bits FEC payload-ID length (F): 1 bit Length of FEC payload ID field (FPI). F = 0 indicates the FPI field is 32-bits long. F = 1 indicates the FPI field is 64-bits in length. Object Transmission Label length (TLL): 2 bits Length of object transmission label field (OTL). An TLL value of 0 means "OTL field not present"; TLL values of 1, 2 and 3 mean 32, 64, 128 bits OTL-field length respectively. FEC-object Transmission-Information length (TIL): 2 bit Length of FEC object transmission information field (FTI) in units of 32-bit words. TIL = y indicates the FTI field is y words in length. TIL = 0 means that the FTI field is not present. Residual Object Transmission Time flag (R): 1 bit R = 1 indicates that the field Residual Object Transmission Time (ROT) is present. R = 0 indicates "no ROT field". Source Authentication flag (A): 1 bit A = 1 indicates that the Source Authentication header field (SAI) is present. A = 0 indicates "no SAI field". Header-extension fields flag (X): 1 bit X = 1 indicates that additional fields, beside the ones specified in this section, are present in the ALC header. X = 0 indicates no additional header fields is present. See the "Header Extension" section for the definition of additional header field. Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 18] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 FEC Payload ID (FPI): 1 -:- 2 x 32 bits The FEC Payload ID field identifies the content of the ALC payload for the purpose of decoding. For example this may be the the ID of the FEC symbol carried in the packet and its encoding block number, if the object is partitioned into blocks. The exact format and interpretation of the FPI field is dependent on the value of FEC_ENC_ID as specified in [FEC00]. The FPI field is the instantiation of the "FEC Payload ID" abstract field defined in [FEC00]. The length of the FEC payload ID field is determined by the F field. Object Transmission Label (OTL): 0,1,2 or 4 x 32 bits Its content is opaque to receivers and is used by them to de- multiplex different objects being transmitted within a single session and possibly to associate the objects to their description provided out of band. In the scope of a session, each label MUST be uniquely associated to a single object, the opposite in NOT REQUIRED. The length of the Object transmission label field is determined by the TLL field. FEC object transmission information (FTI): 0 -:- 3 x 32 bits This field defines the parameters of the FEC coding scheme and of its application to the object being encoded. For example this could include the length of the object and the source block length, in the case of block-codes. The format of this field is dependent on the value of FEC_ENC_ID, as specified in [FEC00]. The FTI field is the instantiation of the "FEC Transmission Information" abstract field specified in [FEC00]. The length of the FTI field is determined by the TIL field. When the FTI field is not present in the ALC header, this information MUST be communicated out of band. Continuation Request flag (C): 1 bit reserved: 1 bit Residual Object Transmission Time (ROT): 30 bits These three fields encode the "Object Transmission Status" information, used to implement the OPTIONAL in-band 'transmission- extension' mechanism. Their presence is controlled by the field R: R = 1 means that they are present. The field ROT Represents the time left until the end of the transmission of the Object, expressed in ms (milliseconds). The field C indicates whether Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 19] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 receivers are allowed to send continuation request using the Request Packet: C = 1 means that they are allowed. Source authentication information (SAI): variable length. This field is used to authenticate the content of the packet, it is a self-descriptive, variable-length field. Its format is defined below. The SAI field is only present if A = 1. Although the ALC packet headers may have different formats, All ALC packets pertaining to the same session MUST have the same header format, i.e. all optional field MUST be either present or not-present in all packets and all variable-length header MUST keep a constant length. Also the FEC encoding type MUST NOT vary within the session, which implies that FEC_ENC_ID MUST NOT vary either. The encoding parameters carried in the FTI field MUST be the same within the same object, but MAY vary across different sessions for different objects, even if the sessions occur sequentially in time using the same set of ALC groups for each session. 8.1.1. Source Authentication Information Field The format of the SAI field is depicted below. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | SAL | SAT | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + . . . Actual Source Authentication Information (variable length) . +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ The explanation of each sub-field is the following. Source Authentication Information length (SAL): 8 bits The length of the whole SAI filed included the SAL sub-field, expressed in multiples of 32-bit words. Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 20] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 Source Authentication Information type (SAT): 8 bits The type of the authentication algorithm being used. This field also determines the structure of the ASA sub-field. Actual Source Authentication Information (ASA): variable length Used by receivers to authenticate the content of the packet. Its internal structure is implicitly defined by the value of the SAT sub-field. Values of the SAT sub-field, their association to an authentication scheme and the format of the ASA sub-field are specified in a separate document. 8.2. Request-Packet content 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | V | HDR_LEN | reserved |TLL|resvd|A|X| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Object Transmission Label (0,1,2 or 4 32-bit words) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |res| Requested Object Transmission Extension | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Receiver Authentication Information (variable length) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ALC Request header layout ALC version number (V): 3 bits The ALC version number for this specification is 0. Non-fixed ALC header Length (HDR_LEN): 5 bits Length of the non-fixed portion of the ALC header in units of 32-bit words, starting form the 3rd 32-bits word in the ALC header up to the end of the request packet. I.e. this is the total length length of the request packet excluding the first two 32-bit Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 21] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 words (request packets do not contain payload). reserved: 17 bits Object Transmission Label length (TLL): 2 bits Length of object transmission label field (OTL). An TLL value of 0 means "OTL field not present"; TLL values of 1, 2 and 3 mean 32, 64, 128 bits OTL-field length respectively. reserved: 3 bits Receiver Authentication flag (A): 1 bit A = 1 indicates that the Receiver Authentication header field (RAI) is present. A = 0 indicates "no RAI field". Additional header-fields flag (X): 1 bit X = 1 indicates that additional fields, beside the ones specified in this section, are present in the ALC header. X = 0 indicates no additional header fields is present. See the "Header Extension" section for the definition of additional header field. Object Transmission Label (OTL): 0,1,2 or 4 x 32 bits This is the Object Transmission Label of the object for which the extension is requested. This field is the copy of the analogous field present in data packets. The OTL field is omitted from Request packets when the analogous field is not present in Data packets. reserved: 2 bit Requested Object Transmission Extension (ROE): 30 bits Represents the requested time until the end of the transmission of the Object, expressed in ms (milliseconds). This value is computed as the Residual Object transmission Time (ROE) plus the extension needed by the receiver. Receiver authentication information (RAI): variable length. This is an optional field used to authenticate the receiver that sends the request. Its structure is identical to the SAI field Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 22] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 structure. Actual authentication mechanisms may be defined in a companion document, similarly to what specified for the SAI field. The RAI field is only present if A = 1. The Request-packet header can be extended using the same mechanism used for data packet headers (see section "Header-Extension Fields"). 8.3. Header-Extension Fields To allow for unanticipated or rarely used additional header fields and to extend the size of some of the predefined fields, the ALC header contains an additional header fields flag "X". If "X" is set to 0 then no additional header fields are included within the ALC header beyond the predefined fields. When additional headers beyond the predefined fields are used, the value of "X" within the ALC header MUST be set to 1. Header-extension fields have the standard format depicted below. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |L| HEL | HET | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + . . . Header Extension Content . +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 5 - format of additional headers The explanation of each sub-field is the following. Last Header Extension (L): 1 bit MUST be set to 1 in the last Header Extension field present in a packet header, MUST be set to 0 in all the others. Header Extension length (HEL): 7 bits Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 23] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 The length of the whole Header Extension field expressed in multiples of 32-bit words. Header Extension type (HET): 8 bits The type of the header extension. This document defines a number of possible types. Additional types may be defined in future version of this specification. Header Extension Content (HEC): variable length The content of the Header Extension. The format of this sub-field depends on the header extension type. The originator of a packet with header extensions SHOULD not leave additional space between the end of the last Header Extension and the beginning of the ALC payload. The recipient of a packet MUST ignore any data present between the end of the last header extension field and the beginning of the ALC payload. The following header extension types are defined: 0 Reserved EXT_CCI = 1 Congestion Control Information extension. This extension field extends the CCI field present in the fixed part of the header. It is used when the congestion control information does not fit in the 32 bits CCI field. EXT_FPI = 2 FEC Payload ID extension. This extension field extends the FPI field of the fixed header. It is used when the FEC payload ID information does not fit in 64 bits (maximum length of FPI). EXT_OTL = 3 Object Transmission Label extension. This extension field extends the OTL field of the fixed header. It is used when the object transmission label does not fit in 128 bits (maximum length of OTL). EXT_FTI = 4 FEC object transmission information extension. This extension field extends the FTI field of the fixed header. It is used when the FEC object transmission information does not fit in 96 bits (maximum length of FTI). Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 24] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 9. Procedures 9.1. Sender Operation Within a ALC session, An ALC sender transmits a sequence of packets containing encoding symbols (either source symbols and/or redundant symbols) addressed to one or more multicast groups which together constitute a session. Transmission rates may be different in different groups. This document does not specify the policy used to place symbols into packets, nor the order in which packets are transmitted, nor the scheduling of packets in multiple groups. Although these issues affect the efficiency of the protocol, they do not affect is the correctness nor the inter-operability between senders and receivers. To address these issues, ALC implementors SHOULD follows the guidelines provided in the introductory sections of this document. If multiple senders are transmitting data about the same object in different sessions, congestion control must be performed independently on each session, although a receiver may benefit by increasing its rate and reliability of reception by participating in more than one session for the same object concurrently. Multiple ALC sessions may transport multiple objects using the same set of multicast groups, either transmitted at different times or intermixed. Typically, the sender(s) continue to send encoding symbols in a session until the transmission is complete. The transmission may be considered complete when some time has expired, a certain number of encoding symbols have been sent, or some out of band signal (from a higher level protocol, perhaps) has indicated completion by a sufficient number of receivers. Feedback through ALC Request packets MAY also be used to determine the end of the session. All encoding symbols in an ALC session MUST be the same size. The sender(s) may determine the symbol size arbitrarily, but must coordinate or use a default symbol size to ensure that all encoding symbols are of equal length. Larger encoding symbols sizes are generally desirable for FEC decoding efficiency reason. ALC does not require that each data packet contains a single symbol, however we believe that this is by far the most common case, hence in the following we will assume it. An additional reason to use large symbol sizes, and hence large packet sizes is to reduce the overhead apported by the combination of IP + UDP + ALC headers. On the other hand, if the packet size is larger than the network's maximum transmission unit (MTU), packets would be fragmented, introducing inefficiency in case of network packet loss. It should also Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 25] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 be considered that the decoding of large object may require the use of secondary storage, for which symbol sizes of 512 bytes or multiples of it are appropriate. For all the above reasons we RECOMMEND to use an ALC payload size of either 512 or 1024 bytes. An ALC sender MUST take part to the implementation of a congestion control strategy in accordance to RFC2357 [MRBP98]. [MRCC00] specifies the details of this. Together with the encoding symbols, an ALC sender MUST provide additional information regarding the symbols being transmitted, the objects provided and the session. The sender MAY also provide additional optional information. Part of this information must be provided within ALC, using the above specified packet headers, part must be provided out of band and part may be provided either out of band or in the ALC headers. An ALC sender MUST provide: Congestion Control Information, consisting on: Identification of the congestion control algorithm being used. This must be provided before or with the start of the session. The sender MUST use the ALC header field CCID for this purpose. It may additionally use out of band mechanisms. Parameter of the Congestion control algorithm (e.g. rates of different layers). This MUST be provided either in the ALC header (part of the CCI field) or out of band, as defined in [MRCC00] for the particular congestion control algorithm being used. Per-packet congestion control information (e.g. sequence numbers to detect packet loss). The CCI field of the data-packet header MUST be used for this purpose. [MRCC00] defines the content and encoding of this information. FEC information FEC encoding Identifier Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 26] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 This must be provided in the FEC_ENC_ID data-header field. FEC Object Transmission Information It MAY be provided in the FTI data-header fields ([FEC00] defines its format). If the FTI field is not present in the data-packet header, the FEC Object Transmission Information MUST be provided out of band for each object prior the start of the object transmission. The content of this is defined by [FEC00] for the FEC_ENC_ID being used. FEC Payload Identifier This MUST be provided for each symbol transmitted, using the FPI data-header field. The format of FPI is defined in [FEC00] for the particular FEC_ENC_ID being used. FEC Encoding flag This is the F flag in the data-packet header. It MUST be set to 1 if the packet contains only source symbols. A sender MUST NOT change the FEC encoder or the congestion control algorithm within a session. An ALC sender MAY also provide: Object Transmission Label Object Transmission Label may be provided either out of band or using the OTL data-header field. If a session is used to transmit multiple objects, than the relative transmission labels MUST be provided in the data-header to enable receivers to demultiplex packets belonging to different objects. Source Authentication Information This has the purpose of authenticating the packet content and is contained in the SAI data-header field. Object Transmission Status This is contained in the ROT data-header field. If the sender(s) provide this information, it can also be required to handle Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 27] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 Request Packets, depending on the value of the C flag. The action that the sender(s) must perform in this case are specified in the "Session Extension" section below. 9.2. Receiver Operation Receivers can operate differently depending on the delivery service model. For example, for an on demand service model receivers may join a session, obtain the necessary encoding symbols to reproduce the object, and then leave the session. As another example, for a push service model a receiver may be tuned in to a continuous session announcement multicast group or channel to determine when objects of interest are scheduled for transmission. Then, at the appropriate time the receiver automatically joins the session to download objects of interest. As another example, for a streaming service model a receiver may be continuously joined to a set of multicast groups to download all objects in sequential session all using the same set of ALC groups. To be able to participate to a session, receivers MUST first obtain the multicast group(s) and UDP port number(s) used for the session. This information is available through some out of band protocol. In certain cases (e.g. when multicast routing protocols inspired to [HOL99] are used) receivers MUST also obtain the IP address of the sender(s). At the moment of joining the session, receivers MUST either o have already identified the congestion control algorithm being used through out of band means; OR o acquire this information through the CCID field of the header first data packet received. In either case receiver MUST implement the congestion control algorithm specified. If a receiver is not able to implement the congestion control algorithm used in the session, it MUST NOT join the session or MUST drop it immediately, if the CCID is learned after having joined the session. When the session is transmitted on multiple multicast groups receivers MUST join it by subscribing to the first group only, unless they have already learned the type of congestion control algorithm through out of band means. Once they know the type of congestion control in use, they are allowed to join other groups in accordance to the congestion control Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 28] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 algorithm itself. This rule has the purpose of preventing receivers from starting at high data rates, if the congestion control algorithm is based on layered transmission on multiple groups. For this to work, the list of groups provided to the receivers MUST be sorted and the first group in the list must be "base layer", if a layered congestion control algorithm is used. To be able to participate to a session, receivers MUST be able to implement the decoder for the FEC encoding in use. If source authentication information is present in data packets, receivers MAY use it to authenticate the packet content. If object transmission status information is present in data packets and the C flag is set, then receivers MAY originate Request packets to extend the transmission of an object. See "Session Extension" section for more details. 9.3. Session Extension ALC defines an OPTIONAL mechanism for the sender(s) to advertise the remaining transmission time of an object in a session. This information MAY be used by receivers to request an extension of the transmission time of the object. A sender MAY use the ROT field in data-packet headers to advertise the remaining transmission time for an object. This time is expressed in milliseconds. A sender MAY additionally set the C flag to 1, indicating that receivers MAY originate requests for transmission extension through the ALC Request packet defined above. If a sender sets the C flag to 1, it MUST be prepared to accept requests for transmission extension and process them. The sender MUST *either* honor the request or immediately indicate that it is not willing to accept further requests for that object, by setting C to 0 in any subsequent data packet transmitted for that object. When a sender decides to honor a request, it MUST immediately reflect this decision back to the receivers by increasing the value of ROT to a value equal or larger than the one present in the request packet. The new value of ROT must be advertised in subsequent data packets transmitted for the object. Failure to do so may result in implosion of receiver-originated requests. Receivers MAY learn the residual transmission time for an object through Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 29] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 the ROT field optionally present in data packets. If the C flag in these data packet is set to 1, then receivers MAY originate request for transmission extension for an object using Request ALC packets. The object(s) for which receivers can generate transmission-extension requests are those whose symbols are transmitted in data packets with C flag set to 1. The association from a data packet to an object is given by the Object Transmission Label (OTL) field in the packet. If this field is not present receivers MAY assume that the session is used to transmit a single object and still originate extension request for that object (provided that the value of the C flag allows it). Receivers MUST NOT originate transmission extension request if the C flag is set to 0 or if this flag is not present in the data-packet header. Similarly to Data packets, ALC Requests are encapsulated in UDP packets and sent to the unicast IP address of the node designated to receive extension request. The UDP destination port to use is the destination port used for multicast data packets, unless otherwise specified through out of band means. By default the destination IP address is the source address used in the Data packets. Out of band mechanism MAY override this default behavior by explicitly designating the node to which extension requests must be sent. Note that when multiple senders are sending to a session and a specific node that accepts requests is not designated, then receivers MAY pick any of the senders as destination of their Request packets. In this case all the senders MUST coordinate in their reaction to requests (e.g. when they increase the residual transmission time or when they decide to set C to 0). A receiver that originates a Request packet for an object whose Transmission Label was advertised in Data packets MUST copy the Object Transmission Label in the OTL field of the Request packet. The Requested Object transmission Extension (ROE) field must be computed as the Residual Object transmission Time (ROE) plus the extension requested by the receiver. Receivers MUST implement feedback-implosion avoidance procedures as follows: o Receivers use the Residual Object transmission Time advertised by the sender(s) to predict whether or not they will be able to recover the object from the packets they have already received and from the packets they can expect to receive in the future. This prediction SHOULD also consider data-rate fluctuations caused by congestion control adaptations. Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 30] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 o When a receiver predicts that the residual object transmission time is not sufficient to successfully recover the object, it MAY schedule the transmission of an extension request at a random time in the future, before the scheduled end of the transmission. o When a receiver has a pending extension request scheduled for transmission, it must keep monitoring the progress of the reception and cancel the pending request if either of the following happens: - The residual object transmission time becomes larger the predicted time needed to complete the reception. - A Data packet for the object of interest is received with the C flag set to 0 or without the C header field. o A receiver MUST cancel pending extension-request when the transmission time of an object is over. The rules stated above are not sufficient to obtain a good implosion prevention in all the cases. For improved performance the following guidelines SHOULD be followed: o Extension requests should be *scheduled* only when the reception of the object is in an advanced status of completion (e.g. more than 50%). This improves the accuracy of the receivers' prediction reducing the chance that an extension is requested uselessly. o The time needed for a Request to suppress pending Request from other receivers is approximatively a packet round trip time (unicast request to the sender and multicast data packets to the receivers). Using random-time scheduling for requests is an effective suppression mechanism only if the the interval from which to select the actual transmission time is much larger than a round trip time. For this reason extension requests should be *scheduled* at least a few seconds before the end of transmission. 10. ALC and Generic Router Assist Router filtering of ALC packets to assist in congestion control is described in [LUB99]. The addresses of the multicast groups or channels can be communicated to the routers and they can do filtering of groups or channels based on congestion. This is one of the reasons why it is good to have the sequence number information in a fixed place in the ALC Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 31] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 packet header, so that the routers can probe this field if necessary (although it may not be necessary). There are some interesting strategies for combining router assisted congestion control with receiver congestion control in such a way that ALC will perform well when only receivers are doing congestion control, and will perform even better with router assist. Describing these strategies is outside the scope of this document. 11. IANA Considerations The identifiers of FEC encoding (FEC Encoding Identifier), congestion control algorithm (Congestion Control Identifier) and source authentication scheme (Source Authentication Information Type) are subject to IANA registering. This issues is better discussed in the building-block specifications (e.g. [FEC00] and [MRCC00]). Building blocks may introduce additional IANA considerations. 12. Intellectual Property Issues Digital Fountain has patents pending for Tornado codes and for LT codes. Digital Fountain has patents pending for congestion control protocols that may be needed to use some of the congestion control protocols described in this document in a commercial product or service. Digital Fountain is willing to provide a royalty free license to the rights it holds that are needed to use the congestion control protocols described in this document if and when these congestion control protocols become part of the IETF standards. 13. Acknowledgments Thanks to Hayder Radha and Vincent Roca for detailed comments on this paper. 14. References [MRCC00] "Congestion Control for multi-rate building block", draft to be written for submission to the IETF. [AFZ95] Acharya, S., Franklin, M., and Zdonik, S., Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 32] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 "Dissemination-Based Data Delivery Using Broadcast Disks", IEEE Personal Communications, pp.50-60, Dec 1995. [BLA94] Blahut, R.E., "Theory and Practice of Error Control Codes", Addison Wesley, MA 1984. [R2119] Bradner, S., Key words for use in RFCs to Indicate Requirement Levels (IETF RFC 2119) http://www.rfc-editor.org/rfc/rfc2119.txt [BYE98] Byers, J.W., Luby, M., Mitzenmacher, M., and Rege, A., "A Digital Fountain Approach to Reliable Distribution of Bulk Data", Proceedings ACM SIGCOMM '98, Vancouver, Canada, Sept 1998. [BYE00] Byers, J.W., Frumin, M., Horn, G., Luby, M., Mitzenmacher, M., Roetter, A., "Improved Congestion Control for IP Multicast Using Dynamic Layers", Digital Fountain research paper, June 2000. [CAI99] Cain, B., Speakman, T., and Towsley, D., "Generic Router Assist (GRA) Building Block, Motivation and Architecture", Internet Draft draft-ietf-rmt-gra-arch-00.txt, a work in progress. [DEE88] Deering, S., "Host Extensions for IP Multicasting", RFC 1058, Stanford University, Stanford, CA, 1988. [R2068] Fielding, R., Gettys, J., Mogul, J. Frystyk, H., Berners-Lee, T., Hypertext Transfer Protocol - HTTP /1.1 (IETF RFC2068) http://www.rfc-editor.org/rfc/rfc2068.txt [GEM99] Gemmell, J., Schooler, E., and Gray, J., "FCast Scalable Multicast File Distribution: Caching and Parameters Optimizations", Technical Report MSR-TR-99-14, Microsoft Research, Redmond, WA, April, 1999. [HAN98] Handley, M., and Jacobson, V., "SDP: Session Description Protocol", RFC 2327, April 1998. [HAN96] Handley, M., "SAP: Session Announcement Protocol", Internet Draft, IETF MMUSIC Working Group, Nov 1996. [HOL99] Holbrook, H., Cheriton, D., "IP Multicast Channels: Experss Support for Large-scale Single-source Applications", ACM SIGCOMM'99 [HOR00] Horn, G., Luby, M., Mitzenmacher, M., "Fair Layered Increase/Decrease Congestion Control", Digital Fountain research paper, June 2000. Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 33] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 [FEC00] Luby, M., Gemmell, J., Vicisano, L., Rizzo, L., Crowcroft, J., Luekenhoff, B., "Reliable Multicast Transport Building Blocks: Forward Error Correction", Internet Draft draft-ietf-rmt-bb-fec-00.txt, a work in progress. [LUB99] Luby, M., Vicisano, L., Speakman, T. "Heterogeneous multicast congestion control based on router packet filtering", presented at RMT meeting in Pisa, March 1999. [LUB00] Luby, M., "An Overview of LT codes", Digital Fountain technical paper, July 2000. [MRBP98] Mankin, A., Romanow, A., Brander, S., Paxson, V., "IETF Criteria for Evaluating Reliable Multicast Transport and Application Protocols," RFC2357, June 1998. [MRCC00] Luby, M., Vicisano, L., Haken, A., "Reliable Multicast Transport Building Block: Multirate Congestion Control", yet to be specified. [POST80] J. Postel, "User Datagram Protocol", RFC768, August 1980. [RIZ97a] Rizzo, L, and Vicisano, L., "Reliable Multicast Data Distribution protocol based on software FEC techniques", Proceedings of the Fourth IEEES Workshop on the Architecture and Implementation of High Performance Communication Systems, HPCS'97, Chalkidiki, Greece, June 1997. [RIZ97b] Rizzo, L., and Vicisano, L., "Effective Erasure Codes for Reliable Computer Communication Protocols", ACM SIGCOMM Computer Communication Review, Vol.27, No.2, pp.24-36, Apr 1997. [RIZ97c] Rizzo, L., "On the Feasibility of Software FEC", DEIT Tech Report, http://www.iet.unipi.it/ luigi/softfec.ps, Jan 1997. [RUB99] Rubenstein, D., Kurose, J. and Towsley, D., "The Impact of Multicast Layering on Network Fairness", Proceedings of ACM SIGCOMM '99. [VIC98A] Vicisano, L., Rizzo, L., Crowcroft, J., "TCP-like Congestion Control for Layered Multicast Data Transfer", IEEE Infocom '98, San Francisco, CA, Mar.28-Apr.1 1998. [VIC98B] Vicisano, L., "Notes On a Cumulative Layered Organization of Data Packets Across Multiple groups with Different Rates", University College London Computer Science Research Note Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 34] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 RN/98/25, Work in Progress (May 1998). A File Transfer using ALC - 'Fcast' While ALC can be used to transfer arbitrary objects, file transfer is expected to be a primary application. This appendix describes a standard for file transfer, called "Fcast". When the object being transmitted is a file, the receiver usually requires "metadata" in addition to the file itself, such as the file name, its creation date, etc. Metadata can be sent a number of ways. For example, it could be part of the object session description or sent as a separate ALC object with a well-known object transmission label. The method described here combines the file with its metadata in a single ALC object. The metadata is logically appended to the end of the file as a trailer and sent as part of the transmission. The file with appended trailer is referred to as the extended file. In order to find the beginning of the trailer, the last 4 bytes in the object indicate the length of the trailer. Including metadata is OPTIONAL, so the length may be zero. Figure 8 shows the layout of the Fcast file transmission. The metadata is appended rather than prepended to the file so that the file length may be corrected by simply truncating the extended file (rather than requiring re-writing). The nature of ALC does not ensure that the start of the file is received before the end, so there is no drawback to using a trailer in terms of latency to get the information. +----------------------------------------------+ | Object | 4-byte checksum | Null padding| +----------------------------------------------+ / \\ | ........... | \\ | .......... / \\ +--------------------------------------+ | File data | Trailer | Trailer Length | +--------------------------------------+ Figure 8 - Fcast file transmission: The ALC object is the file data, followed by the meta-data trailer, followed by the trailer length (in bytes - 4 byte value) Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 35] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 The metadata in the file trailer is stored as ASCII text. It carries the same information as HTTP object metainformation header fields, as defined in RFC 2068 [R2068]. As in the RFC, field names should be followed by a colon, followed by the field value, followed by a CR-LF. The header fields that are RECOMMENDED/OPTIONAL to be supported by all receiving clients are listed below and should be interpreted as per RFC 2068. RECOMMENDED fields: Content-Location (indicates the URI for the file - could be just the file name) Last-modified OPTIONAL fields: Content-Length (indicates the length of the file) Content-Encoding Content-Type Content-Base Content-Language Content-Style-Type Date Expires Any other header field defined in RFC 2068 or subsequent HTTP standards. Example: Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 36] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 Content-Length : 1024 Content-Location : http://www.foo.com/myfile.zip 15. Authors' Addresses Michael Luby luby@digitalfountain.com Digital Fountain 600 Alabama Street San Francisco, CA, USA, 94110 Jim Gemmell jgemmell@microsoft.com Microsoft Research 301 Howard St., #830 San Francisco, CA, USA, 94105 Lorenzo Vicisano lorenzo@cisco.com cisco Systems, Inc. 170 West Tasman Dr., San Jose, CA, USA, 95134 Luigi Rizzo luigi@iet.unipi.it Dip. di Ing. dell'Informazione Universita` di Pisa via Diotisalvi 2, 56126 Pisa, Italy Jon Crowcroft J.Crowcroft@cs.ucl.ac.uk Department of Computer Science University College London Gower Street, London WC1E 6BT, UK Bruce Lueckenhoff brucelu@cadence.com Cadence Design Systems, Inc. 120 Cremona Drive, Suite C Santa Barbara, CA 93117 Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 37] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 16. Full Copyright Statement Copyright (C) The Internet Society (2000). All Rights Reserved. This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet languages other than English. The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns. This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE." Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 38] Draft RMT PI, Asynchronous Layered Coding 13 July 2000 Table of Contents 1 Introduction .................................................... 3 2 Related Documents ............................................... 5 3 Applicability ................................................... 6 4 General Architecture ............................................ 6 5 Minimizing reception overhead ................................... 8 5.1 Division into blocks .......................................... 9 5.2 Block FEC codes ............................................... 11 5.2.1 A single group .............................................. 11 5.2.2 Multiple groups ............................................. 12 5.3 Inherent overhead ............................................. 13 6 Delivery service models ......................................... 13 7 Congestion Control .............................................. 14 8 Packet Content .................................................. 14 8.1 Data-Packet content ........................................... 15 8.1.1 Source Authentication Information Field ..................... 20 8.2 Request-Packet content ........................................ 21 8.3 Header-Extension Fields ....................................... 23 9 Procedures ...................................................... 25 9.1 Sender Operation .............................................. 25 9.2 Receiver Operation ............................................ 28 9.3 Session Extension ............................................. 29 10 ALC and Generic Router Assist .................................. 31 11 IANA Considerations ............................................ 32 12 Intellectual Property Issues ................................... 32 13 Acknowledgments ................................................ 32 14 References ..................................................... 32 A File Transfer using ALC - 'Fcast' .............................. 35 15 Authors' Addresses ............................................. 37 16 Full Copyright Statement ....................................... 38 Luby, Gemmell, Vicisano, Rizzo, Crowcroft, Lueckenhoff [Page 39]