INTERNET-DRAFT                                              John Lazzaro
April 11, 2002                                            John Wawrzynek
Expires: October 11, 2002                                    UC Berkeley

             The MIDI Wire Protocol Packetization (MWPP)

Status of this Memo

This document is an Internet-Draft and is subject to all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

Abstract

The MIDI Wire Protocol Packetization (MWPP) is a general-purpose RTP packetization for the MIDI command language. MWPP is suitable for use in both interactive applications (such as pseudo-wire emulation of MIDI cables) and content-delivery applications (such as MIDI file streaming). MWPP is designed for use over unicast and multicast UDP, and defines MIDI-specific resiliency tools for the graceful recovery from packet loss. A lightweight configuration of MWPP supports efficient use over TCP. MWPP is compatible with the MPEG-4 generic RTP payload format, to support MPEG-4 Audio codecs that accept MIDI control input.

0. Change Log for

o Entire document rewritten and reorganized.

o New system journal protects MIDI Systems commands (Section 5, Appendices B.1-5).

o Several changes to the MIDI Command Section format (Section 3), in response to comments from Dominique Fober, Phil Kerr, and Martijn Sipkema:

   -- New default semantics for MIDI command timestamps; new SDP parameters for customizing timestamps.

   -- Enforced monotonicity of MIDI command timestamps.

   -- Only System Realtime commands are permitted between SysEx command segments.

   -- New termination octets for SysEx command segments.

   -- New method of coding "dropped F7" SysEx construction.

o In response to comments from Dominique Fober, the standard SDP attribute maxptime is now used to request a minimum MWPP sending rate, to simplify clock-skew compensation algorithms (Section 2).

o In response to comments from Dominique Fober, Phil Kerr, and Martijn Sipkema, the new SDP parameter midiport associates an arbitrary integer value with an MWPP stream, to label the MIDI namespace of the stream. Multiple streams can target the same namespace, for MIDI merge and hybrid transport schemes; more commonly, multiple streams will have unique namespaces, to support MIDI applications that use a large number of MIDI channels (Section 2).

o In response to comments from Dominique Fober, recovery journal Chapters C and P no longer redundantly code bank select data, and Chapter C acts to code unpaired registered/non-registered parameter number commands.

o New SDP parameters to customize coverage of the recovery journal (Section 6). New definition for the G bit in the recovery journal (Sections 4 and 5).

o New Acknowledgements section (Section 10).

1. Introduction

The MIDI standard [1] defines a command set that describes sound as a series of events (NoteOn command to start a musical note event, NoteOff command to end a note, etc). The command execution time is not specified in the MIDI command syntax, so that each sub-part of the standard may customize execution time coding to its requirements. For example, the MIDI file format provides an explicit timestamp for each command, but the MIDI wire protocol codes execution time in an implicit way, as the time of arrival of commands on an asynchronous serial line.

This memo describes a general-purpose RTP packetization for the MIDI command set, that is capable of coding MIDI streams whose original execution time encoding takes an implicit or an explicit form. The packetization is suitable for both interactive applications (such as pseudo-wire emulation of MIDI cables) and content-delivery applications (such as MIDI file streaming). The packetization is named the MIDI Wire Protocol Packetization (MWPP), due to its origins in network musical performance research [6].

MWPP is a modular packetization. The simplest form of MWPP uses the MIDI command section (described in Sections 2 and 3) as a complete self-framed RTP payload. This lightweight version of MWPP is suitable for use over reliable transport such as TCP.

MWPP is also suitable for use over unreliable transport such as unicast and multicast UDP. MWPP provides resiliency by inserting a recovery journal section (described in Sections 4 and 5) into each RTP packet. The recovery journal codes the recent history of the stream.

MWPP uses the Session Description Protocol (SDP) to configure stream properties. SDP options (described in Sections 6 and 7) support multiple MIDI namespaces, control of command execution time semantics, fine-grained control of recovery journal coverage, and compatibility with the MPEG-4 generic RTP format [3] [5].

This memo assumes a working knowledge of MIDI networking issues. Readers unfamiliar with the application domain may wish to examine introductory materials [6] [7] [8] before reading this memo.

2. MWPP Packet Format

Figure 1 shows the format of an MWPP packet. An MWPP packet has two or three sections: the RTP header, the MIDI command section, and the optional recovery journal. In Figure 1, vertical space delineates the RTP header and the payload.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       Sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Timestamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             SSRC                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             CSRCs                             |
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   MIDI command section ...                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Recovery journal ...                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 1 -- MWPP packet format

An MWPP packet begins with an RTP header. The marker bit is always set to 1, for compatibility with the MPEG-4 generic payload format [3]. The RTP sequence number increments by one (modulo 65536) for each packet sent. As is standard in RTP, the sequence number is initialized to a randomly chosen value. MWPP does not use header extensions.

The RTP timestamp sets the base timestamp value for the packet. The event times coded in the MIDI command section are specified relative to this timestamp. If the MIDI command section carries no events, the timestamp indicates the instant the RTP packet was encoded.

The RTP timestamp has the units of the SDP rtpmap parameter srate (see Section 6). For example, if srate has a value of 44100 (Hz), two MWPP packets whose base timestamp values differ by 2 seconds have RTP timestamps that differ by 88200.

MWPP RTP timestamps do not necessarily increment at a fixed rate. The timestamps for two sequential RTP packets may be identical, or the second packet may have a timestamp arbitrarily larger than the first packet (modulo 2^32). As is standard in RTP, the timestamp field is initialized to a randomly chosen value.
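
As an informal illustration of the timestamp arithmetic described above, the following Python sketch converts a duration into RTP timestamp units and compares two timestamps modulo 2^32. The function names are hypothetical and are not defined by this memo; the sketch only restates the srate and wraparound rules given in the text.

```python
RTP_TS_MODULUS = 1 << 32  # RTP timestamps are 32-bit unsigned values


def seconds_to_ticks(seconds, srate):
    """Convert a duration in seconds to RTP timestamp units (1/srate s)."""
    return round(seconds * srate)


def add_ticks(timestamp, ticks):
    """Advance an RTP timestamp by a number of ticks, wrapping at 2^32."""
    return (timestamp + ticks) % RTP_TS_MODULUS


def timestamp_advance(earlier, later):
    """Return how far `later` lies ahead of `earlier`, modulo 2^32."""
    return (later - earlier) % RTP_TS_MODULUS


# The example from the text: 2 seconds at srate = 44100 Hz.
two_seconds = seconds_to_ticks(2, 44100)

# The difference is still well defined when the timestamp wraps around.
first = 0xFFFFFF00          # arbitrary base value close to the wrap point
second = add_ticks(first, two_seconds)
elapsed = timestamp_advance(first, second)
```

Because the initial timestamp is chosen at random, a receiver would normally work only with differences such as the value returned by timestamp_advance, never with absolute timestamp values.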

The optional SDP attribute maxptime (defined in [9]) specifies the maximum amount of media time an MWPP packet encodes. The media time of an MWPP packet is the RTP timestamp difference (modulo 2^32) between consecutively sent packets. Applications set maxptime if a minimum rate of RTP packet transmission is required, independent of the source rate of MIDI event data, for the benefit of algorithms performing clock-skew compensation, network latency estimation, and packet loss recovery.

The MWPP payload begins with the variable-length MIDI command section, described in detail in Section 3. The commands encoded in this section reference a single MIDI namespace (16 MIDI channels + MIDI Systems). The SDP rtpmap parameter midiport (see Section 6) associates this namespace with an arbitrary integer value.

Applications may support large MIDI namespaces by creating several MWPP streams, each with a different midiport value. In this case, applications SHOULD independently choose random initial RTP timestamp offsets for each stream, and MAY choose different srate values for each stream. To synchronize the streams, applications SHOULD use the standard RTP synchronization tools [2].

In addition, applications may create several MWPP streams that share the same MIDI namespace, by assigning the same midiport value to each stream. For example, a unicast application may use a UDP stream to send real-time oriented MIDI commands, but use a TCP stream for the reliable transport of MIDI Sample Dump commands. All MWPP streams that share the same midiport value MUST use the same RTP timestamp timebase (SDP srate parameter + initial randomly chosen RTP timestamp offset).

If a stream is configured for resiliency, every MWPP packet includes a variable-length recovery journal section, described in detail in Sections 4 and 5. If a stream is not configured for resiliency, the recovery journal never appears in the MWPP payload.
The SDP rtpmap parameter rj (see Section 6) configures an MWPP stream for resiliency.

3. MIDI Command Section

Figure 2 shows the format of the MIDI command section.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |B|Z|  LEN ...  |                 MIDI list ...                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 2 -- MIDI command section

The MIDI command section begins with a variable-length header. The header field LEN codes the length (in units of octets) of the MIDI list that follows the header.

If the header flag B is 0, the header is one octet long, and LEN is a 6-bit field, supporting a maximum MIDI list length of 63 octets. If B is 1, the header is two octets long, and LEN is a 14-bit field, supporting a maximum MIDI list length of 16383 octets. A LEN value of 0 is legal, and codes an empty MIDI list. If the MIDI list is empty, the RTP timestamp indicates the instant the RTP packet was encoded.

If LEN is nonzero, the MIDI list has the structure shown in Figure 3.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Delta Time 0 (if Z = 1)    |      MIDI Command 0 ...       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Delta Time 1          |      MIDI Command 1 ...       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Delta Time 2          |      MIDI Command 2 ...       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             .....                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Delta Time N          |      MIDI Command N ...       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 3 -- MIDI list structure

If the header flag Z is 1, the MIDI list begins with a complete MIDI command (MIDI Command 0) preceded by a delta time (Delta Time 0).
If Z is 0, the Delta Time 0 field is not present in the MIDI list, and MIDI Command 0 has an implicit delta time of 0.

The MIDI list structure may also optionally encode a list of N additional complete MIDI commands. Each additional command is preceded by a delta time.

The MWPP delta time syntax is a modified form of the MIDI File delta time syntax [1]. MWPP delta times use 1-4 octet fields to encode 32-bit unsigned integers. Figure 4 shows the encoded and decoded forms of delta times. Note that delta time values may be legally encoded in multiple formats; for example, there are four legal ways to encode the zero delta time (0x00, 0x8000, 0x800000, 0x80000000).

   One-Octet Delta Time:

      Encoded form: 0ddddddd
      Decoded form: 00000000 00000000 00000000 0ddddddd

   Two-Octet Delta Time:

      Encoded form: 1ccccccc 0ddddddd
      Decoded form: 00000000 00000000 00cccccc cddddddd

   Three-Octet Delta Time:

      Encoded form: 1bbbbbbb 1ccccccc 0ddddddd
      Decoded form: 00000000 000bbbbb bbcccccc cddddddd

   Four-Octet Delta Time:

      Encoded form: 1aaaaaaa 1bbbbbbb 1ccccccc 0ddddddd
      Decoded form: 0000aaaa aaabbbbb bbcccccc cddddddd

                  Figure 4 -- Decoding delta time formats

MWPP uses delta times to encode a timestamp for each MIDI command. The timestamp for MIDI Command K is the summation (modulo 2^32) of the RTP timestamp and decoded delta times 0 through K. All command timestamps in a packet MUST be less than or equal to the RTP timestamp of the next packet in the MWPP stream (modulo 2^32).

By default, a command timestamp indicates the execution time for the command. The difference between two timestamps indicates the time delay between the execution of the commands. This difference may be zero, coding simultaneous execution.

MIDI sources that use explicit command timestamps, such as the MIDI file format, are simple to transcode into MWPP streams using these default semantics.
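
The header and delta time rules above can be sketched in Python as follows. This is an illustrative sketch, not a normative decoder: the function names are hypothetical, and it assumes that bit 0 in Figures 2 and 4 is the most significant bit of each octet (the usual convention for such diagrams). Splitting the MIDI list into individual commands is outside the scope of the sketch.

```python
def parse_section_header(payload):
    """Return (z_flag, midi_list_length, header_size) for Figure 2.

    Assumption: B is the most significant bit of the first octet and
    Z is the next bit; LEN occupies the remaining 6 or 14 bits.
    """
    first = payload[0]
    b_flag = (first >> 7) & 1
    z_flag = (first >> 6) & 1
    if b_flag == 0:
        return z_flag, first & 0x3F, 1            # one-octet header
    return z_flag, ((first & 0x3F) << 8) | payload[1], 2


def decode_delta_time(data, pos=0):
    """Decode one delta time at data[pos]; return (value, next_pos)."""
    value = 0
    for count in range(4):                        # at most four octets
        octet = data[pos + count]
        value = (value << 7) | (octet & 0x7F)     # append 7 payload bits
        if (octet & 0x80) == 0:                   # top bit clear: last octet
            return value, pos + count + 1
    raise ValueError("delta time longer than four octets")


def command_timestamps(rtp_timestamp, delta_times):
    """Timestamp of command K = RTP timestamp + deltas 0..K (mod 2^32).

    When Z is 0, the caller passes 0 as the implicit first delta time.
    """
    stamps = []
    current = rtp_timestamp
    for delta in delta_times:
        current = (current + delta) % (1 << 32)
        stamps.append(current)
    return stamps
```

For instance, each of the four zero encodings listed in the text decodes to 0 under decode_delta_time, consuming one to four octets respectively.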
MIDI command sources that use implicit command timing, such as the MIDI wire protocol, must be annotated with timestamps as part of the MWPP transcoding process. The hardware and systems environment for an application may dictate a particular approach to timestamps that may not be a good fit for the default MWPP timestamp semantics. To address this issue, MWPP timestamp semantics are configurable, via SDP parameters. Section 6 describes these SDP parameters and their use in MIDI wire protocol transcoding.

As a rule, each MIDI Command field in the MIDI list contains a complete MIDI command, in the binary command format defined in the MIDI standard [1]. In the remainder of this section, we describe exceptions to this rule.

The first MIDI channel command in the MIDI list MUST include a status octet; running status coding, as defined in [1], may be used for all subsequent MIDI channel commands in the MIDI list. As in [1], System Common messages (0xF0 ... 0xF7) cancel running status state, but System Realtime messages (0xF8 ... 0xFF) do not affect running status state.

In the MIDI wire protocol [1], a System Realtime command may be embedded inside of another "host" MIDI command. This syntactic construction is not supported in MWPP: a MIDI Command field in the MIDI list codes exactly one complete MIDI command.

To encode an embedded System Realtime command, senders MUST extract the command from its host, and code it in the MIDI list as a separate command. The host command and System Realtime command SHOULD appear in the same MIDI list. The delta time of the System Realtime command SHOULD result in a command timestamp that encodes the System Realtime command placement in its original embedded position.

Two methods are provided for encoding MIDI System Exclusive (SysEx) commands in the MIDI list.
A SysEx command may be encoded in a MIDI Command field verbatim: an 0xF0 octet, followed by an arbitrary number of data octets, followed by an 0xF7 octet.

Alternatively, a SysEx command may be encoded as multiple segments. The command is divided into two or more SysEx command segments; each segment is encoded in its own MIDI Command field in the MIDI list. MWPP supports segmentation in order to encode SysEx commands that encode information in the temporal pattern of data octets; by encoding these commands as a series of segments, each data octet is associated with a delta time. Segmentation may also be useful in coding very large SysEx commands across several RTP packets.

To segment a SysEx command, first partition its data octet list into two or more sublists; each sublist must contain at least one data octet. To complete the segmentation, add status octets to the head and tail of each sublist, as detailed in Figure 5. Figure 6 shows examples.

    -----------------------------------------------------------
   | Sublist Position  | Head Status Octet | Tail Status Octet |
   |-----------------------------------------------------------|
   |       first       |       0xF0        |       0xF0        |
   |-----------------------------------------------------------|
   |       middle      |       0xF7        |       0xF0        |
   |-----------------------------------------------------------|
   |       last        |       0xF7        |       0xF7        |
    -----------------------------------------------------------

            Figure 5 -- Command Segmentation Status Octets

   Original SysEx command:

      0xF0 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0xF7

   A two-segment segmentation:

      0xF0 0x01 0x02 0x03 0x04 0xF0
      0xF7 0x05 0x06 0x07 0x08 0xF7

   A different two-segment segmentation:

      0xF0 0x01 0xF0
      0xF7 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0xF7

   A three-segment segmentation:

      0xF0 0x01 0x02 0xF0
      0xF7 0x03 0x04 0xF0
      0xF7 0x05 0x06 0x07 0x08 0xF7

   The segmentation with the largest number of segments:

      0xF0 0x01 0xF0
      0xF7 0x02 0xF0
      0xF7 0x03 0xF0
      0xF7 0x04 0xF0
      0xF7 0x05 0xF0
      0xF7 0x06 0xF0
      0xF7 0x07 0xF0
      0xF7 0x08 0xF7

                  Figure 6 -- Example segmentations

The relative ordering of SysEx command segments in a MIDI list must match the relative ordering of the sublists in the original SysEx command. Only System Realtime MIDI commands may appear between SysEx command segments. If the command segments of a SysEx command are placed in the MIDI lists of two or more RTP packets, the segment ordering rules apply to the concatenation of all affected MIDI lists.

The MIDI wire protocol [1] permits a "dropped 0xF7" construction for SysEx commands; in this coding method, the 0xF7 octet is dropped from the end of the SysEx command, and the status octet of the next MIDI command acts both to terminate the SysEx command and start the next command. To encode this construction in MWPP, follow these steps:

o Determine the appropriate delta times for the SysEx command and the command that follows the SysEx command.

o Insert the "dropped" 0xF7 octet at the end of the SysEx command, to form the standard SysEx syntax.

o Code both commands into the MIDI list using the rules above.

o Replace the 0xF7 octet that terminates the verbatim SysEx encoding or the last segment of the segmented SysEx encoding with a 0xF6 command. This substitution informs the receiver of the original dropped 0xF7 coding.

4. Recovery Journal Overview

In this section we introduce the recovery journal, the MWPP resiliency tool for unreliable transport. In Section 5, we define the bitfield format for the recovery journal; in Section 6, we describe SDP parameters for recovery journal configuration.

A MIDI stream sent over MWPP is fragile. Consider an MWPP stream in which one packet codes the start of a trumpet note (via a NoteOn command in the MIDI command section) and a second packet codes the end of the note (via a matching NoteOff command). If the second packet is lost, the trumpet note sustains indefinitely.

One solution to loss recovery is to retransmit lost packets. MWPP over TCP provides resiliency via packet retransmission (at a lower layer of the network stack). However, in some MWPP applications packet retransmission is undesirable. Retransmission adds latency, adding a round-trip time for lost packets; if TCP is used, head-of-line blocking latency is also an issue. Simple retransmission is also unsuitable for multicast applications, due to scaling issues.

A feed-forward approach to resiliency avoids retransmission by using information encoded in the forward packet stream to guide loss recovery. Consider this simple resiliency scheme for stuck notes: if a receiver detects lost RTP packets via sequence number breaks, it issues NoteOff commands for all active notes as a precaution. This scheme solves the problem of notes that sound forever, but the immediate effect on the stream is jarring: the music stops.

The MWPP recovery journal system implements feed-forward resiliency in a more graceful way. Each MWPP packet includes a special section (the "recovery journal") that codes the recent history of the stream. Upon detection of a packet loss, a receiver uses the recovery journal history to guide the stream repair process, fixing long-term problems such as stuck notes while minimizing audible artifacts.

The recovery journal does not code a literal history of the MIDI stream. In general, it is not possible to reconstruct the lost MIDI command stream from the recovery journal contents. Instead, the recovery journal format codes only the information necessary for the graceful recovery from packet loss. This coding strategy trades off generality for bandwidth efficiency [6].

The recovery journal codes the history of the MWPP stream, back to an earlier packet called the checkpoint packet. The size of this checkpoint history (a precise term defined in Appendix A.1) is sent in each recovery journal.
A receiver is able to detect if the checkpoint history is too shallow for a graceful recovery from a particular packet loss incident.

A sender dynamically controls the size of the recovery journal by choosing the checkpoint history depth. The sender does not have other levers for dynamic control, because this memo normatively defines the length and contents of the recovery journal, given the MIDI stream contents and checkpoint history depth (static control is provided via SDP parameters, described in Section 6). Receiver designers rely on the normative nature of the journal definitions to devise recovery algorithms, much as audio and video codec designers rely on normative bitstream definitions to act as a common media language.

Senders may choose a variety of open-loop schemes for choosing a checkpoint history size for each packet: protection of a constant increment of media time, protection of a constant number of packets, maximization of protection for an average payload bandwidth, etc. These schemes share a common problem: if a receiver has sustained too many consecutive lost packets, the checkpoint history of the recovery journal may be too shallow, forcing the receiver to resort to an "ungraceful" recovery method.

A closed-loop approach to checkpoint history management avoids this problem. Senders monitor the last RTP packet received by each receiver, via the "extended highest sequence number received" field in standard RTCP RR packets [2]. If senders do not advance the checkpoint packet to extended sequence number N until all receivers have received an MWPP packet with extended sequence number M >= (N - 1), the depth of the checkpoint history is sufficient for receivers to gracefully recover from an arbitrary packet loss.

We define the term "guaranteed policy" to describe sending algorithms that obey the M >= (N - 1) inequality for the checkpoint packet.
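
The M >= (N - 1) test can be sketched in Python as shown below. The function name, the dictionary of per-receiver reports, and the choice to refuse advancement when no reports are known are all assumptions made for illustration; this memo does not define a particular sender implementation.

```python
def may_advance_checkpoint(candidate_seqnum, highest_received):
    """Return True if the checkpoint may move to extended seqnum N.

    candidate_seqnum: extended sequence number N of the proposed
        checkpoint packet.
    highest_received: maps a receiver identifier (hypothetical key,
        for example an SSRC) to M, the "extended highest sequence
        number received" value from that receiver's latest RTCP RR.
    """
    if not highest_received:
        # Assumption of this sketch: with no feedback at all, the
        # sender conservatively keeps the current checkpoint.
        return False
    return all(m >= candidate_seqnum - 1 for m in highest_received.values())


# Receiver "b" has only reached packet 98, so N = 100 must wait.
reports = {"a": 120, "b": 98}
blocked = may_advance_checkpoint(100, reports)

# Once "b" reports packet 99, every receiver satisfies M >= (N - 1).
reports["b"] = 99
allowed = may_advance_checkpoint(100, reports)
```

Extended sequence numbers are used so that the comparison is not disturbed by wraparound of the 16-bit RTP sequence number.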
A guaranteed policy MAY use the RTCP method described above to implement its sending policy, or MAY use other means of direct feedback from receivers. We reference the guaranteed policy in the definition of the recovery journal bitfield format in Section 5.

The guaranteed policy is multicast compatible, as it may be implemented via standard RTCP RR packets. However, the guarantee is only in effect for a receiver if the sender is aware of the receiver in the session. In practice, this limitation only impacts the start of a stream, as the RTP standard provides several mechanisms for a receiver to sense that a sender is aware of its presence.

5. Recovery Journal Format

This section introduces the structure of the recovery journal, and defines the bitfields of recovery journal headers. Appendices A.2-8 and B.1-5 complete the bitfield definition of the recovery journal; Appendix A.1 provides normative definitions for common terms and bitfield structures used throughout the recovery journal.

The recovery journal has a three-level structure:

o Top-level header.

o Channel and system journal headers. Encodes recovery information for a single MIDI channel (channel journal) and for all MIDI Systems commands (system journal).

o Chapters. Describes recovery information for a single MIDI command type.

Figure 7 shows the top-level structure of the recovery journal. A recovery journal consists of a 3-octet header, optionally followed by a system journal and a list of channel journals.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|A|G|Y|TOTCHAN|   Checkpoint Packet Seqnum    |      ...      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    ... System journal ...     |     Channel journals ...      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             Figure 7 -- Top-level recovery journal format

If the Y bit is set to 1, a system journal follows the recovery journal header. If the A bit is set to 1, the recovery journal ends with a list of (TOTCHAN + 1) channel journals. If A and Y are both zero, the recovery journal only contains the 3-octet header, and is considered to be an "empty" journal.

The S (single-packet loss) bit appears in most recovery journal structures. It helps receivers efficiently parse the recovery journal in the common case of the loss of a single packet. Appendix A.1 defines S bit semantics.

The 16-bit Checkpoint Packet Seqnum field codes the sequence number of the checkpoint packet for this journal. The choice of the checkpoint packet sets the depth of the recovery journal history, as defined in Appendix A.1.

If the choice of the checkpoint packet adheres to the guaranteed policy defined in Section 4, the G ("guaranteed") bit SHOULD be set to 1. If the choice of the checkpoint packet does not adhere to the guaranteed policy, the G bit MUST be set to 0.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S| CHAN  |R|      LENGTH       |P|W|N|A|T|C|M|R| Chapters ...  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 8 -- Channel journal format

Figure 8 shows the structure of a channel journal: a 3-octet header, followed by a list of leaf elements called channel chapters. A channel journal encodes information about MIDI commands on the MIDI channel coded by the 4-bit CHAN header field.

The 10-bit LENGTH field codes the length of the channel journal; the R bit is reserved. The semantics for LENGTH and R fields are uniform throughout the recovery journal, and are defined in Appendix A.1.
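
The header layouts of Figures 7 and 8 can be unpacked as in the following Python sketch. It is illustrative only: the function names and the dictionary output are hypothetical, and the sketch assumes that bit 0 in the figures is the most significant bit of the first octet. The semantics of S, R, and LENGTH are defined in Appendix A.1 and are not interpreted here.

```python
# Chapter bits of the third header octet, in Figure 8 order. The final
# bit of that octet is labeled R in the figure and is ignored here.
CHANNEL_TOC_ORDER = "PWNATCM"


def parse_journal_header(data):
    """Unpack the 3-octet top-level recovery journal header (Figure 7)."""
    flags = data[0]
    return {
        "S": (flags >> 7) & 1,
        "A": (flags >> 6) & 1,
        "G": (flags >> 5) & 1,
        "Y": (flags >> 4) & 1,
        "TOTCHAN": flags & 0x0F,
        "checkpoint_seqnum": (data[1] << 8) | data[2],
    }


def parse_channel_journal_header(data):
    """Unpack the 3-octet channel journal header (Figure 8)."""
    word = (data[0] << 8) | data[1]        # S, CHAN, R, LENGTH
    toc = data[2]                          # one presence bit per chapter
    chapters = [name for i, name in enumerate(CHANNEL_TOC_ORDER)
                if toc & (0x80 >> i)]
    return {
        "S": (word >> 15) & 1,
        "CHAN": (word >> 11) & 0x0F,
        "R": (word >> 10) & 1,
        "LENGTH": word & 0x03FF,
        "chapters": chapters,
    }


def is_empty_journal(header):
    """A journal with A = 0 and Y = 0 holds only the 3-octet header."""
    return header["A"] == 0 and header["Y"] == 0
```

When the A bit is 1, a parser built on this sketch would expect TOTCHAN + 1 channel journals after the optional system journal, as stated in the text above.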

The third octet of the channel journal header is the Table of Contents (TOC) of the channel journal. The TOC is a set of bits that encode the presence of a chapter in the journal. Each chapter contains information about a certain class of MIDI channel command:

o Chapter P: MIDI Program Change (0xC)

o Chapter W: MIDI Pitch Wheel (0xE)

o Chapter N: MIDI NoteOff (0x8), NoteOn (0x9)

o Chapter A: MIDI Poly Aftertouch (0xA)

o Chapter T: MIDI Channel Aftertouch (0xD)

o Chapter C: MIDI Control Change (0xB)

o Chapter M: MIDI Parameter System (part of 0xB)

Chapters appear in a list following the header, in order of their appearance in the TOC. Appendices A.1-8 describe the bitfield format for each chapter, and define the conditions under which a chapter type MUST appear in the recovery journal. If any chapter types are required for a channel, an associated channel journal MUST appear in the recovery journal.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|D|V|Q|E|X|      LENGTH       |      System chapters ...      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 9 -- System journal format

Figure 9 shows the structure of the system journal: a 2-octet header, followed by a list of system chapters. System chapters code information about a specific class of MIDI Systems command:

o Chapter D: Song Select (0xF3), Tune Request (0xF6), Reset (0xFF)

o Chapter V: Active Sense (0xFE)

o Chapter Q: Sequencer State (0xF2, 0xF8, 0xF9, 0xFA, 0xFB, 0xFC)

o Chapter E: MTC Tape Position (0xF1, 0xF0 0x7F 0xcc 0x01 0x01)

o Chapter X: System Exclusive (all other 0xF0)

If header bits D, V, Q, or E are set to 1, one chapter for each chapter type whose associated bit is set appears in a list following the header. The chapter ordering follows the ordering of chapter header bits in the header bitfield.
If header bit X is set to 1, one or more Chapter X bitfields appear at the end of the chapter list.

Appendices B.1-5 describe the bitfield format for the system chapters, and define the conditions under which a chapter type MUST appear in the recovery journal. If any system chapter type is required to appear in the recovery journal, the system journal MUST appear in the recovery journal.

6. Session Description Protocol for RTP Transport

This section describes Session Description Protocol (SDP) [9] definitions for MWPP transport directly over RTP. Section 7 describes the SDP definitions for MWPP transport over the MPEG-4 generic RTP payload format.

The MIME name for this packetization is mwpp. The SDP rtpmap attribute is declared as:

   a=rtpmap: <payload type> mwpp/<srate>/<midiport>/<rj>

The integer parameter <srate> codes the sampling rate used for the RTP timestamp field, and has the units of Hz.

The integer parameter <midiport> codes an arbitrary identification number for the MIDI namespace (16 MIDI channels + MIDI Systems) coded by an MWPP stream. See Section 2 for details on midiport usage.

The binary parameter <rj> codes the presence (1) or absence (0) of the recovery journal section in MWPP packets.

For example, the following lines bind the packetization to dynamic payload number 96, and specify an srate of 44100 Hz, a midiport value of 56, and the presence of a recovery journal in each RTP packet:

   m=audio 5004 RTP/AVP 96
   c=IN IP4 169.229.60.64
   a=rtpmap: 96 mwpp/44100/56/1

The following lines set up 32 channels of MIDI data over two MWPP streams. As is standard in RTP/AVP, each stream has its own UDP port number. Each stream has a unique midiport value, coding the independence of the MIDI namespaces of the two streams.

   m=audio 5004 RTP/AVP 96
   c=IN IP4 169.229.60.64
   a=rtpmap: 96 mwpp/44100/40/1
   m=audio 5006 RTP/AVP 97
   c=IN IP4 169.229.60.64
   a=rtpmap: 97 mwpp/44100/41/1

The following lines set up 16 channels of MIDI transport over two MWPP streams.
Note that both streams share the same midiport value, coding that the streams share the same MIDI namespace.

   m=audio 5004 RTP/AVP 96
   c=IN IP4 169.229.60.64
   a=rtpmap: 96 mwpp/44100/67/1
   m=audio 5006 RTP/AVP 97
   c=IN IP4 169.229.60.64
   a=rtpmap: 97 mwpp/44100/67/1

MWPP defines SDP format parameters to customize the semantics of the recovery journal. The presence of channel and system chapters in the recovery journal is controlled by the normative text in Appendices A.1-8 and B.1-5. These appendices use the MUST keyword to specify the conditions under which a chapter must appear in the recovery journal. The SDP format parameters chmay, chnever, and chmust act to change the inclusion conditions for chapters.

The chmay parameter changes the MUST keyword conditional for chapter inclusion into a MAY. The chnever parameter specifies chapter types that must never appear in the recovery journal. The chmust parameter reaffirms the default MUST keyword for a chapter; this parameter simplifies the SDP for complex recovery journal configurations.

The chmay, chnever, and chmust parameters use the following syntax:

   <parameter>=[optional comma-separated channel list,][chapter list];

The channel list specifies the channel journals for which this parameter applies; if no channel list is provided, the parameter applies to all channel journals. The chapter list specifies the channel and system chapters for which this parameter applies, using a concatenated list of one or more upper-case letters corresponding to the chapter types. The channel list is irrelevant for system chapters. Multiple assignments to these parameters have a cumulative effect, and are applied in the order of parameter appearance.
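
The cumulative, in-order application of chmay, chnever, and chmust can be sketched in Python as follows. The sketch is an assumption-laden illustration rather than a normative parser: the function names are hypothetical, channels are assumed to be numbered 0-15 (matching the 4-bit CHAN field), and the chapter letters are taken from the lists in Section 5. The parameter string is the one used in this section's example.

```python
KEYWORDS = {"chmay": "may", "chnever": "never", "chmust": "must"}
CHANNEL_CHAPTERS = "PWNATCM"      # channel chapter letters (Section 5)
SYSTEM_CHAPTERS = "DVQEX"         # system chapter letters (Section 5)
ALL_CHANNELS = range(16)          # assumption: channels numbered 0-15


def parse_chapter_params(params):
    """Return a list of (name, channels_or_None, chapter_letters)."""
    result = []
    for item in params.split(";"):
        name, _, value = item.strip().partition("=")
        name = name.strip()
        if name not in KEYWORDS:
            continue                           # skip unrelated parameters
        fields = [f.strip() for f in value.split(",")]
        channels = [int(f) for f in fields[:-1]]
        result.append((name, channels or None, fields[-1]))
    return result


def chapter_policy(params):
    """Apply the assignments in order; every chapter starts as "must"."""
    channel = {ch: {c: "must" for c in CHANNEL_CHAPTERS}
               for ch in ALL_CHANNELS}
    system = {c: "must" for c in SYSTEM_CHAPTERS}
    for name, channels, letters in parse_chapter_params(params):
        for letter in letters:
            if letter in system:
                system[letter] = KEYWORDS[name]    # channel list ignored
            elif letter in CHANNEL_CHAPTERS:
                targets = ALL_CHANNELS if channels is None else channels
                for ch in targets:
                    channel[ch][letter] = KEYWORDS[name]
    return channel, system


channel, system = chapter_policy("chnever=WTA;chmay=14,15,N;chmust=12,W;")
```

Because later assignments override earlier ones, the chmust=12,W item restores pitch wheel protection on channel 12 after chnever=WTA removed it on every channel.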
For example, the following format commands remove protection for poly and channel aftertouch commands on all channels, weaken note command protection for channels 14 and 15, and remove pitch wheel protection for all channels except channel 12:

   a=fmtp: 96 chnever=WTA;chmay=14,15,N;chmust=12,W;

MWPP also defines SDP format parameters to configure timestamp semantics for the MIDI command section. The tsmode parameter indicates the timestamp mode, and takes on one of three symbolic values:

   o tsmode = comex. This mode selects the default semantics defined in Section 3. The octpos, mperiod, and linerate parameters (described below) may not be used in this mode.

   o tsmode = async. The MWPP stream transcodes a MIDI source with implicit "time of arrival" time coding. The MWPP sender attaches nominally accurate timestamps to each MIDI command that code the time of arrival. The octpos and linerate parameters may be used with this mode; if these parameters do not appear, their values are considered undefined.

   o tsmode = buffer. The MWPP stream transcodes a MIDI source with implicit "time of arrival" time coding. The MWPP sender examines the MIDI source at periodic intervals, and uses the same timestamp value for encoded commands received in the interval. The octpos, mperiod, and linerate parameters may be used with this mode; if these parameters do not appear, their values are considered undefined.

We now describe the secondary format parameters octpos, mperiod, and linerate.

The octpos parameter associates a timestamp with the first (octpos = first) or last (octpos = last) octet of the MIDI command field. If tsmode = buffer, octpos indicates whether a command split across multiple intervals uses the timestamp of the first interval or the last interval in which its octets appear.

The mperiod parameter sets the periodic interval for tsmode = buffer. The mperiod parameter is an integer with units of microseconds.
The linerate parameter describes the underlying bandwidth of the MIDI source. The linerate parameter is an integer with units of nanoseconds, and codes the time extent of one MIDI octet in the MIDI source medium. For example, a standard MIDI cable has a linerate value of 320000 nanoseconds (10 bits on the wire per octet, at 31,250 baud).

The following format commands set up a buffer mode session for a MIDI cable, with a 1 ms sampling interval, and end-of-command timestamps:

   a=fmtp: 96 tsmode=buffer;linerate=320000;octpos=last;mperiod=1000;

7. Session Description Protocol for MPEG-4 generic transport

This section describes Session Description Protocol (SDP) definitions for the MPEG-4 generic RTP payload format [3] [4] [5]. Note that MWPP as defined in this memo creates valid MPEG-4 generic RTP packets; only SDP customization is necessary.

The MIME name for this packetization is mpeg4-generic. The SDP rtpmap attribute is declared as:

   a=rtpmap: <payload type> mpeg4-generic/<srate>/<midiport>/<rj>

The definitions of srate and rj are identical to the descriptions in Section 6.

All format parameters defined in Section 6 are supported. In addition, mpeg4-generic uses format parameters for transport configuration, as shown below:

   a=fmtp: <payload type> streamtype=5; profile-level-id=15; mode=mwpp;

To signal SingleSL mode, we omit the ConstantSize and SizeLength format parameters from the fmtp command. If the MPEG-4 audio codec requires configuration data to be sent via SDP, AudioSpecificConfig() may be added.

8. Security Considerations

Cryptographic authentication of incoming RTP and RTCP packets is highly recommended when using MWPP. Without such protections, attackers could forge MIDI commands into an ongoing stream, potentially damaging speakers and eardrums. An attacker could also craft RTP and RTCP packets to exploit known bugs in the client, and take effective control of a client machine.

9. Congestion Control

MWPP has congestion control issues that are unique for an RTP audio packetization.
In certain applications such as network musical performance [6], the packet rate is linked to the gestural rate of a human performer. MWPP implementations SHOULD sense the MIDI wire protocol stream for command patterns that result in excessive packet rates, and filter these streams as part of MWPP to reduce the packet rate.

10. Acknowledgements

We thank the networking, media compression, and computer music community members who have contributed to the MWPP standardization effort, including Steve Casner, Robin Davies, Dominique Fober, Philippe Gentric, Phil Kerr, Young-Kwon Lim, Colin Perkins, Larry Rowe, Dave Singer, and Martijn Sipkema.

Appendix A.1. Recovery Journal Definitions

In this Appendix, we define the terminology and the coding idioms that are used in the recovery journal bitfield descriptions in Section 5 (journal header structure), Appendices A.2-8 (channel journal chapters) and Appendices B.1-5 (system journal chapters).

These descriptions assume that the recovery journal resides in an RTP packet with sequence number I ("packet I") and that the Checkpoint Packet Seqnum field in the top-level recovery journal header refers to a packet with sequence number C. Sequence number algorithms defined for the recovery journal system use modulo 2^16 arithmetic.

Several bitfield coding idioms appear throughout the recovery journal system, with consistent semantics. Most recovery journal elements begin with an "S" (Single-packet loss) bit. S bits are designed to help receivers efficiently parse through the recovery journal hierarchy in the common case of the loss of a single packet.

The default value of the S bit is 1. An S bit for a recovery journal element in packet I is set to 0 if the element encodes data about a MIDI command stored in the MIDI command section of packet I - 1.
If an element has its S bit set to 0, all higher-level recovery journal elements that contain it also have S bits that are set to 0, including the top-level recovery journal header (Figure 7 in Section 5).

Other coding idioms that appear with consistent semantics throughout the recovery journal system are described below.

   o R flag bit. R flag bits are reserved for future use by MWPP. Senders MUST set R bits to 0; receivers MUST ignore R bit values.

   o LENGTH field. All fields named LENGTH (as distinct from LEN) code the number of octets in the structure that contains the field, including the header it resides in and all hierarchical levels below it. This definition simplifies parsing, as receivers may skip over the entire structure with an addition operation.

We now define normative terms used to describe recovery journal semantics.

   o Checkpoint history. The checkpoint history of a recovery journal is the concatenation of the MIDI command sections of packets C through I - 1. The last MIDI command in the MIDI command section for packet I - 1 is considered the most recent command; the first MIDI command in the MIDI command section for packet C is the oldest command. A checkpoint history with no MIDI commands is considered to be empty. The checkpoint history never contains the MIDI command section of packet I (the packet containing the recovery journal), so if C == I, the checkpoint history is empty by definition.

   o Session history. The session history of a recovery journal is the concatenation of MIDI command sections from the first packet of the session up to packet I - 1. The definitions of MIDI command recency and history emptiness are the same as in the checkpoint history. The session history never contains the MIDI command section of packet I, and so the session history of the first packet in the session is empty by definition.

   o Active commands (default).
     For most types of MIDI commands, an active MIDI command is defined to be a MIDI command that does not appear before one of the following MIDI commands in the session history: System Reset (0xFF), General MIDI System Enable (0xF0 0x7E 0xcc 0x09 0x01 0xF7), General MIDI System Disable (0xF0 0x7E 0xcc 0x09 0x00 0xF7). A few types of MIDI commands use a modified meaning of active (see below).

   o Active commands (NoteOn, NoteOff, Poly Aftertouch). For MIDI NoteOn, NoteOff, and Poly Aftertouch commands, an active MIDI command is defined to be a MIDI command that does not appear before one of the following MIDI commands in the session history: System Reset (0xFF), General MIDI System Enable (0xF0 0x7E 0xcc 0x09 0x01 0xF7), General MIDI System Disable (0xF0 0x7E 0xcc 0x09 0x00 0xF7), MIDI Control Change number 123 (All Notes Off) or 120 (All Sound Off).

   o Active commands (MIDI Control Change). For MIDI Control Change commands, an active MIDI command is defined to be a MIDI command that does not appear before one of the following MIDI commands in the session history: System Reset (0xFF), General MIDI System Enable (0xF0 0x7E 0xcc 0x09 0x01 0xF7), General MIDI System Disable (0xF0 0x7E 0xcc 0x09 0x00 0xF7), MIDI Control Change number 121 (Reset All Controllers).

The chapter definitions in Appendices A.2-8 and B.1-5 reflect the default recovery journal behavior of MWPP. The chmay, chmust, and chnever SDP parameters modulate these definitions, as described in Section 6.

Finally, we note that channel journals only encode information about MIDI commands appearing on the MIDI channel the journal protects. All references to MIDI commands in Appendices A.2-8 should be read as "MIDI commands appearing on this channel."

Appendix A.2. Chapter P: MIDI Program Change

A channel journal MUST contain Chapter P if an active Program Change (0xC) command appears in the checkpoint history. Figure A.2.1 shows the format for Chapter P.
    0                   1                   2
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|   PROGRAM   |C| BANK-COARSE |F|  BANK-FINE  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure A.2.1 -- Chapter P Format

The chapter has a fixed size of 24 bits. The PROGRAM field indicates the program value of the most recent Program Change command in the checkpoint history.

By default, bits 8-23 of Chapter P are set to 0. However, if an active Control Change (0xB) command for controller 0 (Bank Select Coarse) appears before this Program Change command in the session history, the C bit is set to 1, and the BANK-COARSE field is set to the 7-bit data value for the most recent Control Change command for controller 0. The F bit and BANK-FINE field code the Control Change command for controller 32 (Bank Select Fine) in an identical manner.

Appendix A.3. Chapter W: MIDI Pitch Wheel

A channel journal MUST contain Chapter W if an active MIDI Pitch Wheel (0xE) command appears in the checkpoint history. Figure A.3.1 shows the format for Chapter W.

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|    FIRST    |R|   SECOND    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       Figure A.3.1 -- Chapter W Format

The chapter has a fixed size of 16 bits. The FIRST and SECOND fields are the 7-bit values of the first and second data octets of the most recent active Pitch Wheel command in the checkpoint history.

Appendix A.4. Chapter N: MIDI NoteOff and NoteOn

In this Appendix, we consider NoteOn commands with zero velocity to be NoteOff commands.

A channel journal MUST contain Chapter N if an active MIDI NoteOn (0x9) or NoteOff (0x8) command appears in the checkpoint history. Figure A.4.1 shows the format for Chapter N.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |B|     LEN     |  LOW  | HIGH  |S|   NOTENUM   |Y|  VELOCITY   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|   NOTENUM   |Y|  VELOCITY   |             ....              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   BITFIELD    |   BITFIELD    |     ....      |   BITFIELD    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure A.4.1 -- Chapter N Format

Chapter N codes the most recent active NoteOn or NoteOff reference to a MIDI note number in the checkpoint history. Chapter N consists of a 2-octet header, followed by at least one of the following data structures:

   o A list of note logs to code NoteOn commands.

   o A NoteOff bitfield structure to code NoteOff commands.

The note log list MUST contain an entry for all note numbers whose most recent checkpoint history appearance is in a NoteOn command, except in cases where 128 note logs would be required (Chapter N codes a maximum of 127 note logs). The NoteOff bitfield structure MUST contain a set bit for all note numbers whose most recent checkpoint history appearance is in a NoteOff command. A note number is never coded in both structures.

The header for Chapter N, reproduced in Figure A.4.2, codes the size of the note list and bitfield structures.

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |B|     LEN     |  LOW  | HIGH  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       Figure A.4.2 -- Chapter N Header

The 7-bit LEN field codes the number of 2-octet note logs in the note list. Zero is a valid value for LEN, and codes the empty note list.

The 4-bit LOW and HIGH fields code the number of NoteOff bitfield octets that follow the note log list. LOW and HIGH are unsigned integer values. If LOW is less than or equal to HIGH, there are (HIGH - LOW + 1) NoteOff bitfield octets in the chapter.
An empty NoteOff bitfield structure is coded by setting LOW to 15 and HIGH to 0.

The B bit is set to 1 if the MIDI command section of packet I - 1 does not include a NoteOff command for this channel. The B bit, like the S bit (Appendix A.1), helps receivers efficiently parse recovery journals in the common case of the loss of a single packet.

We now describe the 2-octet note log structure, reproduced in Figure A.4.3.

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|   NOTENUM   |Y|  VELOCITY   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Figure A.4.3 -- Chapter N Note Log

The 7-bit NOTENUM field codes the note number for the log; a note number may not be represented by multiple note logs in the note list. The 7-bit VELOCITY field codes the velocity value for the most recent NoteOn command for the note number in the checkpoint history. VELOCITY is never zero; NoteOn commands with zero velocity are coded as NoteOff commands in the NoteOff bitfield structure.

The note log does not code the execution time of the NoteOn command; however, the Y bit codes information about the execution time. The Y bit is set to 1 if MIDI Command N in the MIDI command section of packet I is considered to be simultaneous with the NoteOn command coded by the note log. If the MIDI command section contains no events, Y is set to 1 if a hypothetical MIDI command occurring at the RTP timestamp time would be considered simultaneous. The definition of simultaneity is implementation dependent.

We now describe the NoteOff bitfield structure. A NoteOff bitfield octet codes NoteOff information for eight consecutive MIDI note numbers, with the MSB representing the lowest note number. The MSB of the first bitfield octet codes the note number 8*LOW; the MSB of the last bitfield octet codes the note number 8*HIGH. A set bit codes a NoteOff command for the note number; Chapter N does not code NoteOff velocity data.
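The Chapter N layout described above (2-octet header, LEN note logs, then HIGH - LOW + 1 bitfield octets) can be walked with a few shifts and masks. The following Python sketch is illustrative only: the function name decode_chapter_n and the dictionary keys are hypothetical, and the sketch assumes bit 0 in the figures is the most significant bit of each octet and performs no validation of the input.

```python
def decode_chapter_n(data):
    """Decode one Chapter N structure from a bytes object.

    Returns (b_bit, note_logs, note_offs): note_logs is a list of dicts
    for NoteOn commands, note_offs a list of note numbers whose bit is
    set in the NoteOff bitfield structure.
    """
    b_bit = data[0] >> 7            # B flag
    length = data[0] & 0x7F         # LEN: number of 2-octet note logs
    low = data[1] >> 4              # LOW: first bitfield octet index
    high = data[1] & 0x0F           # HIGH: last bitfield octet index

    note_logs = []
    pos = 2
    for _ in range(length):
        first, second = data[pos], data[pos + 1]
        note_logs.append({
            "s": first >> 7,
            "notenum": first & 0x7F,
            "y": second >> 7,
            "velocity": second & 0x7F,
        })
        pos += 2

    # LOW > HIGH (for example LOW = 15, HIGH = 0) codes an empty bitfield.
    n_octets = high - low + 1 if low <= high else 0
    note_offs = []
    for i in range(n_octets):
        octet = data[pos + i]
        for bit in range(8):        # bit 0 here is the MSB of the octet
            if octet & (0x80 >> bit):
                note_offs.append(8 * (low + i) + bit)
    return b_bit, note_logs, note_offs


# B = 1, LEN = 1, LOW = 7, HIGH = 8; NoteOn for note 60 at velocity 100;
# NoteOff bits set for notes 57 and 64.
chapter = bytes([0x81, 0x78, 0xBC, 0x64, 0x40, 0x80])
print(decode_chapter_n(chapter))
```

A receiver would normally check the length of the enclosing channel journal before indexing into the buffer; that check is omitted here to keep the field arithmetic visible.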
In the most efficient coding for the NoteOff bitfield structure, the first and last octets of the structure contain at least one set bit.

Appendix A.5. Chapter A: MIDI Poly Aftertouch

A channel journal MUST contain Chapter A if an active Poly Aftertouch (0xA) command appears in the checkpoint history. Figure A.5.1 shows the format for Chapter A.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|     LEN     |S|   NOTENUM   |R|  PRESSURE   |S|   NOTENUM   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |R|  PRESSURE   |                     ....                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure A.5.1 -- Chapter A Format

The chapter consists of a 1-octet header, followed by a variable length list of 2-octet note logs. A note log MUST appear for a note number if an active Poly Aftertouch command for the note number appears in the checkpoint history. A note number may not be represented by multiple note logs in the note list.

The 7-bit LEN field codes the number of note logs in the list, minus one. Figure A.5.2 reproduces the note log structure of Chapter A.

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|   NOTENUM   |R|  PRESSURE   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Figure A.5.2 -- Chapter A Note Log

The 7-bit PRESSURE field codes the pressure value of the most recent Poly Aftertouch command in the checkpoint history. The MIDI note number for this command is coded in the 7-bit NOTENUM field.

Appendix A.6. Chapter T: MIDI Channel Aftertouch

A channel journal MUST contain Chapter T if an active MIDI Channel Aftertouch (0xD) command appears in the checkpoint history. Figure A.6.1 shows the format for Chapter T.
    0
    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |S|  PRESSURE   |
   +-+-+-+-+-+-+-+-+

   Figure A.6.1 -- Chapter T Format

The chapter has a fixed size of 8 bits. The 7-bit PRESSURE field holds the pressure value of the most recent active Channel Aftertouch command sent on this channel.

Appendix A.7. Chapter C: MIDI Control Change

A channel journal MUST contain Chapter C if an active Control Change (0xB) command appears in the checkpoint history (excepting controller numbers 0, 6, 32, 38, 96, 97, 98, 99, 100, and 101). In certain cases (defined later in this Appendix) this rule also applies to the excepted controller numbers. Figure A.7.1 shows the format for Chapter C.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|     LEN     |S|   NUMBER    |A|  VALUE/ALT  |S|   NUMBER    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |A|  VALUE/ALT  |                     ....                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure A.7.1 -- Chapter C Format

The chapter consists of a 1-octet header, followed by a variable length list of 2-octet controller logs. The list MUST contain an entry for a controller number if an active Control Change command for the number appears in the checkpoint history (excepting numbers 0, 6, 32, 38, 96, 97, 98, 99, 100, 101, 124, 125, 126, and 127). In certain cases (defined later in this Appendix) this rule also applies to the excepted controller numbers.

The 7-bit LEN field codes the number of controller logs in the list, minus one. A controller number may not appear in multiple controller logs in the list. Figure A.7.2 reproduces the controller log structure of Chapter C.
    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|   NUMBER    |A|  VALUE/ALT  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure A.7.2 -- Chapter C Controller Log

The 7-bit NUMBER field identifies the controller number. The 7-bit VALUE/ALT field codes recovery information for the most recent Control Change command for this number in the checkpoint history.

Chapter C provides three tools for coding recovery information for a command in the VALUE/ALT field: the value tool, the toggle tool, and the count tool. Implementations may choose among the tools to code a Control Change command.

In the value tool, the 7-bit VALUE field codes the control value of the most recent Control Change command for this controller number. This tool works best for controllers that code a continuous quantity, such as number 1 (Modulation Wheel). If the value tool is chosen, the A bit is set to 0.

The A bit is set to 1 to code the toggle or count tool. These tools work best for controllers that code discrete actions. Figure A.7.3 shows the controller log for these tools.

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|   NUMBER    |1|T|    ALT    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure A.7.3 -- Controller Log for ALT Tools

The T flag is set to 1 to code the toggle tool; T is set to 0 to code the count tool. Both methods use the 6-bit ALT field as an unsigned integer.

The toggle tool works best for controllers that act as on/off switches, such as 64 (Hold Pedal). These controllers code the "off" state with control values 0-63 and the "on" state with 64-127. The ALT field codes the total number of toggles (off->on and on->off) due to Control Change commands in the session history. Toggle counting is performed modulo 64, and the controller is assumed to be off at the start of a session.

The Hold Pedal controller illustrates the benefit of the toggle tool over the value tool for switch controllers.
As often used in piano applications, the "on" state of the Hold Pedal lets notes resonate, while the "off" state immediately damps notes to silence. The loss of the "off" command in an "on->off->on" sequence results in ringing notes that should have been damped silent. The toggle tool lets receivers detect this lost "off" command but the value tool does not.

The count tool is similar to the toggle tool, but is optimized for controllers whose value octet is ignored, such as 123 (All Notes Off). For the count tool, the ALT field codes the total number of Control Change commands in the session history. Command counting is performed modulo 64, and the command count is set to 0 at the start of the session.

We now describe normative coding rules for the controller numbers that are excepted from the general rules presented in the beginning of this Appendix. For each excepted controller number, we define the conditions under which a controller log MUST appear in Chapter C for the controller number. By extension, these conditions imply that Chapter C MUST appear in the recovery journal.

If active Control Change commands for controller numbers 0 (Bank Select Coarse) or 32 (Bank Select Fine) appear in the checkpoint history, the most recent commands for these numbers MUST appear as entries in the controller list only if the data values for these commands are not coded in the BANK-COARSE (0) or BANK-FINE (32) fields of Chapter P (Appendix A.2) for the channel journal. This rule avoids redundant coding in Chapters C and P.

Several controller number pairs are defined to be mutually exclusive. Controller numbers 124 (Omni Off) and 125 (Omni On) form a mutually exclusive pair, as do controller numbers 126 (Mono) and 127 (Poly).
If active Control Change commands for one or both members of a mutually exclusive pair appear in the checkpoint history, one and only one controller log MUST appear in the controller list to code the pair. This controller log MUST code the controller number of the most recent Control Change command of the pair.

Appendix A.8 defines Chapter M, the MIDI Parameter chapter, to provide resiliency for the MIDI registered/non-registered parameter system. Here, we define the Chapter C rules for coding Control Change commands related to the registered/non-registered parameter system. These Chapter C rules serve to minimize redundancy with Chapter M.

Control Change commands for controller numbers 6 and 38 (Data Slider) and 96 and 97 (Data Button) may be used as part of the parameter system, or may be used as general-purpose controllers. Control Change commands for controller numbers 6, 38, 96, or 97 that appear in the checkpoint history, and that are used in the parameter system, MUST NOT appear as entries in the controller list. However, if active Control Change commands for controller numbers 6, 38, 96, or 97 appear in the checkpoint history, and these commands are used as general-purpose controllers, the most recent general-purpose command instance for each of these numbers MUST appear as an entry in the controller list.

A parameter system transaction begins with paired Control Change commands for numbers 98 and 99 (Non-Registered Parameter LSB and MSB) or 100 and 101 (Registered Parameter LSB and MSB). Chapter M codes these paired Control Change commands. The Chapter C rule below acts to code "unpaired" commands for these controller numbers, which appear in the checkpoint history if a (98, 99) or (100, 101) pair is split across the MIDI command sections of two MWPP packets.
If the most recent active Control Change command for controller 98, 99, 100, or 101 in the checkpoint history is part of a (98, 99) or (100, 101) command pair that begins a parameter system transaction, the command MUST NOT appear in the controller list. However, if the most recent active Control Change command for controller 98, 99, 100, or 101 in the checkpoint history does not form part of a (98, 99) or (100, 101) command pair, an entry MUST appear in the controller list.

Appendix A.8. Chapter M: MIDI Parameter System

A channel journal MUST contain Chapter M if an active Control Change command that forms part of an initiated parameter system transaction (as defined below) appears in the checkpoint history.

We begin by defining the terms "parameter system", "parameter system transaction", and "initiated parameter system transaction" as used in this Appendix.

   o Parameter system. This phrase refers to a MIDI feature that provides two sets of 16,384 parameters to augment the Control Change controller number space. The Registered Parameter Names (RPN) system and the Non-Registered Parameter Names (NRPN) system each provide 16,384 parameters.

   o Parameter system transaction. The values of RPNs and NRPNs are changed by a series of Control Change commands that form a transaction. A transaction begins with two Control Change commands to set the parameter number (controller numbers 98 and 99 for NRPNs, controller numbers 100 and 101 for RPNs). The transaction continues with an arbitrary number of Data Entry (controller numbers 6 and 38) and Data Button (controller numbers 96 and 97) Control Change commands to set the parameter value. The transaction ends with a second pair of (98, 99) or (100, 101) Control Change commands. These terminal commands are considered a part of the transaction.
     In addition, the terminal commands may start a second parameter system transaction; in this case, these commands belong to two transactions.

   o Initiated parameter system transaction. An initiated parameter system transaction is a transaction whose (98, 99) or (100, 101) initial active Control Change command pair appears in the session history.

Under certain conditions, unpaired active Control Change commands for controller numbers 98, 99, 100, or 101 are coded in Chapter C, as described in Appendix A.7.

Figure A.8.1 shows the variable-length format of Chapter M.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|P|N|R|R|R|      LENGTH       |   Transaction log list ...    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure A.8.1 -- Top-level Chapter M Format

Chapter M consists of a 2-octet header, followed by a list of transaction log entries. The 10-bit LENGTH field codes the length of Chapter M, and conforms to the semantics described in Appendix A.1.

If an active Control Change command that forms part of an initiated parameter system transaction appears in the checkpoint history, a log entry for the transaction MUST appear in the transaction list. The relative order of transaction list entries MUST reflect the relative position of parameter transactions in the session history: the first log entry codes the most recent parameter transaction in the history, the second log entry codes a transaction that appears before the first parameter transaction in the history, etc.

The P header bit is set to 1 if an active Control Change command pair to terminate the first RPN transaction in the log list does not appear in the session history. The N header bit has the same role for the first NRPN transaction in the log list.

Figure A.8.2 shows the structure of a transaction log.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|T|       PARAM-NUMBER        |      KEY      |   DATA ...    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              ...              |      KEY      |   DATA ...    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure A.8.2 -- Transaction Log Structure

The transaction log consists of a 2-octet header, followed by a compressed enumeration of the Control Change commands for controller numbers 6, 38, 96, and 97 for this transaction in the session history. The presence of Control Change commands to terminate the transaction log is coded implicitly by the P and N header bits of the top-level chapter format (Figure A.8.1).

A transaction log header codes the parameter identity. If T is set to 1, the log codes an NRPN parameter; if T is set to 0, the log codes an RPN parameter. The 14-bit PARAM-NUMBER header field codes the parameter number.

The KEY and DATA fields that follow the log header encode the compressed enumeration of the Control Change commands for numbers 6, 38, 96, and 97. The ordering of this enumeration matches the ordering of commands in the transaction: the first transaction command appears as the first command in the enumeration, the second transaction command appears as the second command in the enumeration, etc.

KEY and DATA fields always appear in pairs in the transaction log; at least one KEY-DATA pair MUST appear in a transaction log, even if no Control Change commands need to be coded. The KEY field has a fixed 1-octet size, and acts as a directory for the KEY-DATA pair; the DATA field has a variable size of 0-3 octets. Figure A.8.3 shows the format of the KEY octet.

    0
    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |S|M|IN1|IN2|IN3|
   +-+-+-+-+-+-+-+-+

   Figure A.8.3 -- Key Octet

The two-bit fields IN1, IN2, and IN3 code the appearance and meaning of the first, second, and third DATA octets that may follow the KEY octet.
The IN fields code the following information (values shown in binary):

   o IN_k = 00. The DATA octet for this position is not present. The permitted placements of the 00 value are: IN1 = IN2 = IN3 = 00 (no DATA octets follow the KEY octet), IN2 = IN3 = 00 (one DATA octet follows the KEY octet), IN3 = 00 (two DATA octets follow the KEY octet).

   o IN_k = 01. Indicates an active Control Change command for controller number 6 (Data Entry Slider Coarse); the DATA octet codes the third octet of the Control Change command.

   o IN_k = 10. Indicates an active Control Change command for controller number 38 (Data Entry Slider Fine); the DATA octet codes the third octet of the Control Change command.

   o IN_k = 11. Indicates one or more active Control Change commands for controller number 96 (Data Button Increment) and/or 97 (Data Button Decrement), without an intervening Control Change command 6 or 38. The DATA octet codes the cumulative effect of the Data Button commands, as a two's complement 8-bit value: controller 96 commands increment the value by 1, controller 97 commands decrement the value by 1.

The M flag is 1 if another KEY octet follows the DATA octet(s). If M is 0, another transaction log may follow the DATA octet(s), or the DATA octet(s) may mark the end of Chapter M, depending on the LENGTH field of the top-level Chapter M header shown in Figure A.8.1.

In comparison with other recovery journal chapters, Chapter M is inefficient: each transaction for a parameter number in the checkpoint history is listed in the transaction list, and each Control Change command for a transaction is enumerated in a transaction log. This design decision trades off recovery journal size for design simplicity. In practice, parameter system commands rarely appear in MIDI streams, and this design decision does not have a significant impact on MWPP bandwidth requirements.

Appendix B.1. System Chapter D: Reset, Song Select, Tune Request

The system journal MUST contain Chapter D if an active MIDI Reset (0xFF), MIDI Tune Request (0xF6), or MIDI Song Select (0xF3) command appears in the checkpoint history. Note that General MIDI reset commands are coded in Chapter X (Appendix B.5), not in Chapter D. Figure B.1.1 shows the variable-length format for Chapter D.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|E|T|G|R|R|R|R|               Command logs ...                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure B.1.1 -- System Chapter D Format

The chapter consists of a 1-octet header, followed by one or more command logs. Header flag bits indicate the presence of command logs for the Reset (E = 1), Tune Request (T = 1), and Song Select (G = 1) commands. Command logs appear in a list following the header, in the order that their flag bits appear in the header.

Figure B.1.2 shows the 1-octet command log format for the Reset and Tune Request commands.

    0
    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |S|    COUNT    |
   +-+-+-+-+-+-+-+-+

   Figure B.1.2 -- Command Log for Reset and Tune Request

Chapter D MUST contain the Reset command log if an active Reset command appears in the checkpoint history. The 7-bit COUNT field codes the total number of Reset commands (modulo 128) present in the session history.

Chapter D MUST contain the Tune Request command log if an active Tune Request command appears in the checkpoint history. The 7-bit COUNT field codes the total number of Tune Request commands (modulo 128) present in the session history.

Figure B.1.3 shows the 1-octet command log format for the Song Select command.
     0
     0 1 2 3 4 5 6 7
    +-+-+-+-+-+-+-+-+
    |S|    VALUE    |
    +-+-+-+-+-+-+-+-+

    Figure B.1.3 -- Song Select Command Log Format

Chapter D MUST contain the Song Select command log if an active Song
Select command appears in the checkpoint history. The 7-bit VALUE
field codes the song number of the most recent Song Select command in
the checkpoint history.


Appendix B.2. System Chapter V: Active Sense Command

The system journal MUST contain Chapter V if an active MIDI Active
Sense (0xFE) command appears in the checkpoint history. Figure B.2.1
shows the format for Chapter V.

     0
     0 1 2 3 4 5 6 7
    +-+-+-+-+-+-+-+-+
    |S|    COUNT    |
    +-+-+-+-+-+-+-+-+

    Figure B.2.1 -- System Chapter V Format

The 7-bit COUNT field codes the total number of Active Sense commands
(modulo 128) present in the session history.


Appendix B.3. System Chapter Q: Sequencer State Commands

The system journal MUST contain Chapter Q if an active MIDI Song
Position Pointer (0xF2), MIDI Clock (0xF8), MIDI Tick (0xF9), MIDI
Start (0xFA), MIDI Continue (0xFB) or MIDI Stop (0xFC) command appears
in the checkpoint history.

Figure B.3.1 shows the variable-length format for Chapter Q.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|N|D|C|T|Q|TOP|             CLOCK             |     TICKS     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |              ...              |             QNOTE             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |      ...      |
    +-+-+-+-+-+-+-+-+

              Figure B.3.1 -- System Chapter Q Format

Chapter Q encodes the most recent sequencer system state held in the
session history. In a temporal sense, the fields of Chapter Q reflect
system state up to (but not including) the moment encoded by the RTP
timestamp of packet I.

Chapter Q consists of a 1-octet header followed by several optional
fields, in the order shown in Figure B.3.1.
Three header bits (C, T, and Q) indicate the presence of fields
following the header. Two header bits (N and D) encode aspects of the
sequencer system state directly.

Header flag bits C, T, and Q signal the presence of the 16-bit CLOCK
field (C set to 1), the 24-bit TICKS field (T set to 1) and the 24-bit
QNOTE field (Q set to 1).

The N header bit encodes the relative occurrence of the Start,
Continue and Stop commands in the session history. If an active Start
or Continue command appears most recently, N is set to 1. If an active
Stop appears most recently, or if no active instances of these
commands appear in the session history, N is set to 0.

The D header bit encodes the presence of the downbeat. If N is set to
1, D is set to 1 if at least one Clock or Tick command follows the
most recent Start or Continue command in the session history. If this
condition does not hold, or if N is 0, then D is set to 0.

If N is set to 0 (coding a stopped sequence), or if N is set to 1 and
D is set to 0 (coding a sequence on the verge of beginning), Chapter Q
MUST encode the starting song position of the sequence. The C and T
header flags, the optional CLOCK (if C is set to 1) and TICKS (if T is
set to 1) fields, and the TOP header field, act to code the starting
song position, via the methods described below.

  o If C = 0 and T = 0, the starting song position is at the
    beginning of the song.

  o If C = 1 and T = 0, the 2-bit TOP header field and the 16-bit
    CLOCK field are combined to form the 18-bit unsigned quantity
    65536*TOP + CLOCK. This value encodes the starting song position,
    in units of clocks (24 clocks per quarter note). Use this method
    if the MIDI source uses Clock commands as timing pulses.

  o If C = 0 and T = 1, the 24-bit TICKS field codes the starting
    song position, in units of milliseconds. Use this method if the
    MIDI source uses Tick commands as timing pulses (10 ms per Tick).
    The song position MUST be encoded using sub-Tick (i.e. sub-10ms)
    resolution.

  o If C = 1 and T = 1, the starting song position is the sum of the
    positions encoded by the CLOCK, TOP and TICKS fields, as
    described above. Use this method if the MIDI stream uses Tick
    commands as timing pulses and also uses the clock-based Song
    Position Pointer commands to reposition the sequence.

If the N and D header bits are both set to 1, the sequence is playing,
and Chapter Q MUST encode the current song position in the sequence.
The current song position is coded using the same fields and methods
as the starting song position (see above). If the TICKS field is used
to code the current song position, the field value counts time up to
the moment encoded by the RTP timestamp of packet I.

Chapter Q MAY encode an estimate of the current tempo, by setting the
Q header bit to 1, and placing the estimated tempo value in the 24-bit
QNOTE field. The QNOTE field has units of microseconds per quarter
note. This memo does not define a normative algorithm for tempo
estimation for the QNOTE field. Note that Q may be set to 1 even if N
is set to 0, providing a method for coding current tempo while the
sequence is stopped.


Appendix B.4. System Chapter E: MIDI Time Code Tape Position

The system journal MUST contain Chapter E if an active MIDI System
Common Quarter Frame message (0xF1) or an active System Exclusive
Universal Real Time MIDI Time Code Full Frame command (F0 7F cc 01 01
hr mn sc fr F7) appears in the checkpoint history.

Figure B.4.1 shows the variable-length format for Chapter E.
     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|Q|C|P|D|POINT|                   COMPLETE                    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    PARTIAL                    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure B.4.1 -- System Chapter E Format

Chapter E holds information about the most recent MIDI Time Code (MTC)
tape position coded in the session history.

Chapter E consists of a 1-octet header followed by two optional
fields, in the order shown in Figure B.4.1. Two header bits (C and P)
indicate the presence of fields following the header. The other header
fields (bit flags Q and D, and the 3-bit field POINT) directly code
information about the tape position.

MTC tape position updates may occur atomically, via the Full Frame
command, or incrementally, via a series of Quarter Frame commands. The
Q bit codes whether the Quarter Frame (Q set to 1) or the Full Frame
(Q set to 0) appears most recently in the session history.

At any moment in time, the session history may hold a sequence of zero
or more complete MTC frame values. A partially complete MTC frame
value may also appear in the session history (after the most recent
complete MTC frame value, if one exists).

If the session history holds a complete MTC frame, and if the Quarter
Frame or Full Frame command that completes this frame encoding appears
in the checkpoint history, Chapter E MUST include the 24-bit COMPLETE
field to encode the frame value. The C header bit is set to 1 to
signal the presence of the COMPLETE field.

If a partially complete MTC frame value appears in the session history
(after the most recent complete MTC frame value, if one exists), and
if at least one Quarter Frame command coding this partial value
appears in the checkpoint history, Chapter E MUST include the 24-bit
PARTIAL field to encode the frame value in progress.
The P header bit is set to 1 to signal the presence of the PARTIAL
field.

The D header flag bit signals the direction the tape is moving. D is
set to 0 for forward or no movement; D is set to 1 for reverse
movement. If Q is set to 1, the relative motion of the upper nibble of
the Quarter Frame data value determines D. If Q is set to 0, the
relative tape motion from its last position determines D.

The 3-bit POINT field holds information about the incremental Quarter
Frame encoding in the session history. If Q is set to 1, POINT codes
the upper nibble of the most recent Quarter Frame data value in the
session history. If Q is set to 0, POINT is reserved for future use;
senders MUST set POINT to 0x0, and receivers MUST ignore its value.

Figure B.4.2 shows the common format for the COMPLETE and PARTIAL
fields.

     0                   1                   2
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |TYP|  HOURS  |  MINUTES  |  SECONDS  | FRAMES  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       Figure B.4.2 -- COMPLETE and PARTIAL format

The 5-bit HOURS, 6-bit MINUTES, 6-bit SECONDS, and 5-bit FRAMES fields
encode the SMPTE values encoded in Full Frame and Quarter Frame
commands. The bit allocations are sufficient to encode legal SMPTE
values; note that for some fields, the associated MIDI commands use
larger encodings. The 2-bit TYP field encodes the SMPTE frame type,
using the same encoding as the Quarter Frame and Full Frame commands.

If used in the COMPLETE field, the TYP, HOURS, MINUTES, SECONDS, and
FRAMES fields hold the most recent complete frame value, encoded by a
Full Frame command or a series of 8 Quarter Frame commands in the
session history.

If used in the PARTIAL field, the TYP, HOURS, MINUTES, SECONDS, and
FRAMES fields do not all contain valid values. Recall that the PARTIAL
field encodes a partially complete SMPTE value encoded by a series of
Quarter Frame commands in the session history.
The size and direction of the Quarter Frame command series may be
inferred from the POINT and D header values. For each bit position in
Figure B.4.2, the bit contains valid data if its associated command
appears in the session history; otherwise, the bit position holds 0.

If a COMPLETE field represents a Quarter Frame command series, its
coded value MUST include the 2-frame offset adjustment for
Quarter-Frame transmission. However, the PARTIAL field MUST NOT
include this offset.


Appendix B.5. System Chapter X: System Exclusive

MIDI System Exclusive (opcode 0xF0, abbreviation SysEx) commands may
have arbitrary length. In this Appendix, we describe System Chapter X,
whose encoding is optimized for the short SysEx commands that signal
real-time events.

Note that Chapter X is not suitable for use with the longer SysEx
commands used in bulk data transport. A MIDI session that combines
real-time and bulk-data functions SHOULD be sent over two MWPP
streams: a bulk-data stream sent over reliable transport, and a
real-time unreliable stream for shorter commands. The midiport SDP
parameter (Sections 2 and 6) supports split-stream operation.

We now describe Chapter X in detail. The system journal MUST contain
at least one Chapter X entry if an active SysEx command (excluding the
MTC Full Frame command) appears in the checkpoint history. A SysEx
command is said to "appear" in the checkpoint history if the history
contains a verbatim encoding of the SysEx command, or if the history
contains at least one segment of the segmental encoding of the SysEx
command.

Note that the structure of the system journal (Figure 9 in Section 5)
permits multiple entries for Chapter X. Each Chapter X entry codes
information about exactly one SysEx command.
The relative ordering of Chapter X entries MUST reflect the relative
position of commands in the checkpoint history: the first Chapter X
entry codes the most recent SysEx command in the history, the second
Chapter X entry codes a SysEx command that appears before the first
coded SysEx command in the history, etc.

Chapter X provides two tools for encoding multiple SysEx commands of
the same type. Each command of a certain type may be encoded in a
separate Chapter X entry (the list tool) or only the most recent
command of a certain type may be encoded (the recency tool). Each
active SysEx command that appears in the checkpoint history MUST be
associated with a Chapter X entry via the list or recency tool
(excluding the MTC Full Frame command).

For each SysEx command type, an implementation may choose either
coding tool. Simple implementations may use the list tool for all
command types; sophisticated implementations may reduce bandwidth by
using the recency tool for some command types.

Figure B.5.1 shows the variable-length format for System Chapter X.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|IDC|L|T| LEN |                   DATA ...                    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure B.5.1 -- System Chapter X Format

Chapter X consists of a 1-octet header, followed by an
arbitrary-length DATA field. The DATA field encodes a modified version
of the data octets of the SysEx command, as described below.

The leading 0xF0 and trailing 0xF7 SysEx octets never appear in the
DATA field. If the Manufacturer ID value of the SysEx command (coded
in the first data octet of the MIDI command) has the value 0x00, 0x7E,
or 0x7F, the DATA field begins with the second data octet of the SysEx
command; for all other Manufacturer ID values, the DATA field begins
with the first data octet of the SysEx command.
The 2-bit IDC header field codes the 0x00, 0x7E, and 0x7F ID values,
using the method shown in Figure B.5.2.

   +-----+--------------------------------+----------------------+
   | IDC | Manufacturer ID                | First DATA octet is: |
   +-----+--------------------------------+----------------------+
   | 0x0 | 0x7E (Universal Non-Real-Time) | 2nd SysEx data octet |
   +-----+--------------------------------+----------------------+
   | 0x1 | 0x7F (Universal Real-Time)     | 2nd SysEx data octet |
   +-----+--------------------------------+----------------------+
   | 0x2 | 0x00 (Extension Escape Code)   | 2nd SysEx data octet |
   +-----+--------------------------------+----------------------+
   | 0x3 | in the range 0x01--0x7D        | 1st SysEx data octet |
   +-----+--------------------------------+----------------------+

             Figure B.5.2 -- IDC Header Field Encoding

The 3-bit LEN header field codes the exact length of short, complete
SysEx commands, and signals alternative coding techniques for longer
commands and truncated commands.

The LEN values 0x0 through 0x5 indicate that the length of the DATA
field is 1-6 octets. For these LEN values, the DATA field encodes a
complete SysEx command, as a verbatim copy of the SysEx data octets
(possibly skipping the first octet, as detailed in Figure B.5.2).

The LEN value 0x6 indicates that the DATA field contains 7 or more
octets. The DATA field encodes a complete SysEx command, as a verbatim
copy of the data octets of the SysEx command (possibly skipping the
first octet, as detailed in Figure B.5.2), with one exception: bit 7
(the most-significant bit) of the final data octet is set to one. This
set bit implicitly codes the length of the DATA field (MIDI data
octets, by definition, clear bit 7).

The LEN value 0x7 indicates that the DATA field encodes a truncated
SysEx command.
This coding option is only to be used for SysEx commands encoded using
the segmented method, for the case where not all segments appear in
the session history.

If LEN is 0x7, the DATA field encodes the data octets of the SysEx
command segments that appear in the session history. The DATA field
holds a verbatim copy of the data octets of the coded portion of the
SysEx command, with two exceptions: the first octet may be skipped (as
detailed in Figure B.5.2) and bit 7 (the most-significant bit) of the
final coded data octet is set to one (to provide an implicit field
length, as in the case where LEN is 0x6).

The L and T header flags describe the coding tool used for the Chapter
X bitfield. If L is set to 1 (the list tool), all SysEx commands of
this type have an associated Chapter X bitfield in the system journal.
If L is set to 0 (the recency tool), only the most recent SysEx
command of this type has an associated Chapter X bitfield in the
system journal.

The T flag defines the meaning of the word "type" in the previous
paragraph. The T flag has different semantics for MIDI Universal SysEx
commands (Manufacturer ID 0x7E and 0x7F) and for generic SysEx
commands (all other Manufacturer ID values).

We first define the T flag for Universal SysEx commands. The first
four data octets of Universal commands have defined semantics in the
MIDI standard; we symbolically represent these four octets as: ID cc
SubID SubID1. If T is set to 0, all Universal commands with the same
ID, cc, SubID, and SubID1 values are considered the same type. If T is
set to 1, all Universal commands with the same ID, cc, and SubID
values are considered the same type.

For generic SysEx commands (all Manufacturer ID values except 0x7E and
0x7F), we define the T flag as follows.
The first data octet of a generic SysEx command is the Manufacturer
ID; the remaining data octets may have an arbitrary organization, but
often have a set of octets coding device and sub-command, followed by
data octets for the command. If T is set to 0, all generic SysEx
commands with the same ID value are considered to be of the same type.
If T is set to 1, the SysEx command is assumed to have a
device/sub-command/data organization, and all generic SysEx commands
with the same ID value, device, and sub-command values are considered
to be of the same type. If the SysEx command has a multi-level
sub-command structure, these semantics require identical sub-command
values at all levels.


Appendix C. Author Addresses

John Lazzaro (corresponding author)
UC Berkeley
CS Division
315 Soda Hall
Berkeley CA 94720-1776
Email: lazzaro@cs.berkeley.edu

John Wawrzynek
UC Berkeley
CS Division
631 Soda Hall
Berkeley CA 94720-1776
Email: johnw@cs.berkeley.edu


Appendix D. References

[1] MIDI Manufacturers Association. The complete MIDI 1.0 detailed
    specification, 1996. http://www.midi.org

[2] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson. RFC
    1889: RTP: A transport protocol for real-time applications, 1996.

[3] Internet Engineering Task Force. RTP Payload Format for MPEG-4
    Streams. Work in progress, draft-ietf-avt-mpeg4-multisl-02.txt.

[4] Internet Engineering Task Force. Use of "RFC-generic" for MPEG-4
    Elementary Streams with no SL layer. Work in progress,
    draft-ietf-avt-mpeg4-simple-00.txt.

[5] International Standards Organization. ISO 14496 MPEG-4, Part 3
    (Audio) Subpart 5 (Structured Audio) 1999.

[6] John Lazzaro and John Wawrzynek. A Case for Network Musical
    Performance. The 11th International Workshop on Network and
    Operating Systems Support for Digital Audio and Video (NOSSDAV
    2001) June 25-26, 2001, Port Jefferson, New York.
    http://www.cs.berkeley.edu/~lazzaro/sa/pubs/pdf/nossdav01.pdf

[7] Sfront source code release, includes a Linux networking client
    that implements the MIDI RTP packetization.
    http://www.cs.berkeley.edu/~lazzaro/sa/

[8] Dominique Fober, Yann Orlarey, Stephane Letz. Real Time Musical
    Events Streaming over Internet. Proceedings of the International
    Conference on WEB Delivering of Music 2001, pages 147-154.
    http://www.grame.fr/~fober/RTESP-Wedel.pdf

[9] M. Handley and V. Jacobson. RFC 2327: SDP: Session Description
    Protocol. 1998.


Appendix E. Expiration Notice

This document expires October 11, 2002.