Network Working Group                                       Danny Cohen
Internet Draft                                                  Myricom
expires in six months                                        Craig Lund
                                                      Mercury Computers
                                                           January 1996


          Proposed Specification for the MessageWay Protocol

                  draft-msgway-protocol-spec-01.txt
                          expires June 1996


Status of this Memo

   This document is an independent submission.  Comments should be
   submitted to the msgway@myri.com mailing list.  Distribution of this
   memo is unlimited.

   This document is an Internet-Draft.  Internet Drafts are working
   documents of the Internet Engineering Task Force (IETF), its Areas,
   and its Working Groups.  Note that other groups may also distribute
   working documents as Internet Drafts.

   Internet Drafts are draft documents valid for a maximum of six
   months, and may be updated, replaced, or obsoleted by other documents
   at any time.  It is not appropriate to use Internet Drafts as
   reference material, or to cite them other than as a "working draft"
   or "work in progress."

   To learn the current status of any Internet-Draft, please check the
   "1id-abstracts.txt" listing contained in the internet-drafts Shadow
   Directories on:

      ftp.is.co.za      (Africa)
      nic.nordu.net     (Europe)
      ds.internic.net   (US East Coast)
      ftp.isi.edu       (US West Coast)
      munnari.oz.au     (Pacific Rim)

Abstract

   MessageWay's goal is to move data from a "Source" (a node on a System
   Area Network) to a "Destination" (another node, probably on another
   System Area Network) at the high performance available on these SANs.
   Sources and Destinations can be physical things (a processor or a
   smart memory board).  They can also be "logical" things (a group of
   cooperating processes).

                              D R A F T
                             Jan-15-1995

          Proposed Specification for the MessageWay Protocol
          --------------------------------------------------

                         Danny Cohen, Myricom
                     Craig Lund, Mercury Computers


   Part-1:     MessageWay EEP Messages
   Part-2:     MessageWay RRP Messages
   Part-3:     MessageWay RRP Message Format
   Appendix-A: Enumerations
   Appendix-B: Example of the Mapping Process
   Appendix-C: Example of the use of RRP
   Appendix-D: Glossary
   Appendix-E: Acronyms and Abbreviations

Please send your comments on this draft to the MessageWay mailing list.


Part-1: MessageWay EEP messages
-------------------------------

MessageWay is an open family of specifications for internetworking high
performance System Area Networks (SANs) and high performance LANs.

Even though most modern SANs have much in common (such as high rates,
low latency, low BER, being packet networks made of point-to-point links
with flow control, and the usage of source routes), each is an island
unto itself, incapable of direct intercommunication with other SANs.
MessageWay's goal is to "internet" such SANs and high-performance LANs.

This part describes the packets of the EEP (End-to-End Protocol) part
of the MessageWay-protocol.  The packets of the RRP (Router-to-Router
Protocol) are described in Part-2.  Other MessageWay layers, such as
the MessageWay Server Layer, will be described in documents TBP (To Be
Provided later).

Some basic MessageWay terminology requires explanation.

MessageWay interconnects high performance System Area Networks (SANs).
Each SAN contains some "nodes."  At least one node in each SAN is also a
MessageWay "router."  All routers contain a hardware link to a
corresponding router in another SAN.

MessageWay's goal is to move data from a "Source" (e.g., a node on a
SAN) to a "Destination" (e.g., another node, probably on another SAN).
Sources and Destinations can be physical things (a processor or a smart
memory board).  They can also be "logical" things (a group of
cooperating processes).

Within each instance of MessageWay all nodes have unique 16-bit
MessageWay addresses.  These nodes include sources, destinations, and
routers.  A system designer can assign these "MessageWay Addresses"
manually.  Alternatively, the optional MessageWay Server Layer provides
a way to assign and discover addresses dynamically.  Throughout this
document "address" always means the 16-bit MessageWay address.

SANs also may have MessageWay addresses, aka SAN-IDs.  They are also
16-bit quantities, sharing the address space with the nodes.  These
addresses, of SANs and nodes, are unique within each instance of
MessageWay.

To optimize for performance, MessageWay has a data transfer mode that
leverages the native message routing schemes used within the SANs.
This mode uses a "Planned Transfer" paradigm.  During the planning
phase, a source collects information on optimal routes to a
destination, expressed in the various native formats of the intervening
SANs.  A source later uses this information for low latency transfers
to that destination.  In MessageWay, the transfer phase of a Planned
Transfer is called "L2-forwarding."  Appendix-C shows an example of the
planning phase.

MessageWay also optionally supports a more traditional data transfer
mode that requires no planning.  Such transfers specify the
destinations by their addresses only.  MessageWay calls this more
traditional approach "L3-forwarding."

MessageWay packets travel through SANs encapsulated inside the native
packet format of each SAN.

MessageWay packets get to their destinations by Level-2 (L2)
forwarding, Level-3 (L3) forwarding, or a combination thereof.

In L3-forwarding (similar to IP forwarding), the L2-routing through
each SAN is determined upon entering that SAN by prefixing the packet
with the L2-Routing-Header (such as a source route) corresponding to
the destination address specified in the packet.

In L2-forwarding the source prefixes the packet with all the L2-routing
headers needed along the path to the destination.


The MessageWay MESSAGE STRUCTURE
--------------------------------

(1) Optional Sequence of L2-Routing-Headers (L2RHs)

(2) EEP Header (16 bytes):                           (MH)
       Destination Address                           (DA)   2 bytes
       Priority                                      (P)    1 byte
       Version/"Magic Cookie"                        (V)    1 byte
       Packet Type                                   (PT)   2 bytes
       Packet Type Extension                         (TE)   2 bytes
       Endianness                                    (E)    4 bits
       Padding Length                                (PL)   4 bits
       Data Length                                   (DL)   3 bytes
       Source Address                                (SA)   2 bytes
       Free                                          (F)    2 bytes

(3) Optional Data (in 8-byte words)
       Data Block (including optional padding)       (DB)   0-128 Mbytes

(4) EEP Trailer (8 bytes)                            (MT)
       Error Indication                              (EI)   8 bytes

The leading L2RHs, (1), are consumed by the SANs before reaching the
destination node, which receives only the EEP header, (2), the data,
(3), and the EEP trailer, (4).

Unlike (2) and (3), which arrive exactly as sent by the source, (4) may
be modified along the way to the destination.

Each MessageWay packet is first L2-forwarded (zero or more times)
before being L3-forwarded (zero or more times).

MessageWay headers and trailers are always in Big Endian order.  The
byte order of the Data Block is undefined.

It is recommended that nodes store MessageWay headers (and data)
aligned on 8-byte boundaries.

All the elements of MessageWay (L2RHs, EEP-headers, data, and
EEP-trailers) are always multiples of 8-byte words.

MessageWay does not provide Segmentation and Reassembly (SAR).
Therefore, a packet cannot exceed the minimum MTU (Maximum Transmission
Unit) along its path.

MessageWay does not detect errors.  It only gathers error detection
information from the SANs and inter-SAN links that a packet transits.


THE FORMAT OF THE EEP HEADER
----------------------------

The MessageWay EEP header (MH):

    #0       #1       #2       #3       #4       #5       #6       #7
+--------+--------+--------+--------+--------+--------+--------+--------+
| Destination-Addr|Priority| Version|   Packet-type   | Type-extension  |MH
+---+----+--------+--------+--------+--------+--------+--------+--------+
| E | PL |  Data-Length (8B-words)  |  Source-Address |      Free       |
+---+----+--------+--------+--------+--------+--------+--------+--------+
  4   4      8        8        8        8        8        8        8   bits

The MessageWay EEP trailer (MT):

    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |
+--------+--------+--------+--------+--------+--------+--------+--------+


THE DETAILS OF THE EEP HEADER AND TRAILER
-----------------------------------------

Destination Address (DA)                               unsigned 2 bytes

   This field contains the MessageWay address of the destination.
   Addresses are unique within each instance of MessageWay.

   Nodes should have addresses assigned to them.  The method of
   assigning addresses to MessageWay nodes is not specified here.

   Examples of potentially addressable MessageWay nodes include: groups
   of cooperating processes, an entire MPP, or each of an MPP's many
   processors or processes.

   All routers must have addresses so that they can exchange control
   and configuration packets with other routers.

   "Logical Addresses" (for broadcast and for multicast groups) are
   also in this address space.  An address is a "Logical Address" if
   its MSbit is 1.

   This reserves (about) 32K:32K for node:logical addresses.  The use
   of 2 MSbits to indicate logical addresses would change this to
   48K:16K, with the ranges 0x0040-0xBFFF and 0xC000-0xFFFD.  0x
   indicates hexadecimal values (e.g., 0x0100 is 2^8=256-decimal).

   A few reserved address values have special static meanings.

   Address Ranges:

   0x0000         Illegal
   0x0001-0x003F  Illegal, reserved for L2-Routing Headers.
   0x0040-0x7FFF  Legal addresses for nodes and/or SANs
   0x8000-0xFFFD  Logical addresses (for multicast, etc.)
   0xFFFE         ("Hey-You!")  This address is used by routers at
                  power up to address adjacent routers, over a
                  point-to-point link.  ("If you receive it, it's for
                  you.")
   0xFFFF         (Broadcast)  This address is reserved for broadcast
                  operations if we decide to add broadcast to the
                  specification.  ("If you receive it, it's for you.")

Priority (P)                                            unsigned 1 byte

   All ones is the highest priority, and all zeroes the lowest.
   Ideally, packets with higher priority should gain access to
   contested resources before packets with lower priority.
   Implementations may ignore the Priority field.

Version (V)                                             unsigned 1 byte

   This field is static and in this version should be set to 0x0005.
   We suggest implementations always verify the value of this field to
   help avoid mistakenly treating random data as a MessageWay header
   and to prevent problems when newer versions share the same
   MessageWay with older ones.

Packet Type (PT)                                       unsigned 2 bytes

   The intent of the PT field is to provide all the information needed
   for demuxing in support of multiple protocol layers (such as in
   support of zero copy TCP).  Whereas traditional protocol layering
   requires several stages of sequential demuxing, MessageWay is
   expected to provide enough information to support a single combined
   demuxing.

   PT values to support popular parallel programming APIs such as MPI
   will be defined.  The Enumeration Appendix (A1) defines several
   values for this PT field.

   Types that need more than 2 bytes can also use the following 2 bytes
   of the Type Extension (TE) field and (if needed) also the 2 bytes of
   the Free (F) field.

   However, layered protocols cannot be ignored.  The PT field can also
   define data blocks as containing IP, SNMP, ATM, Ethernet, and other
   popular layered protocols.  The PT will then be used for that
   purpose as done throughout the internet (e.g., "ether-types").

   For example, here are PT values a memory board will need:

      PT      Meaning
      -----   --------------------------------------------------------
      WRITE   Treat the first few bytes of the Data Block as a local
              memory-address and write the remaining data into memory.
      READ    Treat the data block as a local memory-address and a
              byte count.  Generate a return WRITE packet containing
              the appropriate data.

   The PT field will also indicate the commands used in the MessageWay
   Router to Router configuration and control Protocol (RRP).

   We will define a special PT value that specifies that the Data Block
   contains an embedded MessageWay message, complete with another EEP
   header, and, potentially, prefixed L2-Routing-Headers.  This feature
   will allow L3-routing to an intermediate node, followed by
   L2-routing from there to the final destination.

   Special Types

      RRP - MessageWay's Router-to-Router protocol (see Part-2).

      ERR - Error reporting packet, usually sent to the Source Address
            (SA, see below) in response to a MessageWay message that
            could not be properly handled, such as "Destination
            Unknown."  The TE indicates the nature of the error (e.g.,
            UNK) as defined in the Enumeration Appendix (A4).

Type Extension (TE)                                    unsigned 2 bytes

   As described above, an extension of the preceding PT field.

Endianness (E), 4 bits

   The idea is that if the SAN interface of the receiving-node detects
   Endianness that is different than its native one, then it may kick
   in byte-swapping hardware for N-byte words, saving much work for the
   receiving node.

   The first bit (MSbit) indicates if the message is in the Big Endian
   order (b0=0) or in the Little Endian order (b0=1).

   The next 3 bits could control hardware byte swapping, if any, which
   assumes that all the data is of the same length.

      000:  don't swap, it's 8-bit data
      001:  swap as if all the data is 16-bit field
      010:  swap as if all the data is 32-bit field
      011:  swap as if all the data is 64-bit field
      100:  swap as if all the data is 128-bit field
      else: illegal and reserved for future use

Pad Length (PL)                                unsigned integer, 4 bits

   The number of padding bytes that were added at the end of the data
   block (i.e., from the end of the data to the end of the message).
   PL must be between 0 and 7.  (The MSbit of PL is reserved now.)

Data Length (DL)                                       unsigned 3 bytes

   Length of the data block (not including the EEP-header and trailer)
   in 8-byte words (marked as "8B-words"), including any optional
   padding.  Hence, the net length of the Data Block is 8*DL-PL bytes.

   The minimum is zero, and the maximum length is
   (2^24-1)*8 bytes ~ 2^27 ~ 128 Mbytes.

Source Address (SA)                     unsigned short (16 bit) integer

   This field contains the address of the packet's original source in
   the same format as DA (except Logical Addresses).  Filling in this
   field is optional.  A value of zero means SA is not specified.
   Routers may use this field to identify the sender to which error
   messages may be sent.

Free (F)                                                        2 bytes

   Left free for an application to use any way it wants.  Routers must
   not modify this field.  This F field may serve as an extension of
   the packet type (PT+TE), if needed.

Data Block (DB), 8*DL-PL bytes long

   Left free for an application to use any way it wants.  Routers must
   not modify this field.

   If the MessageWay header begins on an 8-byte boundary, the Data
   Block will also begin on an 8-byte boundary.  Optional padding at
   the end of the Data Block should similarly align the trailing EI
   field.  It is recommended that nodes store MessageWay headers (and
   data) aligned on 8-byte boundaries.

The above fields are in the EEP-header.
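
The header layout above can be sketched in a few lines of Python.  This
is only an illustration under stated assumptions, not part of the
specification: the function names are hypothetical, the Version byte is
taken to be 0x05 (the one-byte form of the value given above), and E is
assumed to occupy the high nibble of byte #0 of the second word, as the
header diagram suggests.

```python
import struct

EEP_HEADER_LEN = 16          # two 8-byte words
ASSUMED_VERSION = 0x05       # assumption: one-byte form of the value in the text


def pack_eep_header(da, pt, te, data_len, *, priority=0, endian=0,
                    sa=0, free=0, version=ASSUMED_VERSION):
    """Build the 16-byte EEP header for a Data Block of data_len net bytes.

    Returns (header_bytes, pad_len); the caller appends pad_len padding
    bytes after the data so the Data Block fills whole 8-byte words.
    """
    pad_len = -data_len % 8                  # PL: 0..7 padding bytes
    data_words = (data_len + pad_len) // 8   # DL: 8-byte words, padding included
    if data_words >= 1 << 24:
        raise ValueError("Data Block exceeds the 3-byte DL field")
    if not 0 <= endian <= 0xF:
        raise ValueError("E is a 4-bit field")
    # First word: DA, P, V, PT, TE, all Big Endian.
    word0 = struct.pack(">HBBHH", da, priority, version, pt, te)
    # Second word: E and PL share one byte, then DL (3 bytes), SA, F.
    e_pl = (endian << 4) | pad_len
    word1 = (bytes([e_pl]) + data_words.to_bytes(3, "big")
             + struct.pack(">HH", sa, free))
    return word0 + word1, pad_len


def unpack_eep_header(header):
    """Decode a 16-byte EEP header into a dict of its fields."""
    if len(header) != EEP_HEADER_LEN:
        raise ValueError("EEP header must be 16 bytes")
    da, priority, version, pt, te = struct.unpack_from(">HBBHH", header, 0)
    e_pl = header[8]
    data_words = int.from_bytes(header[9:12], "big")
    sa, free = struct.unpack_from(">HH", header, 12)
    pad_len = e_pl & 0x0F
    return {
        "da": da, "priority": priority, "version": version,
        "pt": pt, "te": te,
        "endian": e_pl >> 4, "pad_len": pad_len, "data_words": data_words,
        "net_data_len": 8 * data_words - pad_len,   # 8*DL-PL bytes
        "sa": sa, "free": free,
    }
```

For instance, a 13-byte payload needs PL=3 and DL=2, and decoding the
resulting header gives back a net data length of 8*2-3 = 13 bytes.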
The following is in the trailer:

Error Indication (EI)                                  unsigned 8 bytes

   We allow routers to pass packets toward their destinations before
   detecting transmission errors (wormhole routing).  The EI field
   provides such routers with a means to append an error indication to
   the end of a packet.

   An all zero EI value means no error indicated.  Any non zero EI
   value indicates one or more errors.  A router never shifts an error
   indication off the left end of the field and never clears the EI
   field.

   The packet source will usually initialize the EI field to all zeros.
   However, as an alternative example, a memory board may create a
   packet with a non zero EI field (EI=1) that indicates a parity error
   was detected by the memory board.

   Each router does an arithmetic left shift, on the EI field, by one
   bit.  Routers that detect transmission errors also OR a one into the
   LSbit (after the shift).  This provides a record of which routers
   indicated errors.  For this scheme the memory board would set EI=1
   to indicate detected data errors.


MessageWay L2-ROUTING OPTION
----------------------------

A MessageWay source may specify native routes, by placing the native
routes before the MessageWay Header.  The native routes must appear
within a sequence of MessageWay L2-Routing-Headers (L2RH).

The contents of the L2RH are totally SAN dependent, with the exception
of the first 2 bytes which are the Length field of the L2RH (L)
indicating the number of routing bytes of that L2RH (not including
these 2 bytes).  L should be smaller than 64.  Hence, its first byte is
always 0, but the next byte is not.  The total number of bytes in the
L2RH is L+2, packed in [(L+9)/8] 8-byte words (where [x] is the integer
part of x).

It is up to each SAN to provide padding, if needed, to fill the L2RH
words.

Each L2RH is defined by the entity that will process it.  In addition
to routing information per se, it may also include demuxing information
such as a local message-type.
For example, over Myrinet it should end with 0x0300 which is the
Myrinet-type assigned to MessageWay.

Since values smaller than 64 (0x0040) are illegal for Destination
Address (DA), the L-field also distinguishes between packets that start
with MessageWay packet headers (that require L3-forwarding) and packets
that start with L2RHs (that require L2-forwarding).

The L2 header must contain enough information to allow a router to
quickly create any necessary local routing headers and trailers.
MessageWay implementations that support L2-forwarding must document
their unique L2 header requirements.

When a MessageWay message is encapsulated inside any native SAN message
(Paragon and Myrinet, for example), it's up to that SAN to distinguish
between it and other native packets.  This is not a MessageWay issue.
For example, Myrinet uses its Message-Type to recognize MessageWay
messages.

MessageWay-Routers on the boundaries between SANs are asked to forward
packets with either L2 or L3 routings.  The former start with an L2RH
(with its first 16 bits having a value smaller than 64), whereas the
latter start with MessageWay-addresses (larger than 63).

Example of an L2RH with an SR with 11 bytes.

    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
|    0   |  L=11  |  SR01  |  SR02  |  SR03  |  SR04  |  SR05  |  SR06  | L2RH
+--------+--------+--------+--------+--------+--------+--------+--------+
|  SR07  |  SR08  |  SR09  |  SR10  |  SR11  |   xxx  |   xxx  |   xxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+


Part-2: MessageWay RRP messages
-------------------------------

MessageWay is an open family of specifications for internetworking
System Area Networks (SANs).

This part describes the RRP (Router-to-Router Protocol) part of the
MessageWay-protocol.  It is built on top of the MessageWay-EEP
described in Part-1.  The packets of the RRP are listed, defined, and
discussed in Part-3.

We introduce some new terminology within this document.  A MessageWay
Router will always bridge two SANs.  The Router consists of three
parts: the "Half Router" (HR) attached to the first SAN, the HR
attached to the second SAN, and their interconnection.

MessageWay does not define the nature of this interconnection.
However, we believe the PCI Local Bus de facto standard will become a
very popular link.

This document specifies a series of options that allow system designers
to deploy MessageWay routers of varying levels of intelligence.

Each router is considered as a set of interconnected Half-Routers
(HRs), each being a full fledged address-bearing node on some SAN.

There are several implementation levels of MessageWay, for nodes and
for routers.  System designers may choose the level of implementation
to best suit their needs.  The higher the implementation level, the
more interoperability and adaptability result.

Node implementation levels:

   Level-A: Built-in L2 source routes
   Level-B: Built-in L3 addresses (dynamic update of first HR)
   Level-C: Requesting and receiving dynamic routing information

Level-A nodes send messages by using L2-forwarding, by specifying SRs
(in L2RHs) that are hard coded into them, without the ability to
dynamically acquire or modify them.

Level-B nodes also have the ability to send messages by using
L3-forwarding, by specifying addresses that are hard coded into them,
without the ability to dynamically acquire them.  These nodes can ask
HRs for the best first HR for any destination node (specified by its
address) and for the SR to destination nodes.  In addition they can
also handle Redirect messages, telling them which HR to use for given
nodes.

Level-C nodes can also locate other nodes by asking HRs to provide the
attributes of nodes specified by addresses, names, and/or capabilities.
They also respond to such queries by reporting their own attributes.

Router implementation levels:

   Level-A: Forwarding according to L2 source routes
   Level-B: Handling L3 addresses, and dynamic first HR (redirect, etc)
   Level-C: Supporting dynamic routing by nodes
   Level-D: Dynamic inter-SAN routing, mapping, and discovery

HRs can support nodes of the same (or lower) implementation level.

We design for the highest implementation levels, but expect that some
instances of MessageWay will use lower implementation levels.

Level-A routers support only L2-Forwarding, and do not support the
planning phase of Planned Transfers.  Therefore, systems which use
Level-A routers must have the necessary native routes hard coded (e.g.,
burned into a PROM somewhere).

Level-B routers also support L3-Forwarding, and advise nodes about the
first HR to use for each destination.

Level-C routers add the planning phase of Planned Transfers (by
supporting requests for routes).

Level-D routers exchange routing information with each other, allowing
dynamic discovery and adaptation.

........................................................................

The basic model of MessageWay is a set of SANs (System Area Networks),
each with its own conventions and protocols.  The interconnection
between SANs is via MessageWay-routers.

A router between the SAN-A and the SAN-B is composed of two
interconnected processes, each a fully fledged node on a SAN.  These
processes are known as HRs ("Half-Routers") or "SAN-interfaces."

These HRs may be implemented by two separate "boxes" with an inter-SAN
communication link between them, or inside a single "multi-homed" box
that is interfaced to both SANs.

RRP defines (via message structure and behavior) the interactions
between HRs.  RRP does not define the lower level protocols that
deliver its messages (over links, or between processes in multi-homed
routers).

In particular, RRP does not define the inter-SAN interconnection links
between the HRs that are left for mutual agreements among the
implementors.
These links are expected to range from serial fibers to PCI buses.  A
PPP-like protocol may be defined later for these links.

It is assumed that each HR has a Routing Table (RT) for its own SAN
(aka Local Routing Table, LRT), with (at least) the addresses of all
the nodes, and the source routes to each of them from the HR.  This
information could be dynamic or static, even manually configured.  The
HRs may (or may not) perform dynamic mapping of their SANs.

In L2 operation under levels C and D, when a source node, NS, needs to
send a message to a destination node, ND, it asks first any of the HRs
on its [NS's] SAN for a source route (SR) from NS to ND.  That HR would
(1) provide such an SR, or (2) reply with a "Redirect" message,
suggesting to ask another HR on the same SAN, or (3) report no knowledge
of ND (using the UNK error message).

NS may ask more than one HR for the SR to the same ND and choose to use
the best of these SRs.

In L3 operation, when a source node, NS, needs to send a message to a
destination node, ND, it sends that message to any of the HRs on its
SAN, using L2, asking for L3-forwarding to ND, using ND's
MessageWay-address.  That HR would (1) forward the message toward ND,
or (2) forward the message toward ND and return a "Redirect" message,
suggesting to use, in the future, another HR on the same SAN for that
destination, or (3) report no knowledge of ND (using the UNK error
message).

Under levels C and D, nodes may be located by MessageWay-addresses,
names, or capabilities.


NODE ATTRIBUTES
---------------

Each node has: Address, Name, Capabilities, Logical-Addresses

   Address:       2 bytes, flat, unique in this MessageWay

   Name:          flat, globally unique (e.g., IP address), arbitrary
                  length

   Capabilities:  regular GP node, router, MessageWay-server, NFS,
                  paging server, M/C server, DSP, printer, ....
                  Some capabilities may need additional parameters
                  (e.g., SAN-ID for routers, and resolution+colors for
                  printers).  The capabilities are defined in the
                  Enumeration Appendix (A5).

   Logical-Addresses:  a set of (logical) addresses to which this node
                  requests to listen.  Logical addresses designate
                  multicast and broadcast groups.

The control and management of the Logical-Addresses (e.g., JOIN and
LEAVE, a la IGMP) is not defined in this document.


ROUTING TABLES (RTs)
--------------------

Routing tables provide the information needed for finding SRs to
destinations specified by their addresses.  In Level-D they also
provide means to identify nodes by names and/or capabilities.

The RTs are based on "maps" for SANs prepared by local nodes on the
SANs.  The inter-SAN routing process depends on the exchange of these
maps to form local and remote RTs.

The attributes of an RT are:

   CSR       Common Source Route for the entire RT
   MinMTU    Min MTU for this RT (along the above CSR)
   RCVF      List of Received-From addresses or SAN-IDs (history)
   SN        Serial Number of this RT (by RCVF)
   SAN-ID    ID of the SAN which this RT describes
   Local-RT  Node-Structures, for nodes on that SAN

The Local-RT has one or more Node-Structures for each node on this SAN.
These Node-Structures are of the form:

   Address            On this SAN
   Logical-Addresses  To which it listens
   [Name]             Optional
   [Capabilities]     Optional
   SR                 From the mapper to that node

Each SR entry (and the CSR, too) contains Q, the quality of the SR, an
unsigned 16-bit integer.  The units are not defined here.  It is
assumed that Q is monotonic (sort of analogous to latency, hence
additive) with all-0 being the best and all-1 the worst.

The CSR has a MinMTU which is the minimal MTU along the entire CSR.

The RCVF is the list of the addresses along which this RT was
forwarded.  Its entries are either HR-addresses or SAN-IDs.
The RCVF could have been derived from the CSR, if only the HRs could
parse the CSR and associate HR-addresses with SRs and SAN-IDs with
HR-addresses, which should not be assumed.

Different RTs for the same SAN may be kept.  Each RT/RCVF pair has its
own SN.

The Node-Structure has SRs from the mapper to each node.  The CSR is an
SR to the same mapper.  Hence, by catenating the CSR to the beginning
of the SR in the Node-Structure, an SR is derived all the way to that
node.

Each SAN has a unique SAN-ID, known to the HRs on it.  The SAN-IDs
share the MessageWay-address space with the nodes.  Hence, a SAN-ID is
also a 16-bit entity, between 0x0040 and 0x7FFD.


INTER-SAN MAPPING (Level-D)
---------------------------

"Buddy-HRs" are HRs that are nodes on the same SAN.  The halves of the
same routers are called "twin-HRs" or "neighbors."

HRs use the following procedure to exchange RTs, from which they derive
routes to any node in the MessageWay.

Each HR interchanges RTs only with its buddy-HRs (over their common
SAN), and with its "twin-HR."

Typically, RTs arriving from the twin are forwarded to all the buddies,
and RTs arriving from any buddy are forwarded to the twin.

This document does NOT define how HRs map their own SANs and how they
find their buddies.  It is the job of each SAN to provide this
information.

The procedure starts with each HR getting an RT for its own SAN.
Thereafter, whenever a new RT (including the local-RT) is received from
anywhere (or changes) it is processed by:

   if (I am already in this RCVF) {ignore this RT} ;
   if (new SN(RT,RCVF) <= old SN(RT,RCVF)) {ignore this RT} ;

   Keep this RT with:
         CSR         = ( (me->Sender-of-RT), CSR )
         Q(CSR)      = Q(CSR) + q(me->Sender-of-RT)
         RCVF        = (me, RCVF)
         SN(RT,RCVF) = SN(RT,RCVF)++
         MinMTU      = Min(my-MTU, MinMTU)
         SAN-ID      = (as received in RT)
         Local-RT    = (as received in RT)

   if (The RT received from my twin-HR) {Send RT to all my buddy-HRs}
   if (The RT received from a buddy-HR) {Send RT only to my twin-HR}

COMMENTS:

The MTU of the new RT is the minimum of its previous MTU and that of
the current SAN.

HRs must keep at least the best XRT to any given SAN.  However, any HR
may keep any number of best-XRTs (there is no need for these numbers to
be the same among all the HRs).

The SNs belong to the RCVF.  When a new RT arrives, it gets an SN which
is greater by 1 than that of the previous RT that has the same RCVF.
Hence, an RT with a higher SN should always replace an RT with a lower
SN and with the same RCVF.

An HR finds an SR (L2-route) from itself to the foreign node-D, by
finding an RT that has D in it, catenating the CSR of that RT in front
of the SR for D, as found in the RT.  The HR also should add their Qs
to get the total Q.  The best SR is found by repeating this action and
looking for the route with the least Q.

For L3-forwarding, the HR finds an RT that has D in it, and L2-forwards
the packet to the first address on the RCVF of that RT.

When an HR receives an HR-DOWN (or a LINK-DOWN) error message, it finds
all the RTs that it has with that downed-HR (or link) in their RCVF
fields, and deletes them.  It then forwards this error message to those
nodes from which these RTs were received, and to those nodes to which
these RTs were sent.
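
The route selection and the HR-DOWN cleanup described above can be
sketched as follows.  This is a minimal illustration, not the
specified algorithm in full: the RoutingTable class and the function
names are hypothetical, source routes are modeled as plain byte
strings that can simply be concatenated, and only the RT attributes
needed for these two steps are kept.

```python
from dataclasses import dataclass


@dataclass
class RoutingTable:
    """Hypothetical in-memory form of one kept RT (subset of its attributes)."""
    san_id: int
    csr: bytes      # Common Source Route, from this HR to the SAN's mapper
    csr_q: int      # Q of the CSR
    rcvf: tuple     # addresses (or SAN-IDs) this RT was forwarded along
    nodes: dict     # node address -> (SR from the mapper, Q of that SR)


def best_l2_route(rts, dest):
    """Return (source_route, total_q) with the least Q, or None if unknown."""
    best = None
    for rt in rts:
        entry = rt.nodes.get(dest)
        if entry is None:
            continue                     # this RT does not list the node
        sr, q = entry
        # Catenate the CSR in front of the node's SR; Q is additive.
        candidate = (rt.csr + sr, rt.csr_q + q)
        if best is None or candidate[1] < best[1]:
            best = candidate
    return best


def drop_rts_through(rts, down_addr):
    """Discard every RT whose RCVF contains the downed HR (or link end)."""
    return [rt for rt in rts if down_addr not in rt.rcvf]
```

With two RTs describing the same SAN, the lookup returns the
concatenated route whose summed Q is smallest; after an HR-DOWN report,
the RTs that were forwarded through the downed HR are no longer
considered.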


Part-3: MessageWay RRP Message Format
-------------------------------------

The RRP messages are:

RRP MESSAGE SUBTYPES
--------------------

   RRP-          Impl'n
   Subtype       Levels  Description
   --------      ------  -----------------------------------------------
   [GVL2]        BCD     Please give me L2-routes to node (address)
   [L2SR]        BCD     Here are L2-routes to node (address)
   [RDRC]        BCD     Redirect to node (address) via a neighbor
                         HR (address)
   [TELL]        CD      Please tell me about node (address, name,
                         capabilities)
   [INFO]        CD      Info about node (address, name, capabilities)

   [HRTO]        BCD     Which HR should I use for node (address)

   [WRU?]        CD      Who/what-Are-You?

   [GVRT]        D       Please give me your RTs
   [RTBL]        D       Here is an RT

RRP also uses the following error messages:

   [ERR/UNK]     BCD     Destination Unknown (address)
   [ERR/HRDOWN]  BCD     HR Down
   [ERR/LKDOWN]  BCD     Link Down
   [ERR/GENERAL] ABCD    General error message

All these messages may be sent from nodes or HRs, to nodes or HRs.

The reply to [GVL2] is [L2SR], [RDRC], or [ERR/UNK].
The reply to [TELL] is [INFO], or [ERR/UNK].
The reply to [HRTO] is [RDRC], or [ERR/UNK].
The reply to [WRU?] is [INFO].

[TELL] identifies a node by an address and/or a name and/or
capabilities.  [INFO] provides the address, name, and capabilities of
that node.

The exact format of these messages is defined in this part.

RRP messages are sequences of records, each made of one or more
8B-words.  The RRP records are:

   RTyp   Description
   ----   ----------------------------------
   ADDR   Address
   NAME   Name
   CAPA   Capability
   LADR   Logical Addresses
   SRQR   Source Route and its Quality (SR,Q)
   MTUR   MTU (for the previous SRQR)
   RCVF   RCVF
   RTHD   RT-Header


THE STRUCTURE OF THE RRP MESSAGES
---------------------------------

The MessageWay Header (MH) designates the MessageWay-header (two
8B-words).
|| All the following messages include both a MessageWay-header (MH) and the || error-indication (EI) field, as their MessageWay trailer (MT). The MH is || repeatedly shown, but EI field is often not shown in these examples. || [GVL2] (BCD) Please give me L2-routes to node (address) MH (with [PT/TE]=[RRP/GVL2]) || ADDR (address of the node for which SR is requested) [L2SR] (BCD) Here are L2-routes to node (address) MH (with [PT/TE]=[RRP/L2SR]) || ADDR (address of the node for which SR is provided) SRQR (SR with Q) MTUR (MTU for the above SR) This message may have several (SRQR,MTUR)s, one for each SR. [RDRC] (BCD) Redirect to node (address) via a neighbor HR (address) MH (with [PT/TE]=[RRP/RDRC]) || ADDR (address of the destination node for which Redirect is issued) ADDR (address of the HR to be used for that destination node) [TELL] (CD) Please tell me about node (address | name | capabilities) MH (with [PT/TE]=[RRP/TELL]) || ADDR (address of the node for which more information is requested) or MH (with [PT/TE]=[RRP/TELL]) || NAME (name of the node for which more information is requested) or MH (with [PT/TE]=[RRP/TELL]) || CAPA (capabilities for which nodes are requested) This message may have several CAPA's, one for each capability. [INFO] (CD) Info about node (address, name, capabilities) MH (with [PT/TE]=[RRP/INFO]) || ADDR (address of the node for which more information is requested) NAME (name of the node for which more information is requested) CAPA (capabilities for which nodes are requested) LADR (Logical-Addresses for the requested node) This message may have several CAPA's, one for each capability. || For nodes without NAME or LADR, these records are omitted. || MsgWay-WG <19> RRP-Formats [HRTO] (BCD) Which HR should I use for node (address) MH (with [PT/TE]=[RRP/HRTO]) || ADDR (address of the node for which initial HR is requested) [WRU?] (CD) Who/what-Are-You? MH (with [PT/TE]=[RRP/WRU?] 
        and [DA]=0xFFFE)

[GVRT] (D) Please give me your RTs

    MH   (with [PT/TE]=[RRP/GVRT])

[RTBL] (D) Here is an RT

    MH   (with [PT/TE]=[RRP/RTBL])
    RTHD (RT-header, SAN-ID, SN)
    SRQR (CSR and its Q)
    MTUR (MTU)
    RCVF (RCVF)
    for each node:
        ADDR (address)
        NAME (name, optional)
        CAPA (capabilities, optional)
        LADR (logical-addresses, optional)
        SRQR (SRQR to that node)

[ERR/UNK] (BCD) Destination Unknown (address)

    MH   (with [PT/TE]=[ERROR/UNK])
    ADDR (address of the Destination node for which no SR is available)

[ERR/HRDOWN] (BCD) HR Down

    MH   (with [PT/TE]=[ERROR/HRDOWN])
    ADDR (address of the Destination HR that is down)

[ERR/LINKDOWN] (BCD) Link Down

    MH   (with [PT/TE]=[ERROR/LINKDOWN])
    ADDR (address of one end of the link that is down)

[ERR/GENERAL] (ABCD) General error

    MH   (with [PT/TE]=[ERROR/GENERAL])
    XX   (The entire message that caused that error)


RRP RECORD FORMAT
-----------------

All RRP-records start with an 8-byte header as shown below. Its first
byte identifies the record type (RTyp). The second byte is the
Pad-Count byte (PL), which gives the number of padding bytes. The
third and fourth bytes (RL) are the length of the record in 8-byte
words, including the record header; hence RL is always greater than
zero. The rest of the header bytes depend on the record type.

+--------+--------+--------+--------+--------+--------+--------+--------+
|  RTyp  |   PL   |       RL        |        |        |        |        |
+--------+--------+--------+--------+--------+--------+--------+--------+

Some records that have an arbitrary length are "right justified" and
have PL padding bytes before the data. These are marked Padding Before
Data [PBD].

Some records that have an arbitrary length are "left justified" and
have PL padding bytes after the data. These are marked Padding After
Data [PAD].

In either case the total number of data bytes is: (8*RL-PL-4).

Following are the RRP-records. These records are the building blocks
used to construct RRP-messages.

In the following, xxx indicates bytes that are discarded.
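The header rules above (RTyp, PL, RL, and the 8*RL-PL-4 data length)
can be sketched as a small parser. This is an illustration, not part of
the specification: the function and class names are hypothetical, and
the big-endian encoding of RL is an assumption, since byte order is not
stated in this part. The sketch applies to simple records; it does not
handle the case where an enclosing record's RL also covers the records
that follow it.

```python
import struct
from typing import NamedTuple

HEADER_LEN = 4  # RTyp (1 byte) + PL (1 byte) + RL (2 bytes)


class RecordHeader(NamedTuple):
    rtyp: int   # record type code
    pl: int     # number of padding bytes
    rl: int     # record length in 8-byte words, header included

    @property
    def data_len(self) -> int:
        # Total number of data bytes, as given in the text: 8*RL-PL-4
        return 8 * self.rl - self.pl - HEADER_LEN


def parse_header(record: bytes) -> RecordHeader:
    """Decode the first four bytes of a record (RL assumed big-endian)."""
    if len(record) < HEADER_LEN:
        raise ValueError("record shorter than its header")
    rtyp, pl, rl = struct.unpack_from(">BBH", record)
    if rl == 0:
        raise ValueError("RL must be greater than zero")
    header = RecordHeader(rtyp, pl, rl)
    if header.data_len < 0:
        raise ValueError("PL is too large for this RL")
    return header


def record_data(record: bytes, padding_before: bool) -> bytes:
    """Return the data bytes of a [PBD] or [PAD] record."""
    header = parse_header(record)
    body = record[HEADER_LEN:8 * header.rl]
    if len(body) != 8 * header.rl - HEADER_LEN:
        raise ValueError("record is truncated")
    if padding_before:                      # [PBD]: padding, then data
        return body[header.pl:]
    return body[:len(body) - header.pl]     # [PAD]: data, then padding
```

For a 9-character NAME record (PL=3, RL=2) the data length is
8*2-3-4 = 9 bytes, and for an ADDR record (PL=2, RL=1) it is 2 bytes.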
===> [ADDR] Node-Address Record [PBD]

     0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
| "ADDR" |  PL=2  |      RL=1       |  xxx   |  xxx   | MsgWay-Address  |Addr
+--------+--------+--------+--------+--------+--------+--------+--------+

If the ADDR record is followed by other records that describe the same
node (such as NAME, CAPA, LADR, SRQR, and MTUR) then the RL of the ADDR
record also covers all these records.

===> [NAME] Node-Name Record [PAD]
            (e.g., a name with 9 characters: A1..A9):

     0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
| "NAME" |  PL=3  |   0    |  RL=2  |   A1   |   A2   |   A3   |   A4   |Name
+--------+--------+--------+--------+--------+--------+--------+--------+
|   A5   |   A6   |   A7   |   A8   |   A9   |  xxx   |  xxx   |  xxx   |
+--------+--------+--------+--------+--------+--------+--------+--------+

===> [CAPA] Node-Capability Record [PAD]
            (e.g., with 9 parameter bytes):

     0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
| "CAPA" |  PL=2  |      RL=2       | CC=Cx  |   P1   |   P2   |   P3   |cap
+--------+--------+--------+--------+--------+--------+--------+--------+
|   P4   |   P5   |   P6   |   P7   |   P8   |   P9   |  xxx   |  xxx   |
+--------+--------+--------+--------+--------+--------+--------+--------+

Byte#4 is the Capability Code, CC, followed by as many parameter bytes
as needed. The capability codes are listed in the Enumeration Appendix
(A5). The number of bytes used by the parameters is 8*RL-PL-5.

===> [LADR] Logical-Addresses Record [PAD]
            (e.g., 3 logical addresses):

     0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
| "LADR" |  PL=6  |      RL=2       |Logical-Address-1|Logical-Address-2|LogAdr
+--------+--------+--------+--------+--------+--------+--------+--------+
|Logical-Address-3|  xxx   |  xxx   |  xxx   |  xxx   |  xxx   |  xxx   |
+--------+--------+--------+--------+--------+--------+--------+--------+

===> [SRQR] Source-Route Record [PBD], with Q for that route.
            (e.g., a combined SR with 14 bytes and an SR with 6 bytes)

This record carries one or more L2RHs (2 in the following example).

     0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
| "SRQR" |  PL=2  |      RL=5       |  xxx   |  xxx   |        Q        |SR+Q
+--------+--------+--------+--------+--------+--------+--------+--------+
| SR-Len=14(bytes)|  SR01  |  SR02  |  SR03  |  SR04  |  SR05  |  SR06  |L2RH#1
+--------+--------+--------+--------+--------+--------+--------+--------+
|  SR07  |  SR08  |  SR09  |  SR10  |  SR11  |  SR12  |  SR13  |  SR14  |
+--------+--------+--------+--------+--------+--------+--------+--------+
| SR-Len=6 (bytes)|  SR01  |  SR02  |  SR03  |  SR04  |  SR05  |  SR06  |L2RH#2
+--------+--------+--------+--------+--------+--------+--------+--------+
| "MTUR" |  PL=1  |      RL=1       |  xxx   |    MTU (in 8B-words)     |MTU
+--------+--------+--------+--------+--------+--------+--------+--------+

Q (the Route Quality) is an unsigned 16-bit integer. The units are not
defined here. It is assumed to be monotonic, with all-0 being the best
and all-1 the worst.

An MTUR (MTU) for that SR is optional. If there is one (as in the
example above), it is included inside the record, so the RL of the
SRQR also counts the RL of the MTU record (MTUR).

===> [MTUR] MTU record [PBD]:

     0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
| "MTUR" |  PL=1  |      RL=1       |  xxx   |    MTU (in 8B-words)     |MTU
+--------+--------+--------+--------+--------+--------+--------+--------+

The MTU record provides the MTU for the SR defined before it (by an
SRQR). The value of 0 means indefinite MTU (i.e., any length is OK).
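As an illustration of the MTUR layout above, the following sketch
encodes and decodes the 8-byte record and combines two MTUs while
honoring the "0 means indefinite" rule. The function names are
hypothetical. The record type code 6 is the MTUR code listed in
Appendix-A (A3); big-endian byte order and a zero-valued padding byte
are assumptions, since neither is stated here.

```python
from typing import Optional

MTUR_CODE = 6        # RTyp code for MTUR, from Appendix-A (A3)
MTU_INDEFINITE = 0   # "any length is OK"


def encode_mtur(mtu_words: int) -> bytes:
    """Build an MTUR record: RTyp, PL=1, RL=1, one pad byte, 3-byte MTU."""
    if not 0 <= mtu_words < (1 << 24):
        raise ValueError("MTU must fit in 3 bytes")
    return (bytes([MTUR_CODE, 1])          # RTyp, PL=1
            + (1).to_bytes(2, "big")       # RL=1 (assumed big-endian)
            + b"\x00"                      # the single padding byte (xxx)
            + mtu_words.to_bytes(3, "big"))


def decode_mtur(record: bytes) -> int:
    """Return the MTU (in 8B-words) carried by an MTUR record."""
    if len(record) != 8 or record[0] != MTUR_CODE:
        raise ValueError("not an MTUR record")
    if record[1] != 1 or record[2:4] != b"\x00\x01":
        raise ValueError("unexpected PL or RL for MTUR")
    return int.from_bytes(record[5:8], "big")


def min_mtu(a: int, b: int) -> int:
    """Minimum of two MTUs, where 0 stands for an indefinite MTU."""
    if a == MTU_INDEFINITE:
        return b
    if b == MTU_INDEFINITE:
        return a
    return min(a, b)


def mtu_in_bytes(mtu_words: int) -> Optional[int]:
    """Convert 8B-words to bytes; None when the MTU is indefinite."""
    return None if mtu_words == MTU_INDEFINITE else 8 * mtu_words
```

With these helpers an MTU of 1024 words corresponds to 8192 bytes, and
combining an indefinite MTU with a finite one yields the finite one.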
RRP-Formats <22> MsgWay-WG ===> [RCVF] RCVF Record [PAD] (e.g., 5 addresses): 0 1 2 3 4 5 6 7 +--------+--------+--------+--------+--------+--------+--------+--------+ | "RCVF" | PL=2 | RL=2 | Address-1 | Address-2 |RCVF +--------+--------+--------+--------+--------+--------+--------+--------+ | Address-3 | Address-4 | Address-5 | xxx | xxx | +--------+--------+--------+--------+--------+--------+--------+--------+ The RCVF may track, along its path, either the addresses of the routers or the SAN-IDs. ===> [RTHD] RT-Header record: 0 1 2 3 4 5 6 7 +--------+--------+--------+--------+--------+--------+--------+--------+ | "RTHD" | PL=0 | RL=? | SAN-ID | Serial-Number |RT-hd +--------+--------+--------+--------+--------+--------+--------+--------+ This is followed by: SRQR (CSR and its Q) MTUR (MTU) RCVF (RCVF) for each node: ADDR (address) NAME (name, optional) CAPA (capabilities, optional) LADR (logical-addresses, optional) SRQR (SR to that node, and its Q) MsgWay-WG <23> RRP-Formats RRP MESSAGE EXAMPLES -------------------- ==> [GVL2] Please give me L2-routes to node-D (address) || 0 1 2 3 4 5 6 7 +--------+--------+--------+--------+--------+--------+--------+--------+ | HR-Address |Priority| Ver=5 | "R R P" | "GVL2" |MH || +--------+--------+--------+--------+--------+--------+--------+--------+ | PL=0 | Length=1 (in 8B-words) | S-Address | Free=0 | +--------+--------+--------+--------+--------+--------+--------+--------+ | "ADDR" | PL=2 | RL=1 | xxx | xxx | D-Address |Addr +--------+--------+--------+--------+--------+--------+--------+--------+ | 64 zero bits, unless any error was indicated along the path |MT +--------+--------+--------+--------+--------+--------+--------+--------+ ==> [L2SR] Here are 2 L2-routes to node-X (address) with Qs and MTUs || (e.g., an SR of 2 L2RHs (of 5+4 bytes), and an SR an L2RH of 3 bytes) || 0 1 2 3 4 5 6 7 +--------+--------+--------+--------+--------+--------+--------+--------+ | D-Address |Priority| Ver=5 | "R R P" | "L2SR" 
|MH || +--------+--------+--------+--------+--------+--------+--------+--------+ | PL=0 | Length=8 (in 8B-words)| HR-Address | Free=0 | +--------+--------+--------+--------+--------+--------+--------+--------+ | "ADDR" | PL=2 | RL=8 | xxx | xxx | X-Address |Addr +--------+--------+--------+--------+--------+--------+--------+--------+ | "SRQR" | PL=2 | RL=4 | xxx | xxx | Q |SR+Q +--------+--------+--------+--------+--------+--------+--------+--------+ | SR-Len=5 (bytes)| SR01 | SR02 | SR03 | SR04 | SR05 | xxx |L2RH +--------+--------+--------+--------+--------+--------+--------+--------+ | SR-Len=4 (bytes)| SR01 | SR02 | SR03 | SR04 | xxx | xxx |L2RH +--------+--------+--------+--------+--------+--------+--------+--------+ | "MTUR" | PL=1 | RL=1 | xxx | MTU (in 8B-words) |MTU +--------+--------+--------+--------+--------+--------+--------+--------+ | "SRQR" | PL=2 | RL=3 | xxx | xxx | Q |SR+Q +--------+--------+--------+--------+--------+--------+--------+--------+ | SR-Len=3 (bytes)| SR01 | SR02 | SR03 | xxx | xxx | xxx |L2RH +--------+--------+--------+--------+--------+--------+--------+--------+ | "MTUR" | PL=1 | RL=1 | xxx | MTU (in 8B-words) |MTU +--------+--------+--------+--------+--------+--------+--------+--------+ | 64 zero bits, unless any error was indicated along the path |MT +--------+--------+--------+--------+--------+--------+--------+--------+ ==> [RDRC] Redirect to node-X via another directly connected HR-H || 0 1 2 3 4 5 6 7 +--------+--------+--------+--------+--------+--------+--------+--------+ | D-Address |Priority| Ver=5 | "R R P" | "RDRC" |MH || +--------+--------+--------+--------+--------+--------+--------+--------+ | PL=0 | Length=2 (in 8B-words) | HR-Address | Free=0 | +--------+--------+--------+--------+--------+--------+--------+--------+ | "ADDR" | PL=2 | RL=1 | xxx | xxx | X-Address |Destin +--------+--------+--------+--------+--------+--------+--------+--------+ | "ADDR" | PL=2 | RL=1 | xxx | xxx | H-Address |via-HR 
+--------+--------+--------+--------+--------+--------+--------+--------+ | 64 zero bits, unless any error was indicated along the path |MT +--------+--------+--------+--------+--------+--------+--------+--------+ RRP-Formats <24> MsgWay-WG ==> [TELL] Please tell me about Node-X (address | name | capabilities) || This message may have either of the following 3 forms: If by MessageWay-address: 0 1 2 3 4 5 6 7 +--------+--------+--------+--------+--------+--------+--------+--------+ | HR-Address |Priority| Ver=5 | "R R P" | "TELL" |MH || +--------+--------+--------+--------+--------+--------+--------+--------+ | PL=0 | Length=1 (in 8B-words) | S-Address | Free=0 | +--------+--------+--------+--------+--------+--------+--------+--------+ | "ADDR" | PL=2 | RL=1 | xxx | xxx | X-Address |Addr +--------+--------+--------+--------+--------+--------+--------+--------+ | 64 zero bits, unless any error was indicated along the path |MT +--------+--------+--------+--------+--------+--------+--------+--------+ If by name (e.g., a name with 9 characters: A1...A9): || 0 1 2 3 4 5 6 7 +--------+--------+--------+--------+--------+--------+--------+--------+ | HR-Address |Priority| Ver=5 | "R R P" | "TELL" |MH || +--------+--------+--------+--------+--------+--------+--------+--------+ | PL=0 | Length=2 (in 8B-words) | S-Address | Free=0 | +--------+--------+--------+--------+--------+--------+--------+--------+ | "NAME" | PL=3 | RL=2 | A1 | A2 | A3 | A4 |Name +--------+--------+--------+--------+--------+--------+--------+--------+ | A5 | A6 | A7 | A8 | A9 | xxx | xxx | xxx | +--------+--------+--------+--------+--------+--------+--------+--------+ | 64 zero bits, unless any error was indicated along the path |MT +--------+--------+--------+--------+--------+--------+--------+--------+ If by capabilities (e.g., 2 capabilities, C1 with 2 parameter bytes, || and C2 with no parameter bytes): || 0 1 2 3 4 5 6 7 +--------+--------+--------+--------+--------+--------+--------+--------+ | 
HR-Address |Priority| Ver=5 | "R R P" | "TELL" |MH || +--------+--------+--------+--------+--------+--------+--------+--------+ | PL=0 | Length=2 (in 8B-words) | S-Address | Free=0 | +--------+--------+--------+--------+--------+--------+--------+--------+ | "CAPA" | PL=1 | RL=1 | CC=C1 | P1 | P2 | xxx |cap +--------+--------+--------+--------+--------+--------+--------+--------+ | "CAPA" | PL=3 | RL=1 | CC=C2 | xxx | xxx | xxx |cap +--------+--------+--------+--------+--------+--------+--------+--------+ | 64 zero bits, unless any error was indicated along the path |MT +--------+--------+--------+--------+--------+--------+--------+--------+ MsgWay-WG <25> RRP-Formats ==> [INFO] Info about Node-X (address, name, capabilities) e.g., a name || with 9 characters (A1...A9) and 3 capabilities (Cx, Cy, and Cz): || 0 1 2 3 4 5 6 7 +--------+--------+--------+--------+--------+--------+--------+--------+ | D-Address |Priority| Ver=5 | "R R P" | "INFO" |MH || +--------+--------+--------+--------+--------+--------+--------+--------+ | PL=0 | Length=7 (in 8B-words) | S-Address | Free=0 | +--------+--------+--------+--------+--------+--------+--------+--------+ | "ADDR" | PL=2 | RL=7 | xxx | xxx | X-Address | * +--------+--------+--------+--------+--------+--------+--------+--------+ * | "NAME" | PL=3 | RL=2 | A1 | A2 | A3 | A4 | * +--------+--------+--------+--------+--------+--------+--------+--------+ * | A5 | A6 | A7 | A8 | A9 | xxx | xxx | xxx | * +--------+--------+--------+--------+--------+--------+--------+--------+ * | "CAPA" | PL=1 | RL=1 | CC=Cx | P1 | P2 | xxx | * +--------+--------+--------+--------+--------+--------+--------+--------+ * | "CAPA" | PL=3 | RL=1 | CC=Cy | xxx | xxx | xxx | * +--------+--------+--------+--------+--------+--------+--------+--------+ * | "CAPA" | PL=5 | RL=2 | CC=Cz | P1 | P2 | P3 | * +--------+--------+--------+--------+--------+--------+--------+--------+ * | P4 | P5 | P6 | xxx | xxx | xxx | xxx | xxx | * 
+--------+--------+--------+--------+--------+--------+--------+--------+ | 64 zero bits, unless any error was indicated along the path |MT +--------+--------+--------+--------+--------+--------+--------+--------+ The INFO records aggregate all the nodes that meet any of the attributed || specified in the TELL record. When such aggregation is used the Length || in the MH is the sum of the RLs in all the ADDR fields. || (*) The ADDR, NAME, and CAPA records are repeated for each applicable node. || Same also for LADR, SRQR, and MTUR, if any. || If several capabilities are specified in [TELL], any node that has any of || these capabilities should be reported in [INFO]. || RRP-Formats <26> MsgWay-WG ==> [HRTO] * Which HR should I use for sending to node-D (address) || 0 1 2 3 4 5 6 7 +--------+--------+--------+--------+--------+--------+--------+--------+ | HR-Address |Priority| Ver=5 | "R R P" | "HRTO" |MH || +--------+--------+--------+--------+--------+--------+--------+--------+ | PL=0 | Length=1 (in 8B-words) | S-Address | Free=0 | +--------+--------+--------+--------+--------+--------+--------+--------+ | "ADDR" | PL=2 | RL=1 | xxx | xxx | D-Address |Destin +--------+--------+--------+--------+--------+--------+--------+--------+ | 64 zero bits, unless any error was indicated along the path |MT +--------+--------+--------+--------+--------+--------+--------+--------+ ==> [WRU?] Who/what-Are-You? || 0 1 2 3 4 5 6 7 +--------+--------+--------+--------+--------+--------+--------+--------+ | 0xFFFE(Hey-You) |Priority| Ver=5 | "R R P" | "WRU?" 
|MH || +--------+--------+--------+--------+--------+--------+--------+--------+ | PL=0 | Length=0 (in 8B-words) | Source-Address | Free=0 | +--------+--------+--------+--------+--------+--------+--------+--------+ | 64 zero bits, unless any error was indicated along the path |MT +--------+--------+--------+--------+--------+--------+--------+--------+ ==> [GVRT] Please give me your RTs || 0 1 2 3 4 5 6 7 +--------+--------+--------+--------+--------+--------+--------+--------+ | D-Address |Priority| Ver=5 | "R R P" | "GVRT" |MH || +--------+--------+--------+--------+--------+--------+--------+--------+ | PL=0 | Length=0 (in 8B-words) | Source-Address | Free=0 | +--------+--------+--------+--------+--------+--------+--------+--------+ | 64 zero bits, unless any error was indicated along the path |MT +--------+--------+--------+--------+--------+--------+--------+--------+ MsgWay-WG <27> RRP-Formats ==> [RTBL] Here is an RT || 0 1 2 3 4 5 6 7 +--------+--------+--------+--------+--------+--------+--------+--------+ | D-Address |Priority| Ver=5 | "R R P" | "RTBL" |MH || +--------+--------+--------+--------+--------+--------+--------+--------+ | PL=0 | Length=? (in 8B-words) | HRS-Address | Free=0 | +--------+--------+--------+--------+--------+--------+--------+--------+ | "RTHD" | PL=0 | RL=? 
| SAN-ID | Serial-Number |RThdr +--------+--------+--------+--------+--------+--------+--------+--------+ | "SRQR" | PL=2 | RL=5 | xxx | xxx | Q |CSR+Q +--------+--------+--------+--------+--------+--------+--------+--------+ | SR-Len=5 (bytes)| SR01 | SR02 | SR03 | SR04 | SR05 | xxx |L2RH +--------+--------+--------+--------+--------+--------+--------+--------+ | SR-Len=4 (bytes)| SR01 | SR02 | SR03 | SR04 | xxx | xxx |L2RH +--------+--------+--------+--------+--------+--------+--------+--------+ | "MTUR" | PL=1 | RL=1 | xxx | MTU (in 8B-words) |MTU +--------+--------+--------+--------+--------+--------+--------+--------+ | "RCVF" | PL=2 | RL=2 | xxx | xxx | Address-5 |RCVF +--------+--------+--------+--------+--------+--------+--------+--------+ | Address-4 | Address-3 | Address-2 | Address-1 | +--------+--------+--------+--------+--------+--------+--------+--------+ | "ADDR" | PL=2 | RL=10 | xxx | xxx | H-Address | * R +--------+--------+--------+--------+--------+--------+--------+--------+ * e | "NAME" | PL=3 | RL=2 | A1 | A2 | A3 | A4 | * p +--------+--------+--------+--------+--------+--------+--------+--------+ * e | A5 | A6 | A7 | A8 | A9 | xxx | xxx | xxx | * a +--------+--------+--------+--------+--------+--------+--------+--------+ * t | "CAPA" | PL=2 | RL=1 | CC=C1 | P1 | XXX | xxx | * e +--------+--------+--------+--------+--------+--------+--------+--------+ * d | "LADR" | PL=6 | RL=2 |Logical-Address-1|Logical-Address-2| * +--------+--------+--------+--------+--------+--------+--------+--------+ * 4 |Logical-Address-3| xxx | xxx | xxx | xxx | xxx | xxx | * +--------+--------+--------+--------+--------+--------+--------+--------+ * e | "SRQR" | PL=2 | RL=4 | xxx | xxx | Q | * a +--------+--------+--------+--------+--------+--------+--------+--------+ * c | SR-len=8 (bytes)| SR01 | SR02 | SR03 | SR04 | SR05 | SR06 | * h +--------+--------+--------+--------+--------+--------+--------+--------+ * | SR07 | SR08 | xxx | xxx | xxx | xxx | xxx | xxx | * n 
+--------+--------+--------+--------+--------+--------+--------+--------+ * o | SR-Len=5 (bytes)| SR01 | SR02 | SR03 | SR04 | SR05 | xxx | * d +--------+--------+--------+--------+--------+--------+--------+--------+ * e | 64 zero bits, unless any error was indicated along the path |MT +--------+--------+--------+--------+--------+--------+--------+--------+ ADDR, NAME, CAPA, LADR, and SRQR are repeated for each node in the RT. RRP-Formats <28> MsgWay-WG ==> [ERR/UNK] Destination Unknown (address) || 0 1 2 3 4 5 6 7 +--------+--------+--------+--------+--------+--------+--------+--------+ | D-Address |Priority| Ver=5 | "E R R" | UNK |MH +--------+--------+--------+--------+--------+--------+--------+--------+ | PL=0 | Length=1 (in 8B-words) | HRS-Address | Free=0 | +--------+--------+--------+--------+--------+--------+--------+--------+ | "ADDR" | PL=2 | RL=1 | xxx | xxx | UNK-Address |Addr +--------+--------+--------+--------+--------+--------+--------+--------+ | 64 zero bits, unless any error was indicated along the path |MT +--------+--------+--------+--------+--------+--------+--------+--------+ This message reports that that host is unknown. ==> [ERR/HRDOWN] HR Down (address) || 0 1 2 3 4 5 6 7 +--------+--------+--------+--------+--------+--------+--------+--------+ | D-Address |Priority| Ver=5 | "E R R" | HRDOWN |MH +--------+--------+--------+--------+--------+--------+--------+--------+ | PL=0 | Length=1 (in 8B-words) | S-Address | Free=0 | +--------+--------+--------+--------+--------+--------+--------+--------+ | "ADDR" | PL=2 | RL=1 | xxx | xxx | HR-Address |Addr +--------+--------+--------+--------+--------+--------+--------+--------+ | 64 zero bits, unless any error was indicated along the path |MT +--------+--------+--------+--------+--------+--------+--------+--------+ This message reports that the HR with that address is down. 
==> [ERR/LINKDOWN] Link Down (2 addresses)

     0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
|    D-Address    |Priority| Ver=5  |     "E R R"     |    LINKDOWN     |MH
+--------+--------+--------+--------+--------+--------+--------+--------+
|  PL=0  |  Length=2 (in 8B-words)  |    S-Address    |     Free=0      |
+--------+--------+--------+--------+--------+--------+--------+--------+
| "ADDR" |  PL=2  |      RL=1       |  xxx   |  xxx   |      Addr1      |
+--------+--------+--------+--------+--------+--------+--------+--------+
| "ADDR" |  PL=2  |      RL=1       |  xxx   |  xxx   |      Addr2      |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |MT
+--------+--------+--------+--------+--------+--------+--------+--------+

This message reports that the link between Addr1 and Addr2 is down.

==> [ERR/GENERAL] General error

     0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
|    D-Address    |Priority| Ver=5  |     "E R R"     |     GENERAL     |MH
+--------+--------+--------+--------+--------+--------+--------+--------+
|  PL=0  |  Length=? (in 8B-words)  |    S-Address    |     Free=0      |
+--------+--------+--------+--------+--------+--------+--------+--------+
|                                                                       |Data
|<------The entire message that could not be handled by the sender----->|Data
|                                                                       |Data
+--------+--------+--------+--------+--------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |MT
+--------+--------+--------+--------+--------+--------+--------+--------+

This message reports that the enclosed message could not be handled by
its receiver (the sender of this error message).


Appendix-A: Enumerations
------------------------

(A1) MessageWay Packet Types

The EEP header reserves 6 bytes for signaling from the source node
directly to the destination node. They are the PACKET TYPE (PT),
TYPE EXTENSION (TE), and FREE (F), 2 bytes each.
This list defines values for the PACKET-TYPE (PT) field. Each
message-type has its own interpretation of the TE and the F fields.

   Code         Packet Type
   -----------  --------------------------
   1            RRP
   2            Embedded MessageWay packet
   3            Memory-Read
   4            Memory-Write
                Higher level protocols:
   5            IP
   6            SNMP
   7            ATM
   8            Ethernet
   9            VME
   1,024-2,047  User defined
   65,535       Error

Values should be assigned. "Ether-types" should be added with a
pointer to those used by the Internet.

(A2) RRP Messages

   RRP-
   Subtype  Code  Description
   -------  ----  ----------------------------------------------------
   GVL2     1     Please give me L2-routes to node (address)
   L2SR     2     Here are L2-routes to node (address)
   RDRC     3     Redirect to node (address) via a neighbor HR (address)
   TELL     4     Please tell about node (address | name | capabilities)
   INFO     5     Info about node (address, name, capabilities)
   HRTO     6     Which HR should I use for node (address)
   WRU?     7     Who/what-Are-You?
   GVRT     8     Please give me your RTs
   RTBL     9     Here is an RT

Throughout this document the RRP messages are indicated by their
Subtype (e.g., RDRC for Redirect). In actual messages the code is used
(e.g., 3 for RDRC).

(A3) RRP records

   RTyp     Code  Description
   -------  ----  ----------------------------------------------------
   ADDR     1     Node Address record
   NAME     2     Node Name record
   CAPA     3     Node Capability record
   LADR     4     Node Logical Addresses record
   SRQR     5     Source Route and its Quality (SR, Q)
   MTUR     6     MTU record (for the previous SRQR)
   RCVF     7     RCVF record
   RTHD     8     RT-header record

Throughout this document the records are indicated by their RTyp (e.g.,
ADDR for address). In actual messages the code is used (e.g., 1 for
ADDR).
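The code tables (A1) to (A3) map directly onto enumerations. The sketch
below is one possible rendering; the class and member names are
hypothetical (for example, WRU stands for "WRU?" because "?" is not
valid in an identifier), and the numeric values are those listed above,
which the draft itself marks as still to be assigned.

```python
from enum import IntEnum


class PacketType(IntEnum):
    """Values of the PACKET-TYPE (PT) field, table (A1)."""
    RRP = 1
    EMBEDDED_MSGWAY = 2
    MEMORY_READ = 3
    MEMORY_WRITE = 4
    IP = 5
    SNMP = 6
    ATM = 7
    ETHERNET = 8
    VME = 9
    ERROR = 65535


class RRPSubtype(IntEnum):
    """RRP message subtype codes, table (A2)."""
    GVL2 = 1
    L2SR = 2
    RDRC = 3
    TELL = 4
    INFO = 5
    HRTO = 6
    WRU = 7    # written "WRU?" in the text
    GVRT = 8
    RTBL = 9


class RecordType(IntEnum):
    """RRP record type codes (RTyp), table (A3)."""
    ADDR = 1
    NAME = 2
    CAPA = 3
    LADR = 4
    SRQR = 5
    MTUR = 6
    RCVF = 7
    RTHD = 8


def is_user_defined(packet_type: int) -> bool:
    """True for the user-defined PT range 1,024-2,047."""
    return 1024 <= packet_type <= 2047
```

A receiver could then turn a code found in a message back into its
mnemonic, for example RRPSubtype(3).name for a Redirect.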
|| (A4) Error Messages Sybtype Code Description --------- ---- ---------------------------------------------- UNK 1 Unknown (address) HRDOWN 2 Down LINKDOWN 3 Down GENERAL 4 General error message (A5) MessageWay Node Capabilities Code Capability Parameters ---- ------------------------ -------------------------------------- 1 GP Computing Node 2 Router SAN-IDs, 2Bytes each 3 MessageWay Server 4 Network Multicast Server 5 NFS 6 NPS (Paging Server) 7 Floating -point DSP IEEE word-sizes (in bytes), 1B each 8 Fixed-point DSP word-sizes (in bytes), 1B each 9 Printer 255 SAN .......................... MsgWay-WG <33> Appendix-B Appendix-B: Example of the Mapping Process ------------------------------------------ First, the notation. Rab is the A-half of the router between the SANs A and B. It has a MessageWay-address on SAN-A. It is interconnected via some link with Rba, its twin, the HR that is on SAN-B, having a MessageWay-address on SAN-B. Hence the router between SAN-A and SAN-B is: Rab+Rba+interconnection. If there are several routers between A and B, numbers are appended to their designations (e.g., Rab2). The following map is used throughout the examples below: #21 #22 Rab Rba +-----------+ (ab) +-------------+ |H0,H1 [A] ********* [B] H2,H3| +-----*-----* +-----*---*---+ Rac* Rad* Rbd1* *Rbd2 #23* #25* #27* *#29 * * * * (ac)* (ad)* (bd1)* *(bd2) * * * * #24* #26* #28* *#30 Rca* Rda* Rdb1* *Rdb2 +-----*-----+ *-----*---*---+ +----------+ |H4,H5 [C] ********* [D] H6,H7********** [E] H8,H9| +-----------+ (cd) +-------------+ (de) +----------+ Rcd Rdc Rde Red #31 #32 #33 #34 Rab is the a-interface for a router between SAN-A and SAN-B. Each interface has its own address, marked with #nn. (ab) is a shorthand notation for Rab+Rba, interconnected via an inter-SAN connection, marked here by ***. Hn (for n=1,2,3,...) is node #n, its address is also #n. The Local-RT for SAN-A, as computed at Rax is denoted by LRT(A/ax). 
Initially, Rab has an LRT(A/ab) consisting of:

     H0    #00  node    (SR(A:Rab->H0),  Q)
     H1    #01  node    (SR(A:Rab->H1),  Q)
     Rac   #23  router  (SR(A:Rab->Rac), Q)
     Rad   #25  router  (SR(A:Rab->Rad), Q)

Rac has an LRT(A/ac) consisting of:

     H0    #00  node    (SR(A:Rac->H0),  Q)
     H1    #01  node    (SR(A:Rac->H1),  Q)
     Rab   #21  router  (SR(A:Rac->Rab), Q)
     Rad   #25  router  (SR(A:Rac->Rad), Q)

Rad has an LRT(A/ad) consisting of:

     H0    #00  node    (SR(A:Rad->H0),  Q)
     H1    #01  node    (SR(A:Rad->H1),  Q)
     Rab   #21  router  (SR(A:Rad->Rab), Q)
     Rac   #23  router  (SR(A:Rad->Rac), Q)

Rba has an LRT(B/ba) consisting of:

     H2    #02  node    (SR(B:Rba->H2),   Q)
     H3    #03  node    (SR(B:Rba->H3),   Q)
     Rbd1  #27  router  (SR(B:Rba->Rbd1), Q)
     Rbd2  #29  router  (SR(B:Rba->Rbd2), Q)

Other HRs have similar LRTs.

Later, each HR exchanges its RTs with its "other-half" over its
interconnection (marked "***"). In the following examples the SN and
the Q entries are not shown.

Rab now has:
     LRT(A/ab)
     XRT{ CSR=(*:Rab->Rba), LRT(B/ba), RCVF={Rba} }

Rba has:
     LRT(B/ba)
     XRT{ CSR=(*:Rba->Rab), LRT(A/ab), RCVF={Rab} }

Similarly Rac has:
     LRT(A/ac)
     XRT{ CSR=(*:Rac->Rca), LRT(C/ca), RCVF={Rca} }

Rca has:
     LRT(C/ca)
     XRT{ CSR=(*:Rca->Rac), LRT(A/ac), RCVF={Rac} }

Rda has:
     LRT(D/da)
     XRT{ CSR=(*:Rda->Rad), LRT(A/ad), RCVF={Rad} }

Because Rab, Rac, and Rad are directly connected (they are all on
SAN-A), they exchange their XRTs.

Now Rab has:
     LRT(A/ab)
     XRT{ CSR=(*:Rab->Rba),            LRT(B/ba), RCVF={Rba} }
     XRT{ CSR=(A:Rab->Rac,*:Rac->Rca), LRT(C/ca), RCVF={Rac,Rca} }
     XRT{ CSR=(A:Rab->Rad,*:Rad->Rda), LRT(D/da), RCVF={Rad,Rda} }

Rba has:
     LRT(B/ba)
     XRT{ CSR=(*:Rba->Rab),               LRT(A/ab),  RCVF={Rab} }
     XRT{ CSR=(B:Rba->Rbd1,*:Rbd1->Rdb1), LRT(D/db1), RCVF={Rbd1,Rdb1} }
     XRT{ CSR=(B:Rba->Rbd2,*:Rbd2->Rdb2), LRT(D/db2), RCVF={Rbd2,Rdb2} }

Next, Rba will share these with Rab.
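The sharing step between twin HRs can be sketched as follows, using the
notation of this appendix. This is an illustration only: the XRT class
and the receive_from() function are hypothetical, SN and Q are left out
(as in the listings here), and the RCVF bookkeeping follows the way
RCVFs are written in this appendix, where the address of the HR that
passed the RT on is placed at the front.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class XRT:
    """One table entry in the notation of this appendix."""
    csr: Tuple[str, ...]    # e.g. ("*:Rab->Rba",); empty for a local RT
    lrt: str                # e.g. "LRT(B/ba)"
    rcvf: Tuple[str, ...]   # e.g. ("Rba",); empty for a local RT


def receive_from(me: str, sender: str, link: str,
                 offered: List[XRT]) -> List[XRT]:
    """Return the XRTs that HR `me` keeps out of those `sender` offers.

    `link` is "*" for the twin interconnection, or the SAN letter for a
    buddy-HR reached over that SAN.
    """
    kept = []
    for xrt in offered:
        if me in xrt.rcvf:
            continue                      # already passed through me: ignore
        kept.append(XRT(
            csr=(f"{link}:{me}->{sender}",) + xrt.csr,   # prepend my hop
            lrt=xrt.lrt,
            rcvf=(sender,) + xrt.rcvf,
        ))
    return kept
```

Applied to the tables held by Rba above, with me="Rab", sender="Rba",
and link="*", the entry for LRT(A/ab) is dropped because Rab is already
in its RCVF, and the two entries for SAN-D gain the leading hop
*:Rab->Rba and the leading RCVF element Rba.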
Now Rab has: LRT(A/ab) XRT{CSR=(*:Rab->Rba), LRT(B/ba), RCVF={Rba} } XRT{CSR=(A:Rab->Rac,*:Rac->Rca), LRT(C/ca), RCVF={Rac,Rca} } XRT{CSR=(A:Rab->Rad,*:Rad->Rda), LRT(D/da), RCVF={Rad,Rda} } XRT{CSR=(*:Rab->Rba,B:Rba->Rbd1,*:Rbd1->Rdb1), LRT(D/db1),RCVF={Rba,Rbd1,Rdb1}} XRT{CSR=(*:Rab->Rba,B:Rba->Rbd2,*:Rbd2->Rdb2), LRT(D/db2),RCVF={Rba,Rbd2,Rdb2}} Now Rab has 3 different ways to get to nodes on D: directly through Rad, or via Rab to B, and then via either Rbd1 or Rbd2 into D. The Qs could provide preference. Rab may decide to keep all of them, or only the best (or the k best ones). Soon, Rab will also have yet another way to get to D: XRT{ CSR=(A:Rab->Rac,*:Rac->Rca,C:Rca->Rcd,*:Rcd->Rdc), LRT(D/dc), RCVF={Rac,Rca,Rcd,Rdc} } Let / represent a path between 2 buddy-HRs on the same SAN (e.g., ax/ay over the SAN a). Let \ represent a path between twin HRs (e.g., ab\ba). However, with time Rab will have the following paths to SAN-E: ab/ad\da/de \ed (ADE) ab\ba/bd1\db1/de\ed (AB1DE) ab\ba/bd2\db2/de\ed (AB2DE) ab/ac\ca/cd \dc /de\ed (ACDE) and also: ab/ad\da/db1\bd1/bd2\db2/de \ed (AD1B2DE) ab/ad\da/db2\bd2/bd1\db1/de \ed (AD2B1DE) ab/ac\ca/cd \dc /db1\bd1/bd2\db2/de\ed (ACD1B2DE) ab/ac\ca/cd \dc /db2\bd2/bd1\db1/de\ed (ACD2B1DE) ab\ba/bd1\db1/dc \cd /ca \ac /ad\da/de\ed (AB1DCADE) ab\ba/bd2\db2/dc \cd /ca \ac /ad\da/de\ed (AB2DCADE) ab\ba/bd1\db1/da \ad /ac \ca /cd\dc/de\ed (AB1DACDE) ab\ba/bd2\db2/da \ad /ac \ca /cd\dc/de\ed (AB2DACDE) Appendix-B <36> MsgWay-WG The first path is the most reasonable, the next 3 are also reasonable, but the last 8 are not reasonable. None of these paths has the same HR twice, but the latter 8 have the same SAN (D, here) twice. The latter 8 (or 11) should be ignored because of high Q values. All the last eight have D twice in their RCVFs. || ........................................................................ 
+-----------+ (ab) +-----------+ |H0,H1 [A] ********* [B] H2,H3| +-----*-----* +---*---*---+ * * (bd1)* *(bd2) (ac)* (ad)* * * * * * * +-----*-----+ *---*---*---+ +----------+ |H4,H5 [C] ********* [D] H6,H7********* [E] H8,H9| +-----------+ (cd) +-----------+ (de) +----------+ .............................................................................. MsgWay-WG <37> Appendix-C Appendix-C: Example of the use of RRP (using Myrinets) ------------------------------------------------------ In this example Node1 on SAN1 (with MTU=16KB) is looking for a DSP that can accommodate IEEE floating-point 64-bit data. (1) It uses L3-forwarding to ask its default router (RouterA) to provide the list of floating-point DSPs that can handle 64bit IEEE data. (2) RouterA provides the addresses of both Node2 and Node3. For its own reasons Node1 decided to use Node2. (3) Node1 asks RouterA which router to use for Node2. (4) RouterA suggests to use RouterB. (5) Node1 uses L3-forwarding to verify Node2's capabilities, by asking Node2 for information about itself. (6) Node2 provides this information which Node1 likes. (7) Node1 asks RouterB for L2RH(s) for Node2. (8) RouterB provides the requested L2RH(s) with their MTU of 8KB. Finally, (9) Node1 starts sending data to Node2 using L2-forwarding. If Node1 had only Level-A implementation then it would have the combined L2RH from itself to RouterB and from there to Node2 pre-wired, saving all this message exchange. 

+-------+          +--0--+   SAN1   +--0--+          +--0--+
| Node1 +----------3 SW0 1----------3 SW1 1----------3 SW2 1    MTU=16KB
+-------+          +--2--+          +--2--+          +--2--+
                      |                                 |
          RTRA1  +----+----+       +---+---+       +----+----+  RTRB1
                 | RouterA |       | Node2 |       | RouterB |
          RTRA3  +----+----+       +---+---+       +----+----+  RTRB2
                      |                |                |
+-------+   SAN3   +--0--+          +--0--+   SAN2   +--0--+
| Node3 +----------3 SW3 1          3 SW4 1----------3 SW5 1    MTU=8KB
+-------+          +--2--+          +--2--+          +--2--+

The sequence of messages is as follows (their MTs are not shown here):

(1) Node1 uses L2-forwarding to send a [TELL] message asking its default
    router (RouterA) to provide a list of floating-point DSPs that can
    handle 64-bit IEEE data.  Node1 knows that RouterA is on its network,
    with SR={2,MW}={2,3,0}, where MW=0x0300 is the 16-bit Myrinet-type
    assigned to MessageWay.

     0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
| <---- Here is the L2-header needed to get from Node1 to RouterA ----> |
| It may be any number of bytes.  In this example it is 3 bytes: {2,MW} |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      RTRA1      |Priority| Ver=5  |     "R R P"     |     "TELL"      |MH
+--------+--------+--------+--------+--------+--------+--------+--------+
|E=0|PL=0|  Length=1 (in 8B-words)  |      Node1      |     Free=0      |
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "CAPA"      |  PL=2  |  RL=1  |  CC=7  |   8    |  xxx   |  xxx   |64-DSP
+--------+--------+--------+--------+--------+--------+--------+--------+

    This asks for information about nodes with capability #7 (IEEE
    Floating-point DSP) with 64-bit data.

(2) RouterA uses [INFO] to provide the addresses and capabilities of both
    Node2 and Node3 (the former only 64 bits, the latter both 32 and 64).

     0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
| <---- Here is the L2-header needed to get from RouterA to Node1 ----> |
| It may be any number of bytes.  In this example it is 3 bytes: {3,MW} |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      Node1      |Priority| Ver=5  |     "R R P"     |     "INFO"      |MH
+--------+--------+--------+--------+--------+--------+--------+--------+
|E=0|PL=0|  Length=4 (in 8B-words)  |      RTRA1      |     Free=0      |
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "ADDR"      |  PL=2  |  RL=2  |  xxx   |  xxx   |      Node2      |adr2
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "CAPA"      |  PL=2  |  RL=1  |  CC=7  |   8    |  xxx   |  xxx   |FP-DSP
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "ADDR"      |  PL=2  |  RL=2  |  xxx   |  xxx   |      Node3      |adr3
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "CAPA"      |  PL=1  |  RL=1  |  CC=7  |   4    |   8    |  xxx   |FP-DSP
+--------+--------+--------+--------+--------+--------+--------+--------+

    For its own reasons Node1 decides to use Node2.

(3) Node1 uses [HRTO] to ask RouterA which HR to use for Node2.

     0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
| <---- Here is the L2-header needed to get from Node1 to RouterA ----> |
| It may be any number of bytes.  In this example it is 3 bytes: {2,MW} |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      RTRA1      |Priority| Ver=5  |     "R R P"     |     "HRTO"      |MH
+--------+--------+--------+--------+--------+--------+--------+--------+
|E=0|PL=0|  Length=1 (in 8B-words)  |      Node1      |     Free=0      |
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "ADDR"      |  PL=2  |  RL=1  |  xxx   |  xxx   |      Node2      |Destin
+--------+--------+--------+--------+--------+--------+--------+--------+

(4) RouterA uses [RDRC] to redirect to Node2 via RouterB.

     0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
| <---- Here is the L2-header needed to get from RouterA to Node1 ----> |
| It may be any number of bytes.  In this example it is 3 bytes: {3,MW} |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      Node1      |Priority| Ver=5  |     "R R P"     |     "RDRC"      |MH
+--------+--------+--------+--------+--------+--------+--------+--------+
|E=0|PL=0|  Length=2 (in 8B-words)  |      RTRA1      |     Free=0      |
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "ADDR"      |  PL=2  |  RL=1  |  xxx   |  xxx   |      Node2      |Destin
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "ADDR"      |  PL=2  |  RL=1  |  xxx   |  xxx   |      RTRB1      |via-HR
+--------+--------+--------+--------+--------+--------+--------+--------+

    Node1 knows how to get to RouterB over its SAN.

(5) Node1 uses [TELL] (still using L3-forwarding via RouterB) to verify
    Node2's capabilities, by asking Node2 for information about itself.

     0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
| <---- Here is the L2-header needed to get from Node1 to RouterB ----> |
| It may be any number of bytes.  Here it is 5 bytes: {1,1,2,MW}        |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      Node2      |Priority| Ver=5  |     "R R P"     |     "TELL"      |MH
+--------+--------+--------+--------+--------+--------+--------+--------+
|E=0|PL=0|  Length=1 (in 8B-words)  |      Node1      |     Free=0      |
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "ADDR"      |  PL=2  |  RL=1  |  xxx   |  xxx   |      Node2      |Addr
+--------+--------+--------+--------+--------+--------+--------+--------+

(6) Node2 uses [INFO] (also using L3-forwarding via RouterB) to provide
    more information to Node1 about Node2 than RouterA did.

     0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
| <---- Here is the L2-header needed to get from Node2 to RouterB ----> |
| It may be any number of bytes.  Here it is 4 bytes: {1,0,MW}          |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      Node1      |Priority| Ver=5  |     "R R P"     |     "INFO"      |MH
+--------+--------+--------+--------+--------+--------+--------+--------+
|E=0|PL=0|  Length=5 (in 8B-words)  |      Node2      |     Free=0      |
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "ADDR"      |  PL=2  |  RL=5  |  xxx   |  xxx   |      Node2      |
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "NAME"      |  PL=7  |  RL=2  |  "S"   |  "u"   |  "p"   |  "e"   |
+--------+--------+--------+--------+--------+--------+--------+--------+
|  "r"   |  xxx   |  xxx   |  xxx   |  xxx   |  xxx   |  xxx   |  xxx   |
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "CAPA"      |  PL=1  |  RL=1  |  CC=7  |   4    |   8    |  xxx   |FP-DSP
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "CAPA"      |  PL=3  |  RL=1  |  CC=6  |  xxx   |  xxx   |  xxx   |NFS
+--------+--------+--------+--------+--------+--------+--------+--------+

    Node2 provided more information about itself than RouterA did, such
    as its name ("Super"), its ability to also handle 32-bit IEEE
    floating point, and its being an NFS.

(7) Node1 uses [GVL2] to ask RouterB for L2RH(s) for Node2.

     0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
| <---- Here is the L2-header needed to get from Node1 to RouterB ----> |
| It may be any number of bytes.  Here it is 5 bytes: {1,1,2,MW}        |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      RTRB1      |Priority| Ver=5  |     "R R P"     |     "GVL2"      |MH
+--------+--------+--------+--------+--------+--------+--------+--------+
|E=0|PL=0|  Length=1 (in 8B-words)  |      Node1      |     Free=0      |
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "ADDR"      |  PL=2  |  RL=1  |  xxx   |  xxx   |      Node2      |Addr
+--------+--------+--------+--------+--------+--------+--------+--------+

(8) RouterB uses [L2SR] to provide Node1 with an L2RH for Node2, with its
    Q and MTU.  Here it is {3,0,MW} from RouterB to Node2.

     0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
| <---- Here is the L2-header needed to get from RouterB to Node1 ----> |
| It may be any number of bytes.  Here it is 5 bytes: {3,3,3,MW}        |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      Node1      |Priority| Ver=5  |     "R R P"     |     "L2SR"      |MH
+--------+--------+--------+--------+--------+--------+--------+--------+
|E=0|PL=0|  Length=4 (in 8B-words)  |      RTRB1      |     Free=0      |
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "ADDR"      |  PL=2  |  RL=4  |  xxx   |  xxx   |      Node2      |Destin
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "SRQR"      |  PL=2  |  RL=3  |  xxx   |  xxx   |        Q        |SR+Q
+--------+--------+--------+--------+--------+--------+--------+--------+
| SR-Len=4 (bytes)|   3    |   0    |   3    |   0    |  xxx   |  xxx   |L2RH
+--------+--------+--------+--------+--------+--------+--------+--------+
|     "MTUR"      |  PL=1  |  RL=1  |  xxx   | MTU=1,024 (in 8B-words)  |MTU
+--------+--------+--------+--------+--------+--------+--------+--------+

    The MTU in the MTUR above is the lesser of the MTUs of both networks.

    The RL (record-length) of the last MTUR-record is included both in
    the RL of the preceding SRQR-record and in the RL of the preceding
    ADDR-record (since the RL of the SRQR is included in the RL of the
    ADDR).

(9) Finally, Node1 starts sending data to Node2 using L2-forwarding.

     0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
| <---- Here is the L2-header needed to get from Node1 to RouterB ----> |
| It may be any number of bytes.  Here it is 5 bytes: {1,1,2,MW}        |
+--------+--------+--------+--------+--------+--------+--------+--------+
| SR-Len=4 (bytes)|   3    |   0    |   3    |   0    |  xxx   |  xxx   |L2RH
+--------+--------+--------+--------+--------+--------+--------+--------+
|      Node2      |Priority| Ver=5  |    "sensor"     |    SubType=?    |MH
+--------+--------+--------+--------+--------+--------+--------+--------+
|E=3|PL=0|  Length=? (in 8B-words)  |      Node1      |     Free=?      |
+--------+--------+--------+--------+--------+--------+--------+--------+
|                                                                       |Data
| <------------------- The sensor data goes here ---------------------> |....
|                                                                       |Data
+--------+--------+--------+--------+--------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |MT
+--------+--------+--------+--------+--------+--------+--------+--------+

    E=3 (0011) indicates that all the data is Little Endian 64-bit.

Again, if Node1 had only a Level-A implementation, it would have the
combined L2RH from itself to RouterB and from there to Node2 pre-wired,
saving this entire message exchange.


Appendix-D: Glossary
--------------------

Address:           A unique designation of a node (actually an interface
                   to that node) or a SAN.

Buddy-HR:          HRs are "buddies" if they are connected to the same SAN.

Destination:       The node for which a packet is intended.

Dynamic-Routing:   Routing according to dynamic information
                   (i.e., acquired at run time, rather than pre-set).

Endianness:        The property of being Big-Endian or Little-Endian.

Ethertype:         A 16-bit value designating the type of Level-3 packets
                   carried by a Level-2 communication system.

HR:                Half-Router, the part of a router that handles one
                   network only.

L2-forwarding:     Forwarding based on Level-2 information, e.g., the
                   native techniques of each SAN or LAN.

L3-forwarding:     Forwarding based on end-to-end Level-3 addresses.

Map:               The topology of a network.

Multi-homed box:   A node with more than one network interface, where
                   each interface has a different address.

Node:              Whatever can send and receive packets (e.g., a
                   computer, an MPP, a software process, etc.).

Node structure:    A C-struct containing values for some attributes of
                   a node.

Planned Transfer:  A transfer of information that occurs after an initial
                   phase in which the sender decides which Level-2 route
                   to use for that transfer.

RCVF:              The "Received From" set includes all the addresses
                   through which an RT was disseminated.

Redirect-message:  A message that tells nodes which HR should be used in
                   order to get to a certain remote host.

Router:            The inter-SAN communication device.

SAN:               System Area Network.

Source:            The node that created a packet.

Source-Route:      A Level-2 route that is chosen for a packet by its
                   source.

Twin-HR:           Two HRs are twins if they are both parts of the same
                   inter-SAN router.

Wormhole-routing:  (aka cut-through routing) Forwarding packets out of
                   switches as soon as possible, without storing the
                   entire packet in the switch (as is done in
                   "store-and-forward").

Zero-copy TCP:     A TCP system that copies data directly between the user
                   area and the network device, bypassing op-sys copies.


Appendix-E: Acronyms and Abbreviations
--------------------------------------

0xNNNN  The hexadecimal number NNNN (e.g., 0x0100 is 256-decimal)
8B      An 8-byte (64-bit) entity
ADDR    The Address-record of RRP
API     Application/Program Interface
ATM     Asynchronous Transfer Mode
BC      Byte Count (of parameters)
CAPA    The capability-record of RRP
CC      Capability Code
CSR     Common Source-Route
DA      Destination Address
DB      Data Block
DL      Data Length
DSP     Digital Signal Processor
E       The Endianness field (in the EEP header)
EEP     End/End Protocol
EI      Error Indication
F       The Free field (in the EEP header)
GP      General Purpose
GVL2    An RRP message requesting an L2 route to a given destination
GVRT    An RRP message asking an HR to give its routing tables
HR      Half Router
HRTO    An RRP message asking which HR to use for a given destination
ID      Identification
IGMP    Internet Group Management Protocol
INFO    An RRP message providing information about nodes
IP      The Internet Protocol
ISORM   The ISO Reference Model
L       Length
L2      Level-2 of the ISORM (Link)
L2RH    Level-2 Routing Header
L2SR    Source Route
L3      Level-3 of the ISORM (Network)
LADR    The Logical-addresses-record of RRP
LAN     Local Area Network
LRT     Local Routing Table
LSbit   Least Significant bit
LSbyte  Least Significant byte
MH      MessageWay Header
MPI     Message Passing Interface
MPP     Massively Parallel Processing system
MSbit   Most Significant bit
MSbyte  Most Significant byte
MT      MessageWay EEP Trailer ("tail")
MTU     Maximum Transmission Unit
MTUR    The MTU-record of RRP
NAME    The name-record of RRP
NFS     Network File Server
P       The Priority field
PAD     Padding After Data
PBD     Padding Before Data
PCI     The Peripheral Component Interconnect "standard"
PL      Padding Length
PPP     The Point-to-Point Protocol
PROM    Programmable ROM (Read-Only-Memory)
PT      Packet Type
Q       Quality (of a path)
RCVF    Received-From list, or the Received-From record of RRP
RDRC    A Redirect message of RRP
RH      Routing Header
RID     Record ID
RL      Record Length (in 8-byte words)
RRP     Router/Router Protocol
RT-hd   RT (Routing Table) header
RT      Routing Table
RTBL    An RRP message providing a Routing Table
RTHD    The Routing-Table-Header record of RRP
RTyp    RRP's Record Type
SA      Source Address
SAN     System Area Network
SAN-ID  The 16-bit MsgWay-address of a SAN
SAR     Segmentation and Reassembly
SN      Serial Number
SNMP    Simple Network Management Protocol
SR      Source Route (always at Level-2)
SRQR    The Source-Route-and-Q-record of RRP
TE      Type Extension
TELL    An RRP message requesting information about partially specified
        nodes
V       Version
WRU?    An RRP message asking its recipient to identify itself
XRT     External Routing Table
xxx     A padding byte


draft-msgway-protocol-spec-00.txt        expires June 1996        [end]