Network Working Group                                           J. Maloy
Internet-Draft                                                  Ericsson
Expires: December 3, 2005                                  J. Hadi Salim
                                                                    Znyx
                                                             H. Khosravi
                                                                   Intel
                                                               F. Ansari
                                                                  Lucent
                                                               C. Shuchi
                                                                   Intel
                                                               June 2005

                 TIPC based TML for the ForCES protocol
                      draft-maloy-tipc-tml-00.txt

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on December 3, 2005.

Copyright Notice

Copyright (C) The Internet Society (2005).

Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

Maloy, et al.           Expires December 3, 2005                [Page 1]

Internet-Draft                    TIPC                         June 2005

Abstract

This document describes a ForCES [ForCES] Transport Mapping Layer (TML) based on the Transparent Inter Process Communication service [TIPC]. It is intended to be used when the ForCES protocol is transported over L2 carriers such as Ethernet, RapidIO or PCI-Express.
TIPC has been specially designed for efficient and easy-to-use communication over L2 carriers, and is typically used to define clusters of loosely coupled nodes in such environments.

Table of Contents

1. Requirements notation
2. Introduction
   2.1 TIPC Summary
   2.2 Rationale for a TIPC based TML
   2.3 Architectural Overview
   2.4 The PL Layer
   2.5 The TML Layer
   2.6 Terminology
3. TIPC TML overview
   3.1 Separate Control and Data channels
       3.1.1 Data Channel
       3.1.2 Control Channel
       3.1.3 Reliability
       3.1.4 Congestion Control
       3.1.5 Security
       3.1.6 Addressing
       3.1.7 Timeliness
       3.1.8 Prioritization
       3.1.9 HA Decisions
       3.1.10 Encapsulations Used
       3.1.11 TML Messaging
       3.1.12 Protocol Initialization and Shutdown Model
       3.1.13 Protocol Initialization
       3.1.14 Protocol Shutdown
       3.1.15 Multicast Model
       3.1.16 Broadcast Model
       3.1.17 Security Considerations
   3.2 IANA Considerations
   3.3 Manageability
4. References
Authors' Addresses
Intellectual Property and Copyright Statements

1. Requirements notation

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. Introduction

The ForCES (Forwarding and Control Element Separation) working group in the IETF is defining the architecture and protocol for separation of control and forwarding elements in network elements such as routers. [RFC3654] and [RFC3746] define architectural and protocol requirements for the communication between CE and FE. The ForCES protocol layer [ForCES] describes the protocol specification. It is envisioned that the ForCES protocol will be independent of the interconnect technology between the CE and FE and can run over multiple transport technologies and protocols. Thus a Transport Mapping Layer (TML) has been defined in the protocol framework that takes care of mapping the protocol messages to specific transports. This document defines a TIPC based TML for the ForCES protocol layer. It also addresses all the requirements for the TML, including security, reliability, etc.

2.1 TIPC Summary

For reference, this section gives a brief introduction to the services provided by TIPC, as well as some basic concepts needed to understand the rest of this document. For more in-depth information, see [TIPC].

TIPC is a transport protocol with selectable reliability, typically operating on top of L2 packet networks such as Ethernet.
If IP-routability and RFC2309-compliant congestion control are required, the protocol can also be carried over higher-level protocols such as DCCP, TCP, or SCTP.

TIPC offers the following services to its users:

o A functional addressing scheme providing full addressing transparency over the whole cluster.
o A topology information and subscription service, providing up-to-date information about functional and physical topology.
o Lightweight, highly reactive connections reporting errors or destination unreachability within a fraction of a second.
o A reliable multicast service, based on functional addressing, but using the underlying network multicast service when possible.
o Acknowledged, loss-free, error-free, non-duplicated transfer of user data, both in connectionless and connection-oriented mode.
o Configurable congestion control at bearer, link, and connection level.
o Data fragmentation conforming to discovered carrier MTU size.
o Bundling of multiple user messages into a single TIPC packet in situations where messages cannot be sent immediately, e.g. during network congestion.
o Transparent, link-level load sharing and redundancy, through support of heterogeneous multi-homing.

2.2 Rationale for a TIPC based TML

[RFC3654] states a set of basic requirements (loss-free, ordered, non-corrupted delivery of messages, congestion control, scalability, etc.), all of which are met by TIPC. In addition, since TIPC constitutes just a thin protocol layer on top of an L2 carrier, it is very efficient when used in closed LANs, which we can assume will be a very common environment for the type of routers we discuss here.
TIPC's location-transparent addressing scheme also makes it particularly well suited for carrying the ForCES PL protocol; the latter's addressing scheme can be mapped directly onto TIPC functional addresses, making any form of address configuration or translation in the TML layer superfluous. Furthermore, the topology subscription service provided by TIPC makes it easy for both the PL layer and other functions to keep track of changes in the physical and functional topology of the router.

2.3 Architectural Overview

The reader is referred to the Framework document [RFC3746], and in particular sections 3 and 4, for an architectural overview and for where and how the ForCES protocol fits in. There may be some content overlap between the ForCES protocol draft [ForCES] and this section in order to provide clarity.

The ForCES protocol consists of two distinct parts: the PL and the TML layer. This is depicted in the figure below.

 -------------                                      -------------
|   CE/PL     |                                    |   FE/PL     |
|   Layer     |                                    |   Layer     |
|-------------|                                    |-------------|
|   CE/TML    |                                    |   FE/TML    |
|   Layer     |                                    |   Layer     |
|-------------|                                    |-------------|
|  Transport  |         ForCES PL messages         |  Transport  |
|  Service    |<---------------------------------->|  Service    |
 -------------     encapsulated in TML packets      -------------

Figure 1: Architectural view of ForCES protocol

The PL layer is in fact the ForCES protocol. Its semantics and message layout are defined in [ForCES]. The TML layer is necessary to connect two ForCES PL layers as shown in Figure 1 above.

Both the PL and TML layers are standardized by the IETF. While only one PL layer is defined, different TMLs are expected to be standardized. To interoperate, the TML layers at the CE and FE are expected to conform to the same definition.

On transmit, the PL layer delivers its messages to the TML layer. The TML layer delivers the message to the destination TML layer(s).
On reception, the TML delivers the message to its destination PL layer(s).

2.4 The PL Layer

The PL is common to all implementations of ForCES and is standardized by the IETF [ForCES]. The PL layer is responsible for associating an FE or CE to an NE. It is also responsible for tearing down such associations. An FE uses the PL layer to send various subscribed-to events to the CE PL layer, as well as to respond to various status requests issued from the CE PL. The CE configures both the FE and the associated LFB attributes using the PL layer. In addition, the CE may send various requests to the FE to activate or deactivate it, reconfigure its HA parameterization, subscribe to specific events, etc.

2.5 The TML Layer

The TML layer is essentially responsible for transport of the PL layer messages. The TML is where the issues of how to achieve transport level reliability, congestion control, multicast, ordering, etc. are handled. It is expected that more than one TML will be standardized. The different TMLs could each implement things differently, based on the capabilities of the underlying media and transport. However, since each TML is standardized, interoperability is guaranteed as long as both endpoints support the same TML. All ForCES Protocol Layer implementations should be portable across all TMLs, because all TMLs have the same top edge semantics.

2.6 Terminology

o ForCES Protocol: While there may be multiple protocols used within the overall ForCES architecture, the term "ForCES protocol" refers only to the protocol used at the Fp reference point in the ForCES Framework in RFC3746 [RFC3746]. This protocol does not apply to CE-to-CE communication, FE-to-FE communication, or to communication between FE and CE managers. Basically, the ForCES protocol works in a master-slave mode in which FEs are slaves and CEs are masters.
o ForCES Protocol Layer (ForCES PL): A layer in the ForCES protocol architecture that defines the ForCES protocol messages, the protocol state transfer scheme, as well as the ForCES protocol architecture itself (including requirements of the ForCES TML (see below)). Specifications of the ForCES PL are defined in [ForCES].
o ForCES Protocol Transport Mapping Layer (ForCES TML): A layer in the ForCES protocol architecture that specifically addresses the protocol message transportation issues, such as how the protocol messages are mapped to different transport media (like TCP, IP, ATM, Ethernet, etc.), and how to achieve and implement reliability, multicast, ordering, etc. This document defines a TIPC based ForCES TML.
o Port: The endpoint of all TIPC user communication. On Unix it typically takes the shape of a socket.
o Zone: A "super-cluster" of clusters interconnected via TIPC.
o Cluster: A part of a zone where all nodes are directly interconnected (fully meshed) via TIPC.
o Node: A physical computer within a cluster, identified by a TIPC address.
o System Node: A node having direct links to all other system nodes in the cluster, and a TIPC address defined within a certain range. When using the term 'node' in the remainder of this document we normally mean 'system node', unless the context makes a different interpretation obvious.
o Secondary Node: A node identified by a TIPC address within a certain range, and potentially having limited physical connectivity to the rest of the cluster. Secondary nodes can communicate with all system nodes in the cluster, and vice versa, but the messages may have to pass via a system node acting as router. Secondary nodes cannot communicate with each other.
o Link: A signalling link connecting two nodes, performing tasks such as message transfer, sequence ordering, retransmission, etc.
A node pair may be interconnected by one or two parallel links, in load-sharing or active/standby configuration.
o Bearer: A generic term for an instance of a physical or logical transport medium, such as Ethernet, ATM/AAL or DCCP.
o Network Address: A TIPC internal node identifier. It is in reality a 32-bit integer, subdivided into three fields (8/12/12 bits), representing zone, cluster and node number respectively. Normally depicted as <Z.C.N>.
o Network Identity: A TIPC internal identifier, used to keep different TIPC networks separated from each other, e.g. on a LAN in a lab environment.
o Location Transparency: Sometimes called addressing transparency; the ability to let processes communicate within a cluster without either of them knowing the physical location of its peer.
o Port Name: (or just Name) A persistent functional address identifying a port within a zone. A port may move between nodes while retaining its name. For load sharing and redundancy purposes several ports may bind to the same name.
o Port Identity: A volatile address identifying a unique physical port within a zone. Once a physical port is deleted its identity will not be reused for a very long time.
o Message: The unit of data delivered from one user to another, i.e. between ports.
o Connection: A logical channel for passing messages between two ports. Once a connection is established no address need be indicated when sending a message from either of the endpoints. A connection also implies automatic supervision of the endpoints' existence and state.
o Message Bundling: The act of bundling several messages into one bearer-level packet, typically an Ethernet frame. TIPC bundles messages e.g. during media congestion.
o Message Fragmentation: Dividing a long message into several bearer-level packets, and reassembling the fragments at the receiving end.
o Link Failover: Moving all traffic from a failing link/media to the remaining link, while retaining original sequence order and cardinality.
o Naming Table: A TIPC internal table which keeps track of the mapping between port names and corresponding port identities. It performs an on-the-fly translation from the one to the other during the message transfer phase.
o Packet: The unit of data sent over a bearer. It may contain one or more complete TIPC messages, as well as fragments of a message.

3. TIPC TML overview

The TIPC TML consists of two TIPC connections between the CE and FE over which the protocol messages are exchanged. One of the connections is called the control channel, over which control messages are exchanged; the other is called the data channel, over which external protocol packets, such as routing packets, are exchanged. The TIPC connections use unique server port names for each of the channels. In addition, this TML uses the kernel-level mechanism provided by TIPC to prioritize messages over the different channels. Some of the rationale for this approach, and an explanation of how it meets the TML requirements, is given below.

3.1 Separate Control and Data channels

The ForCES NEs are subject to Denial of Service (DoS) attacks (Requirements [RFC3654], Section 7, item 15). A malicious system in the network can flood a ForCES NE with bogus control packets, such as spurious RIP or OSPF packets, in an attempt to disrupt the operation of, and the communication between, the CEs and FEs. In order to protect against this situation, the TML uses separate control and data channels for communication between the CEs and FEs.

                          CE
                 +-------------------+
                 |      CE: PL       |
                 +-------------------+
                 |      CE: TML      |
                 +-------------------+
                 |     CE: TIPC      |
                 +-------------------+
                  |  |   |   |   |  |
                  |  .   |   .   |  .
                  |  |   |   |   |  |
                  |  .   |   .   |  .
                  |  |   |   |   |  |
                  |  .   |   .   |  .
        +-Cc1-----+  |   |   |   |  +-.-.-.-.-.Cdn.-+
        |   +-Cd1-.-.+   |   .   +--------Ccn---+   |
        |   |           Cc2 Cd2                 |   .
        |   .            |   |                  |   |
    +-----------+    +-----------+          +-----------+
    | FE: TIPC  |    | FE: TIPC  |  . . .   | FE: TIPC  |
    +-----------+    +-----------+          +-----------+
    | FE: TML   |    | FE: TML   |          | FE: TML   |
    +-----------+    +-----------+          +-----------+
    | FE: PL    |    | FE: PL    |          | FE: PL    |
    +-----------+    +-----------+          +-----------+
         FE1              FE2                    FEn
                \-------------V------------/

Legend:
   ---- Cc : Reliable Unicast Control Channel between CE and FE
   -.-. Cd : Best Effort Unicast Data Channel between CE and FE

Figure 2: CE-FE Communication Channels

3.1.1 Data Channel

The data channel carries the control protocol packets, such as RIP and OSPF messages, as outlined in Requirements [RFC3654] section 7.10. These are carried in ForCES Packet Redirect messages [RFC3746] between the CEs and FEs. The reliability requirements for the data channel messages are different from those of the control messages [RFC3654], i.e. they don't require strict reliability in terms of retransmission, etc. However, congestion control is important for the data channel: in case of DoS attacks, if an unreliable transport such as UDP is used for the data traffic, it can more easily overflow the physical connection, overwhelming the control traffic with congestion. Thus we need a transport protocol that provides congestion control but does not necessarily provide full reliability. Therefore, the data channel is established as a connection with "best effort" properties in both directions. The channel is set up by using the port name [CETYPE_DATA,CE-id] from the FE.

3.1.2 Control Channel

All the other ForCES messages, which are used for configuration/capability exchanges, event notification, etc., are carried over the control channel.
The data channel is set up only after the control channel has been set up. The control channel is mapped to a TIPC connection which is "reliable" in both directions. The control channel is set up by using the port name [CETYPE_CONTROL,CE-id] from the FE.

3.1.2.1 Multicast Channel

Multicast groups are joined at the FE side by binding to the port name [McId,FE-id]. Messages are sent from the CE to a multicast group by using the port name sequence [McId,0,0xffffffff], which automatically covers all members of the group.

3.1.3 Reliability

TIPC provides the reliability (no losses, no data corruption, no reordering of data) required for ForCES protocol control messages. Furthermore, TIPC guarantees this property even when control traffic is transparently load shared over more than one physical medium, such as two parallel Ethernets. This guarantee is valid even in transition phases when one of the networks fails or is started. Optionally, an individual socket can be set to be "best effort", meaning that all messages sent from that socket may be dropped if there is network congestion or target node overload.

3.1.4 Congestion Control

Inside a LAN, TIPC alone provides congestion control adequate to satisfy this requirement [RFC3654]. There are three levels of congestion control, as described in sections 3.5.5, 3.7.6, 3.7.7 and 3.9.6 of [TIPC]. The ForCES PL may receive an indication of destination socket or node congestion when setting up a channel. Once a channel is established, socket level congestion is handled transparently by the TIPC connection flow control scheme, while destination node overload will result in an aborted channel if the connection is set to "reliable". Since the direction FE->CE on the data channel is set to "best effort", congestion or CE overload will NOT result in an aborted channel.
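As a rough illustration of the overload behaviour described above, the following Python sketch encodes the outcome for an established channel. The function name and the returned labels are hypothetical and exist only for this example; they are not part of TIPC or of any TML API.

```python
def on_destination_overload(reliable: bool) -> str:
    """Outcome for an established channel when the destination node is overloaded."""
    if reliable:
        # A "reliable" connection is aborted on destination node overload.
        return "abort_channel"
    # A "best effort" connection may drop messages, but the channel stays up.
    return "drop_messages"


# The control channel is "reliable"; the data channel is "best effort".
for channel, reliable in (("control", True), ("data", False)):
    print(channel, on_destination_overload(reliable))
```
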
TIPC will also inform the TML layer about the reason for such channel abortion, to help the TML decide what recovery measures to take.

It is possible to use TIPC and TIPC/TML even over IP-based networks, but in such cases congestion control must be guaranteed by the carrying transport protocol, e.g. TCP or DCCP. In such cases TIPC will short-circuit the relevant parts of its own transport protocol layer to avoid duplicate functionality.

3.1.5 Security

TIPC can only guarantee message and endpoint authenticity for closed networks, e.g. a trusted LAN or bus. Since no router can yet forward TIPC/Ethernet packets, spoofed packets cannot be injected into such a network from outside. When needed, additional security can be achieved by carrying TIPC over an IP protocol with the requested properties. Either TLS or IPsec will fulfil the requirements stated in [RFC3654].

3.1.6 Addressing

There are at least two possible distribution models for TML CE-FE channels. One such model assumes that there is only one set of data/control channels between each CE-FE pair. A multiplexing/demultiplexing step is then assumed at the PL layer in the ForCES stack. The following figure illustrates this model.

 ------------------------------------------------------------------
| CE                                                               |
|                                                                  |
|  --------------         -----------------        --------------  |
| |   XXX LFB    |       | CE Protocol LFB |      |   YYY LFB    | |
| |              |       |                 |      |              | |
| |              |<------+------->X<-------+----->|              | |
| |              |       |        |        |      |              | |
| |              |       |        |        |      |              | |
|  --------------         --------+--------        --------------  |
|                                 |                                |
|                                 |                                |
 ---------------------------------|--------------------------------
                                  |
                                  |  Control/Data/Multicast
                                  |  Channels
                                  |
 ---------------------------------|--------------------------------
| FE                              |                                |
|                                 |                                |
|  --------------         --------+--------        --------------  |
| |              |       |        |        |      |              | |
| |              |       |        |        |      |              | |
| |              |<------+------->X<-------+----->|              | |
| |              |       |                 |      |              | |
| |   MMM LFB    |       | FE Protocol LFB |      |   FFF LFB    | |
|  --------------         -----------------        --------------  |
|                                                                  |
|                                                                  |
 ------------------------------------------------------------------

Figure 3: Channel model with explicit multi/demultiplexing

Another possible model is one where any CE LFB communicates directly with any LFB on the FE side, and vice versa, without sending the messages via any PL layer multiplexer step. In reality, it is the transport protocol itself that performs the necessary multiplexing, invisibly to the upper layers. The following figure illustrates this. In this model, the Protocol LFBs only serve the role of configuring the transport protocol on behalf of the CEM or FEM, giving all existing channel pairs equal properties, and of supervising the availability of the peer FE-CE. There is one PL-protocol termination (in practice, a library) per LFB, terminating all messages from any other LFB it may communicate with. The main advantage of this model is performance and simplicity, but it requires a PL layer providing and assuming a connectionless communication model.

 ------------------------------------------------------------------
| CE                                                               |
|                                                                  |
|  --------------         -----------------        --------------  |
| |   XXX LFB    |       | CE Protocol LFB |      |   YYY LFB    | |
| |              |       |                 |      |              | |
| |              |       |                 |      |              | |
| |              |       |                 |      |              | |
| |              |       |                 |      |              | |
|  ------+-------         --------+--------        -------+------  |
|        |                        |                       |        |
|        |                        |                       |        |
 --------|------------------------|-----------------------|--------
         |                        |                       |
     <---+--->     Reliable       |  Supervised       <---+--->
                connectionless    |  Control/Data
                  unicast or      |  Channel pair     <---+--->
                  multicast       |                   <---+--->
         |                        |                       |
 --------|------------------------|-----------------------|--------
| FE     |                        |                       |        |
|        |                        |                       |        |
|  ------+-------         --------+--------        -------+------  |
| |              |       |                 |      |              | |
| |              |       |                 |      |              | |
| |              |       |                 |      |              | |
| |              |       |                 |      |              | |
| |   MMM LFB    |       | FE Protocol LFB |      |   FFF LFB    | |
|  --------------         -----------------        --------------  |
|                                                                  |
|                                                                  |
 ------------------------------------------------------------------

Figure 4: Channel model with direct LFB-LFB communication

[ForCES] describes the first model, not taking into account that the second one is possible if there is a reliable connectionless protocol at hand. For now, we therefore assume the first model, without excluding the second should the PL/TML model later be modified to allow it.

Since TIPC has a functional addressing scheme, FE ids as well as LFB ids can be mapped directly down to TIPC port names and port name sequences. For TIPC/TML, a destination address is just a pair of opaque 4-byte integers. Irrespective of the PL layer's interpretation of that number pair, TIPC commits to reliably deliver messages from any sender socket using that number pair to any destination socket bound to that same number pair. When connection-oriented messaging is wanted, the same address structure serves as connect address, making TIPC basically behave like TCP, with some additional properties to be activated on demand.

For unicast addressing/delivery, this TML uses the TIPC connection set up between the CE and FE for control messaging. For multicast/broadcast addressing/delivery of control messages, this TML uses TIPC multicast from the CE to the FEs.

The following example illustrates the address mapping:

------------------------------------------------
| CE 8                                         |
|     -----------------------------------      |
|    | CE Protocol LFB                   |     |
|    |                                   |     |
|    |          |                        |     |
|    |          |                        |     |
|    |   (PL)   | tmlInit(CeId = 8)      |     |
|    |          |                        |     |
|    |    ------+----- TML API -----     |     |
|    |          |                        |     |
|    |   (TML)  | bind(type = CETYPE,    |     |
|    |          |      inst = 8)         |     |
|    |          V                        |     |
|     ----------------- TIPC API --------      |
|                    (TIPC)                    |
------------------------------------------------

------------------------------------------------
| FE 5                                         |
|                    (TIPC)                    |
|     ----------------- TIPC API --------      |
|    |          A                        |     |
|    |          |                        |     |
|    |   (TML)  | connect(type = CETYPE, |     |
|    |          |         inst = 8)      |     |
|    |          |                        |     |
|    |    ------+----- TML API -----     |     |
|    |          |                        |     |
|    |   (PL)   | tmlOpen(CeId = 8)      |     |
|    |          |                        |     |
|    | FE Protocol LFB                   |     |
|     -----------------------------------      |
|                                              |
------------------------------------------------

Figure 5: ForCES/PL to TIPC/TML address mapping

A CE address is represented by a CeId (a 32-bit number) in the PL space. This address can be represented as a port name in the TIPC addressing space, with the type value set to CETYPE (a well-known, reserved 32-bit number) and the instance value set to the CeId. When the PL layer initializes the TML layer on the CE side, it passes the CeId to it.
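The mapping just described can be sketched in a few lines of Python. This is only an illustration: the numeric value given to CETYPE and the helper name ce_port_name are assumptions made for the example, not values or functions defined by this document or by TIPC.

```python
from typing import Tuple

# Placeholder only: this document does not assign a numeric value to CETYPE.
CETYPE = 0x4345  # hypothetical well-known 32-bit port name type

U32_MAX = 0xFFFFFFFF


def ce_port_name(ce_id: int, name_type: int = CETYPE) -> Tuple[int, int]:
    """Map a PL-level CeId to a TIPC port name (type, instance)."""
    for value in (name_type, ce_id):
        if not 0 <= value <= U32_MAX:
            raise ValueError("port name fields must fit in 32 bits")
    return (name_type, ce_id)


# CE side: the name the TML binds to after tmlInit(CeId = 8).
bound_name = ce_port_name(8)
# FE side: the name the TML connects to after tmlOpen(CeId = 8).
target_name = ce_port_name(8)
print(bound_name == target_name)  # both sides derive the same port name
```

Because both sides derive the port name from the CeId alone, no separate address table has to be configured in the TML.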
The TML layer then creates a socket and binds it to a port name containing the CeId. On the FE side, the tmlOpen() call provides the TML with the targeted CE's identity. The TML layer uses this to construct a port name in the same way as above, creates a communication socket, and connects it to the CE's socket by using the new port name.

As this scenario shows, the CEM and FEM don't need to configure any addresses at the TML level; the addresses provided by the PL layer can be mapped directly down to corresponding TIPC addresses.

3.1.7 Timeliness

Over L2 networks, TIPC adds no intentional delay to message delivery. With Ethernet this in practice means a process-to-process delivery time in the order of 100 microseconds for a typical one-packet message. TIPC does not support obsoleting of messages.

3.1.8 Prioritization

TIPC provides four message importance priorities, rather than the eight required in [ForCES]. The rationale for requiring as many as eight levels is weak; extensive experience from use of TIPC indicates that four levels are perfectly adequate. If it is decided that the ForCES PL must have eight levels, those will have to be mapped down 2-to-1 to the TIPC priorities by the TML layer. We suggest that the data connection be set to TIPC_LOW in both directions, while the control channel and multicast sockets get the priority TIPC_MEDIUM.

3.1.9 HA Decisions

L2 link failure detection and failover are handled transparently by TIPC, by moving traffic over to the redundant link when one is available. This does not affect the PL layer, since it has no knowledge of the lower layer links. In case of complete communication failure between CE and FE, the PL layer must be informed. Undelivered messages will not be returned to the sending PL, but the failure reason will, as stated in [ForCES].

There is no support for heartbeat messages between peer TML layers.
The availability of a peer node is supervised by TIPC, using its own heartbeat scheme, and indications of communication failure are received by the TML via the topology subscription service. Failure detection time can be configured per node (FE/CE), so a requested heartbeat interval from the CEM/FEM or PL layer can be translated into a corresponding neighbour failure detection time per CE or FE.

The TML is responsible for keeping the control and data communication channels up. It does not, however, have the authority to decide which CE to set up the channels with. If an FE-CE communication channel goes down or connectivity is lost, the following steps are taken by the TML layer:

o If the error code from TIPC differs from TIPC_NO_NODE, the FE TML attempts to reestablish the communication channel.
o If the FE TML is unable to reestablish the channel (after some configured number of retries/timeout), it notifies the FE PL that the channel is down.
o The CE TML waits for the channel to be reestablished (since only the FE can reestablish it) for some configured timeout prior to notifying the CE PL that the channel is down.

3.1.10 Encapsulations Used

There is no further message encapsulation of control and data messages done at the TML layer. The PL-generated control messages are transported as is by the TML layer. All ForCES protocol control and data messages are encapsulated with a TIPC header.

3.1.11 TML Messaging

TBD.

3.1.12 Protocol Initialization and Shutdown Model

In order for the peer PL Layers to communicate, the control and data channels must be set up. This section defines a model for the setup of the channels, using the TML interface defined in [TMLAPI].
In this model, the peer TML Layers may establish the control and data channels between the FE and the CE without the involvement of the PL Layers, or, if desired, the PL Layer may trigger the setup of the channels; this is left as an implementation decision. Both modes may also be supported within an implementation.

3.1.13 Protocol Initialization

The control channel must be established between the FE TML and the CE TML for establishment of the association to proceed. This channel will be used for messages related to the association setup and capability query. The data channel must be established no later than the response from the FE to the CE Topology query message. The following are the significant aspects associated with channel setup:

o A single call by the PL layer sets up the communication channels for both control and data, or distinct channels for control and data.
o The TML sets up the appropriate channels and allocates the required descriptors for the channels. The TML layer maintains a mapping between the Unicast FE/CE Id and the corresponding connection. There is no need for channel descriptors to be returned to the PL layer at either the FE or the CE. The PL Layer only uses the Unicast FE/CE Id for read/write calls and specifies the type of message (control versus data) to be read/written.
o When the channels have been set up, the TML layer returns a status that specifies which channels were set up successfully and which were not.

Figure 6 illustrates the initialization model where the PL layer, via the TML API, triggers the setup of the control and data channels.

FE1 PL FE1 TML CE TML CE PL | | | | \ / | | | TBD:tmlInit() | | FE | | | |<--------------| > CE Init/ Init/ < | | | | | Bootup Bootup | | | | | / \ | | | | | tmlOpen(CeId) | | | |-------------->| | | \ | |CtrlChan(Cc) Setup | | | Setup control | |~~~~~~~~~~~~~~~~~~~~~~>| | | channel if not | | FeId -> [CcDes] | | setup.
TML | | | | > has mapping | |CtrlChan(Cc) Setup Rsp | | | from PL Layer | |<~~~~~~~~~~~~~~~~~~~~~~| | | Id to channel | CeId -> [CcDes] | | | descriptor and | | | | | channel if not | | FeId -> [CcDes, | | setup. TML | | CdDes] | | updates | | | | > mapping from | |DataChan(Cd) Setup Rsp | | | PL Layer | |<~~~~~~~~~~~~~~~~~~~~~~| | | Id to channel | CeId -> [CdDes] | | | descriptor and | | | | / channel type. | | | | | <-- status | | | | | | | |tmlEvent(ChUp) | |tmlEvent(ChUp) | |<--.--.--.--.--| |--.--.--.--.-->| | | | | | | Asso Setup Req | | |---------------|-----------------------|-------------->| | | Asso Setup Rsp | | |<--------------|-----------------------|---------------| | | | | | | Capability Query | | |<--------------|-----------------------|---------------| Maloy, et al. Expires December 3, 2005 [Page 20] Internet-Draft TIPC June 2005 | | Capability Query Rsp | | |---------------|-----------------------|-------------->| | | | | | | Topology Query | | |<--------------|-----------------------|---------------| |<--------------|-----------------------|---------------| | | Topology Query Rsp | | |---------------|-----------------------|-------------->| | | | | | |STEADY STATE OPERATION | Legend: PL --------> PL : Protocol layer messaging PL --------> TML: TML API TML --.--.--> PL : Events/Notifications/Upcalls TML ~~~~~~~~> TML: Internal protocol communication Figure 6: PL-controlled Protocol Initialization 3.1.14 Protocol Shutdown FE PL FE TML CE TML CE PL | | | | | |STEADY STATE OPERATION | | |<--------------|-----------------------|-------------->| | | Config Request | | |<--------------|-----------------------|---------------| | | Config Response | | |---------------|-----------------------|-------------->| | | | | | | Association Teardown | | |<--------------|-----------------------|---------------| | | | | | | | | \ |tmlClose(CeId) | | | | FE initiated: |-------------->| | | > FE specifies CE | <-- status | | | | Id associated | | | | / with channel. 
                   Figure 7: FE Initiated Shutdown

 FE PL          FE TML                  CE TML           CE PL
   |               |                       |               |
   |               |STEADY STATE OPERATION |               |
   |<--------------|-----------------------|-------------->|
   |               |    Config Request     |               |
   |<--------------|-----------------------|---------------|
   |               |    Config Response    |               |
   |---------------|-----------------------|-------------->|
   |               |                       |               |
   |               | Association Teardown  |               |
   |<--------------|-----------------------|---------------|
   |               |                       |               |
   |               |                       |tmlClose(FeId) |
   |               |                       |<--------------|  (a)
   | <-- status    |                       |    status --> |
   |               |                       |               |

   Notes:

   (a)  CE initiated: the CE specifies the FE Id associated with the
        channel.

   Legend:

      PL  -------->  PL :  Protocol layer messaging
      PL  -------->  TML:  TML API
      TML --.--.-->  PL :  Events/Notifications/Upcalls
      TML ~~~~~~~~>  TML:  Internal protocol communication

                   Figure 8: CE Initiated Shutdown

3.1.15  Multicast Model

   TIPC provides functional multicast, and broadcast as a special case
   of that, to the PL layer.  This function takes advantage of any
   broadcast transport facility in the L2 bearer, such as Ethernet, and
   uses replicated unicast if such a facility is missing.  Accordingly,
   the TIPC/TML layer provides support for multicast.

   In the ForCES model, support is required to multicast to the FEs
   from a CE; in this case, the CE is the source or root of the
   multicast and the FEs are the leaves.  Once the unicast control
   channel is open, a CE may request FEs to join and leave specified
   multicast groups.  Multicast support is CE-initiated: FEs can join a
   multicast group only if the CE requests them to join the group.

   TIPC/TML needs no mapping between PL layer IDs and channel
   descriptors for multicast; it can directly use the multicast group
   id provided by the PL layer.

   The following are the significant steps for adding members to or
   removing members from a multicast group:

   o  The CE PL communicates with the FE PL to request that the FE join
      or leave a multicast group.

   o  The FE PL informs the FE TML of the join or leave request.  The
      FE TML creates a new socket and calls "bind()" to bind the socket
      to the requested multicast group.  The multicast group id is used
      directly as the type field in the bound address.

   o  The FE PL responds to the CE PL, informing it of the status of
      the join or leave request.

   o  If the join or leave request was successful, the CE PL informs
      the CE TML of the update to the multicast group membership.

   o  There is no need for any descriptors to be returned to the PL
      layer at either the FE or the CE.  The PL Layer only uses the
      Multicast FE Id for write calls and specifies the type of message
      (control versus data) to be written.

   o  A tmlWrite() on a unicast FE Id results in a unicast message
      being sent to the FE associated with the channel.  A tmlWrite()
      on a multicast FE Id results in multicast messaging.

   The figures below illustrate multicast scenarios with 2 FEs, FE1 and
   FE2.  In Figure 9, the CE requests FE1 to join a multicast group.
   Although not shown as a separate figure, if FE2 were to join the
   same group, the join procedure would be the same as in Figure 9; it
   would result in the multicast group membership being updated at the
   TML layer on the CE to include FE2 in the group.  In Figure 10, the
   CE requests FE1 to leave the multicast group, which leaves FE2 as
   the only member of the multicast group.

   Multicast Scenario with FE1 joining group: New group created

 FE1 PL         FE1 TML         CE TML           CE PL
   |               |               |               |
   |    MC Grp Join Req (McId)     |               |
   |<--------------|---------------|---------------|  (a)
   | tmlJoin(McId) |               |               |
   |-------------->|               |               |  (b)
   |      McId = {FE1_ChDes}       |               |
   | <-- status    |               |               |
   |               |               |               |
   |   MC Grp Join Rsp (status)    |               |
   |---------------|---------------|-------------->|
   |               |               |               |
   |               |               | tmlJoin(McId) |
   |               |               |<--------------|  (c)
   |               |      McId = {FE1_ChDes}       |
   |               |               |    status --> |
   |               |               |               |

   Notes:

   (a)  CE:PL level multicast group join request sent to each FE:PL
        that needs to be part of a multicast group, McId, where McId
        specifies a multicast group Id at the PL layer.

   (b)  The FE TML updates its multicast group information.

   (c)  The CE TML updates the multicast group membership.  The PL is
        only aware of the PL layer multicast group Id, that is, McId.

                Figure 9: FE Joining Multicast Group

   Multicast Scenario with FE1 leaving group: Group membership updated
   to exclude FE1

 FE1 PL         FE1 TML         CE TML           CE PL
   |               |               |               |
   | MC Grp Leave Req (McId, FE1)  |               |
   |<--------------|---------------|---------------|  (a)
   |tmlLeave(McId) |               |               |
   |-------------->|               |               |  (b)
   |           McId = {}           |               |
   | <-- status    |               |               |
   |               |               |               |
   |   MC Grp Leave Rsp (status)   |               |
   |---------------|---------------|-------------->|
   |               |               |               |
   |               |               |tmlLeave(McId) |
   |               |               |<--------------|  (c)
   |               |      McId = {FE2_ChDes}       |
   |               |               |    status --> |
   |               |               |               |

   Notes:

   (a)  CE:PL level multicast group leave request sent to FE1:PL, which
        needs to be removed from multicast group McId, where McId
        specifies a multicast group Id at the PL layer.

   (b)  The FE TML removes its multicast group information.

   (c)  The CE TML removes FE1 from multicast group McId.  That leaves
        only FE2 in the group.

                Figure 10: FE Leaving Multicast Group

3.1.16  Broadcast Model

3.1.17  Security Considerations

   If the CE and FE are in a single box and the network operator is
   running in a secured environment, TIPC can be run over raw Ethernet
   without any security mechanisms activated.  When the CEs and FEs
   communicate over IP networks or in an insecure environment, we do
   not recommend the use of TIPC for now.

3.2  IANA Considerations

3.3  Manageability

   TBD: What needs to be added here?

4.  References

   [ForCES]   Doria, A., et al., "ForCES Protocol Specification",
              September 2004.
   [RFC2026]  Bradner, S., "The Internet Standards Process -- Revision
              3", BCP 9, RFC 2026, October 1996.

   [RFC2104]  Krawczyk, H., Bellare, M., and R. Canetti, "HMAC:
              Keyed-Hashing for Message Authentication", RFC 2104,
              February 1997.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2406]  Kent, S. and R. Atkinson, "IP Encapsulating Security
              Payload (ESP)", RFC 2406, November 1998.

   [RFC2408]  Maughan, D., Schertler, M., Schneider, M., and J. Turner,
              "Internet Security Association and Key Management
              Protocol", RFC 2408, November 1998.

   [RFC2434]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 2434,
              October 1998.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

   [RFC2581]  Allman, M., Paxson, V., and W. Stevens, "TCP Congestion
              Control", RFC 2581, April 1999.

   [RFC2960]  Stewart, R., Xie, Q., Morneault, K., Sharp, C.,
              Schwarzbauer, H., Taylor, T., Rytina, I., Kalla, M.,
              Zhang, L., and V. Paxson, "Stream Control Transmission
              Protocol", RFC 2960, October 2000.

   [RFC3654]  Khosravi, H., "Requirements for Separation of IP Control
              and Forwarding", RFC 3654, November 2003.

   [RFC3746]  Yang, L., "Forwarding and Control Element Separation
              (ForCES) Framework", RFC 3746, April 2004.

   [RFC768]   Postel, J., "User Datagram Protocol", STD 6, RFC 768,
              August 1980.

   [RFC793]   Postel, J., "Transmission Control Protocol", STD 7,
              RFC 793, September 1981.

   [TIPC]     Maloy, J., "Transparent Inter Process Communication",
              April 2004.

   [TMLAPI]   Salim, J., et al., "ForCES Transport Mapping Layer (TML)
              Service Primitives and Encapsulations",
              draft-jhs-forces-tmlapi-00.txt (work in progress),
              April 2005.

Authors' Addresses

   Jon Paul Maloy
   Ericsson Research Canada
   8400, boul. Decarie
   Ville Mont-Royal, Quebec  H4P 2N2
   Canada

   Phone: +1 514 576-2150
   Email: jon.maloy@ericsson.com

   Jamal Hadi Salim
   Znyx
   195 Staford Road West, Suite 104
   Nepean, ON  K2H 9C1
   Canada

   Phone: +1 613 596-1138
   Email: hadi@znyx.com

   Hormuzd M. Khosravi
   Intel
   2111 NE 25th Avenue
   Hillsboro, OR  97124
   USA

   Phone: +1 503 264-0334
   Email: hormuzd.m.khosravi@intel.com

   Furquan Ansari
   Lucent
   101 Crawford Corner Road
   Holmdel, NJ  07733
   USA

   Phone: +1 732 949-5249
   Email: furquan@lucent.com

   Chawla Suchi
   Intel
   2111 NE 25th Avenue
   Hillsboro, OR  97124
   USA

   Phone: +1 503 712-4539
   Email: suchi.chawla@intel.com

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.
Disclaimer of Validity

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2005).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.