INTERNET-DRAFT                                        Michael Smirnov
                                                            GMD FOKUS
Expires April, 02, 1997                            September 28, 1996

        EARTH - EAsy IP multicast Routing THrough ATM clouds

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

To learn the current status of any Internet-Draft, please check the "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).

Abstract

The EARTH could be positioned between MARS [1] and VENUS [2]. This document describes a solution that simplifies the distribution of IP multicast flows over ATM clouds using point-to-multipoint connections. The EARTH solution includes: resolution of IP multicast (Class D) addresses to ATM addresses; support for multicast group management and receiver-initiated quality of service (QoS) specification; and multiple LISs sharing the same physical ATM network. Just as IP addresses are separated into unicast and multicast classes, the EARTH proposal separates logical IP subnets, as an option to speed up implementation. EARTH differs from MARS in three ways: address resolution is performed only for IP Class D addresses; multiple LISs can be served by a single EARTH server; and the EARTH server belongs to a `multicast LIS'. In contrast to the VENUS proposal, EARTH simplifies bypassing Mrouters and requires no coordination of multiple MARSs. This document, which proposes a solution that is not yet completely implemented, is intended to help focus ION efforts.

1.
Introduction

Connection-oriented ATM technology is known to differ in many ways from connectionless IP [3], the address resolution problem being one of the most serious. Moreover, "the service provided by the MARS and MARS Clients are almost orthogonal to the IP unicast service over ATM" [1]. This was the motivation to separate the resolution of IP unicast addresses to ATM addresses from that of IP multicast addresses.

IP unicast ARP is supported fairly well in those ATM products which conform to Classical IP over ATM [4]. The architecture described in [4] fits the needs of unicast but strongly resists deployment of the IP multicast service model [5] over ATM clouds. This resistance is due to the restriction that LISs interwork over routers only, even if these LISs share the same physical ATM network [4].

For unicast communication, interworking between different LISs through routers presents no difficulty; ATM connection establishment is supported by ATM ARP servers, one server per LIS. The content of a unicast ARP table is quasi-permanent. In contrast, the content of a multicast ARP table (e.g. in the MARS proposal) is dynamic and follows receiver-initiated membership registration messages for particular multicast group[s]. Bypassing routers for inter-LIS multicast requires coordination of multiple MARSs along with propagation of join and leave messages to all MARS clients [2], which can congest the network.

However, when IP multicast flows are distributed over an ATM cloud, the "cut through" functionality - bypassing routers - is highly desirable. Cut through will help to achieve better performance in terms of join/leave latency and also provides more flexibility in selecting different QoS levels based on receivers' preferences (section 4).

The evolving IP Multicast over ATM solution (the `MARS model' [1]) retains the Classical IP over ATM model and treats a LIS as a `MARS Cluster'.
Clusters are interconnected by conventional IP Multicast routers (Mrouters). An attempt to achieve LIS boundary cut-through while keeping the Classical IP over ATM model for both unicast and multicast communications has been undertaken in the VENUS model [2]. That document [2] concludes that the approach is too complex to bring reasonable benefits. Instead, "developing solutions for large clusters using a distributed MARS, and single clusters spanning multiple Logical IP Subnets, is a much better focus for ION's work" [2].

This document describes such a solution, which is dubbed `EARTH - EAsy IP multicast Routing THrough ATM clouds'. EARTH has the following components, described in more detail below:

- Multicast Logical IP Subnet (MLIS) `spanning' the whole physical ATM network;
- EARTH server providing IP Class D address resolution to ATM hardware addresses;
- support for a [limited] number of QoS levels for various receivers of the multicast flow;
- support for IDMR protocols in case of crossings of the IP:ATM boundary.

The scope of this document is an `ATM cloud' - a nickname for a physically connected private ATM network with uniform management and signalling throughout the cloud. The hosts are ATM endpoints with IP addresses. ATM interworking with other IP networks is provided by a [set of] egress IP router[s] supporting IP multicast. Each egress IP router is an ATM endpoint of the ATM cloud and has a management assignment to operate as a designated Mrouter for a [set of] particular LISs of the ATM cloud. The ATM cloud is assumed to support point-to-multipoint connections controlled by the root endpoint via signalling.

The Classical IP over ATM model is retained because EARTH makes no changes to IP unicast over ATM. Therefore an EARTH implementation will not disturb any on-going operation of Classical LISs over the ATM cloud.

2.
Multicast Logical IP Subnet

Throughout this document a Logical IP Subnet which conforms to [4] will be called a Classical LIS. The Multicast LIS (MLIS) is a single LIS per physical ATM network which has no particular hosts or routers as its permanent members. The MLIS is shared by all Classical LISs whenever IP multicast flows are distributed using point-to-multipoint ATM connections. The MLIS concept implies that each endpoint should be configured with an IP address for the MLIS. The MLIS could be configured as a private IP network because its scope is address resolution over a single ATM cloud.

Note that IP multicast over ATM using a mesh of point-to-point connections is still possible with this proposal and does not affect MLIS or EARTH server operation. This proposal recognizes that the mesh can be a feasible solution for multicast within one LIS and for a `small' multicast group.

IP multicast was enabled by separating the IP address space. Multicast addresses are borrowed from the previously reserved Class D address space [5,6]. During a multicast session each participating receiver has at least two IP addresses: one, permanent, for regular unicast communication, and a second, temporary, for multicast communication. However, in this situation a receiver is not considered to be a multihomed host because all its multicast communications are controlled/enabled by a multicast router (Mrouter) designated for a particular subnet. (This could be called `indirect multihoming'.) The Mrouter runs IGMP to make forwarding decisions for a particular multicast flow to the [broadcast] subnet and also runs IDMR protocols with other Mrouters for global distribution of the multicast flow. This architecture fails in the case of NBMA subnets at the `bottom' of the multicast distribution tree.

The EARTH solution proposes `direct multihoming' as an option to enable IP multicast over ATM.
(Note that the multihoming proposed above refers only to the interaction of endpoints with the EARTH server.) In the EARTH scenario, during a multicast session each participating receiver has at least three IP addresses: one, permanent, for regular unicast communication; a second, temporary, Class D, for multicast communication; and a third, private, for IP multicast address resolution. The unicast IP address is resolved to an ATM address, as usual, using Classical ATM ARP. The multicast IP address is resolved to a set of ATM addresses using the MLIS.

The EARTH scenario implies two options for host extensions. The first option requires that each ATM endpoint be configured to work with one additional LIS, the MLIS, and supplied with the ATM address of the EARTH server (see section 3). That is, the EARTH server can be seen as the ATM ARP server for the MLIS (with two notes: the MLIS is shared by all endpoints from all Classical LISs, and ATM ARP here considers only IP Class D addresses). The second option requires that the ATM adapter driver be able to distinguish between unicast and multicast addresses. If it needs to resolve a unicast IP address it should contact the regular ATM ARP server; for a multicast address it should contact the EARTH server.

The MLIS scenario makes some assumptions about Mrouters that are directly connected to the ATM cloud. Each Mrouter is assumed to know the ATM address of the EARTH server. Each Mrouter has to be able to distinguish between unicast LISs, i.e. to know which LIS[s] it is designated for. That is, if the first sender to a multicast group belongs to, say, LIS_1, and Mrouter Rtr_A is the designated Mrouter for LIS_1, then Rtr_A is the only gateway through which the multicast flow leaves the ATM cloud and, therefore, enters it. More details can be found in Section 5.

3.
EARTH server behaviour

The EARTH server resolves an IP Class D address to the ATM addresses of those ATM endpoints which have registered with the EARTH server as receivers for a particular multicast session. The information exchange follows the principles of ARP found in [7], applied to NBMA.

A potential receiver of an IP multicast flow, whatever unicast LIS it is located in, sends its registration message to the EARTH server. The message contains the receiver's ATM address, the receiver's IP address, the subnet mask of the receiver's LIS, the target multicast address, and, optionally, a QoS level at which the receiver wants to have this multicast flow.

The EARTH server keeps this membership information in a multicast address resolution table (MART) in the form of a list (member_list) of the following entries:

member_list_entry: = .

The member_list is a heap of member_list_entry elements. There is one member_list per active IP multicast address. Whether an IP multicast address is active is decided by the EARTH server with appropriate ageing functionality. The ageing decision for a particular member_list depends on the communication lifetime between the EARTH server and both senders and receivers.

One member_list_entry is kept for each member of the multicast group. Following the IP multicast model, a sender to the group need not be a member of the group. However, the EARTH server treats the egress Mrouter that is the current [re]sender as a potential receiver for the next sender to the same group (Section 5 provides this algorithm).

In a connection-oriented ATM cloud, new senders to an active IP multicast address will need to establish their own point-to-multipoint connection to the group members. Note: Phase 1 of the UNI 3.1 specification supports only zero return bandwidth, i.e. from Leaves to Root [8, p. 154].
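As an illustration only, the MART described above might be held in memory as sketched below. The draft does not fix the layout of an entry here, so the fields of MemberListEntry are an assumption taken from the contents of the registration message. The class and method names, the keying of entries by ATM address, and the max_idle_seconds ageing limit are likewise hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class MemberListEntry:
    """One registered receiver; fields assumed from the registration message."""
    atm_address: str
    ip_address: str
    subnet_mask: str
    qos_level: Optional[int] = None  # optional in the registration message


@dataclass
class MemberList:
    """Receivers of one active Class D address plus ageing bookkeeping."""
    entries: Dict[str, MemberListEntry] = field(default_factory=dict)
    last_activity: float = 0.0


class Mart:
    """Multicast address resolution table: one member_list per group."""

    def __init__(self, max_idle_seconds: float = 300.0) -> None:
        # Placeholder value: the draft does not specify an ageing timer.
        self.max_idle_seconds = max_idle_seconds
        self.groups: Dict[str, MemberList] = {}

    def register(self, group: str, entry: MemberListEntry, now: float) -> None:
        """Record a receiver's registration for a multicast address."""
        members = self.groups.setdefault(group, MemberList())
        members.entries[entry.atm_address] = entry  # one entry per member
        members.last_activity = now

    def query(self, group: str, now: float) -> List[MemberListEntry]:
        """Answer a sender's query with the current member_list."""
        members = self.groups.get(group)
        if members is None:
            return []
        members.last_activity = now  # sender contact also counts as activity
        return list(members.entries.values())

    def age(self, now: float) -> None:
        """Drop member_lists with no recent sender or receiver contact."""
        stale = [g for g, m in self.groups.items()
                 if now - m.last_activity > self.max_idle_seconds]
        for g in stale:
            del self.groups[g]
```

In this sketch a repeated registration from the same ATM address overwrites the earlier entry, and both registrations and queries refresh the activity time, reflecting the statement that ageing depends on contact with both senders and receivers.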
A sender to the multicast group queries the EARTH server periodically to get the membership information and uses it to manage its point-to-multipoint connection. The EARTH server sends the member_list back to the querying endpoint (sender or Mrouter) as the answer to its EARTH request. A sending endpoint is expected to keep a local cache of the member_list so that it can compare the two and derive the needed changes to the connection.

Multiple senders to the same group in the EARTH scenario must be prevented from crossing the ATM cloud's boundary chaotically. Suppose each sender to the same group used the Mrouter designated for the sender's LIS to advertise/forward its traffic to receivers outside of the ATM cloud. With multiple egress Mrouters this would confuse the IDMR protocols and unbalance the multicast distribution tree. To provide the needed protection the EARTH server implements a single gateway principle (see section 5).

It should be mentioned that the EARTH server is entirely passive: it simply keeps the registration information and answers queries. Conceptually, with regard to IGMP, the EARTH server represents a broadcast medium collapsed to the size of a single point. When, and if, the size of the ATM cloud requires replication of this point, the anycast capability (to contact this distributed EARTH service) could be employed.

4. Support for various QoS levels

The EARTH server can optionally provide a sender to a multicast group with classified membership information. The member_list could be partitioned into several subgroups reflecting receivers' preferences for QoS levels. These QoS levels should be negotiated with the ATM network administration and should be known to all potential receivers.

The classified member list (member_list_qos) is a collection of member_lists, each holding the receivers that requested the same QoS level.
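A minimal sketch of this partitioning, together with the cache comparison a sender is expected to perform, might look as follows. The function names, the reduction of a receiver to an (ATM address, QoS level) pair, and the default_level used when a receiver states no preference are assumptions made for illustration; the draft does not define them.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Set, Tuple

# A receiver is reduced here to (atm_address, qos_level); this pair is an
# illustrative simplification, not the draft's entry layout.
Receiver = Tuple[str, Optional[int]]


def classify(member_list: Iterable[Receiver],
             default_level: int = 0) -> Dict[int, List[str]]:
    """Build member_list_qos: one list of ATM addresses per QoS level."""
    member_list_qos: Dict[int, List[str]] = defaultdict(list)
    for atm_address, qos_level in member_list:
        # Receivers without a stated preference fall into the assumed default.
        level = default_level if qos_level is None else qos_level
        member_list_qos[level].append(atm_address)
    return dict(member_list_qos)


def leaf_changes(cached: Iterable[str],
                 current: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Compare a sender's cached list with a fresh one for one QoS level.

    Returns (leaves to add, leaves to drop) for that level's connection.
    """
    cached_set, current_set = set(cached), set(current)
    return current_set - cached_set, cached_set - current_set
```

A sender would call classify on each answer from the EARTH server and then leaf_changes once per QoS level, so that only the differences are signalled to the network.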
It is assumed that each QoS level is supported with a separate point-to-multipoint connection (ptm_qs); each ptm_qs starts at the switch under the control provided by the sender via signalling. Upon receiving a member_list_qos a sender updates/creates each ptm_qs separately, if needed.

5. Support for IDMR in case of cut through

The EARTH proposal tries to protect IDMR protocols outside the scope of the ATM cloud from being confused by multiple entry/exit points at the boundary of the ATM cloud for the same IP multicast group. This protection is provided by implementing a single gateway principle. From an Mrouter's viewpoint the single gateway keeps the whole ATM cloud as a single subnet, whatever number of LISs it has. However, this is only true for a single IP multicast group, not for a collection of multicast groups - each group can have a different gateway. Note that for unicast communications Classical IP over ATM still applies, with partitioning of endpoints into a number of LISs and with a set of routers designated to these LISs.

The `Single Gateway' principle says that the ATM cloud, whatever LISs and routers it has, should have a single gateway (router) forwarding IP multicast packets to and from the ATM cloud for each multicast session. The suggested implementation is as follows. The single gateway router is determined either by its designation to the LIS where the multicast flow first originated or by the fact that it was the first to forward the multicast flow to the ATM cloud.

In both cases above, `first' denotes that the single gateway Mrouter is determined by the first sender to the active IP multicast group. These cases distinguish between inside-out and outside-in propagation of the multicast traffic with regard to the ATM cloud boundaries. The first case uses the internal partition of the ATM cloud into LISs with designated Mrouters. The second case treats the ATM cloud as a single subnet (without distinctions between LISs).
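The choice of gateway by the first sender might be sketched as follows. This is not a specification: the class SingleGateway, its method names, the sender_lis parameter, and the use of router names in place of ATM addresses are assumptions made for illustration. The designation table is shaped like the Rtr_A/Rtr_B/Rtr_C example of this section.

```python
from typing import Dict, Optional

# Designation of egress Mrouters to Classical LISs (illustrative values,
# shaped like the Rtr_A/Rtr_B/Rtr_C example in this section).
DESIGNATED: Dict[str, str] = {
    "LIS_1": "Rtr_A", "LIS_3": "Rtr_A", "LIS_2": "Rtr_B", "LIS_4": "Rtr_C",
}


class SingleGateway:
    """Remembers, per multicast group, the gateway fixed by the first sender."""

    def __init__(self, designated: Dict[str, str]) -> None:
        self.designated = designated
        self.egress = set(designated.values())
        self.gateway: Dict[str, str] = {}  # group -> gateway Mrouter

    def on_sender(self, group: str, sender: str,
                  sender_lis: Optional[str] = None) -> str:
        """Return the group's gateway, choosing it if this is the first sender."""
        if group not in self.gateway:
            if sender in self.egress:
                # Outside-in: the Mrouter that first forwards the flow.
                self.gateway[group] = sender
            else:
                # Inside-out: the Mrouter designated for the sender's LIS.
                if sender_lis is None:
                    raise ValueError("LIS needed for an internal sender")
                self.gateway[group] = self.designated[sender_lis]
        return self.gateway[group]

    def expire(self, group: str) -> None:
        """Forget the gateway once the session has aged out."""
        self.gateway.pop(group, None)
```

Later senders to the same group receive the gateway already chosen, whichever LIS or Mrouter they come from, until the session expires.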
For example, consider an IP multicast session over an ATM cloud with 4 LISs numbered 1 through 4 and 3 Mrouters - A, B, C - designated as follows: Rtr_A to LIS_1 and LIS_3, Rtr_B to LIS_2, and Rtr_C to LIS_4. If the first sender to a group is external and its flow reaches the ATM cloud via Rtr_B, then Rtr_B will remain the single gateway for the lifetime of the session (depending on the ageing decision of the EARTH server). If the first sender to a group is internal to the ATM cloud and comes from LIS_3, then Rtr_A will remain the gateway for the lifetime of the session, even if other endpoints start sending to the group later from LIS_2 or LIS_4.

How is this information kept by the EARTH server? According to section 3 the EARTH server is queried by senders periodically to get updates about the group membership. We require that EARTH:

- [quasi] permanently keeps a list of the ATM addresses of all egress routers of the ATM cloud and their designation to LISs (egress_list);
- temporarily keeps the ATM address of the current gateway (gateway_atm) for a particular multicast group (for the session's lifetime); gateway_atm is linked to a member_list;
- gets the sender's ATM address (query_atm) from the query.

The session's lifetime is defined by EARTH using the ageing functionality. After the group membership is dismissed, gateway_atm is zeroed.

When the first query arrives for a particular group (i.e.
gateway_atm is equal to zero), EARTH compares query_atm with the content of egress_list. If it finds a match (that is, the first sender to the group is one of the egress Mrouters), it stores this egress router's ATM address in the gateway_atm linked to the member_list and replies with the current member_list. Otherwise (that is, the first sender is a host) it automatically adds to the member_list the ATM address of the egress router designated for the sender's LIS, stores this router's address in gateway_atm, and replies with the modified member_list.

On the arrival of any subsequent query for this group the EARTH server compares query_atm with the content of gateway_atm. If they match (i.e. the current query is a refresh query from the first sender), the server replies with the current member_list and makes no changes to it. Otherwise (i.e. it is a query from a new sender) the server compares query_atm with the content of egress_list. If it finds a match (i.e. the new sender is also an Mrouter), it replies with gateway_atm (redirecting the external flow for the group to the existing gateway); otherwise (the new sender is an internal endpoint) it replies to this new sender with a modified member_list to which gateway_atm has been added.

Comment: the term `sender' elsewhere in the document should be read as `sender or potential sender'; if the Mrouter's query results in an empty member_list then no forwarding of the flow takes place.

6. Applicability Notes (Future Research)

A simulation study of EARTH performance gives promising results. However, a number of open issues require further research. For example, the case of a ptm_qs that includes the egress Mrouter, interworking with public ATM networks, the use of IP switches, and the influence of IDMR protocols and RSVP should be studied more carefully.

7. Scalability

In a large physical ATM network a single EARTH server could become a bottleneck.
In the case of an ATM Forum UNI 4.0 implementation this problem could be handed over to the ILMI, which is supposed to support group ATM addresses. That is, the EARTH server may resolve an IP multicast address to an ATM group address, while changes to the group within the ATM network will be managed by ILMI. Currently two other solutions could be proposed for future research: server synchronisation and server hierarchy.

8. Security Considerations

This document raises no security issues. However, the single gateway principle could be helpful in controlling access to the ATM cloud.

9. Acknowledgements

Acknowledgements to the ion (ipatm) mailing list.

10. References

[1] Armitage, G., Support for Multicast over UNI 3.0/3.1 based ATM Networks, Internet-Draft, February, 1995

[2] Armitage, G., VENUS - Very Extensive Non-Unicast Service, Internet-Draft, July, 1996

[3] Cole, R.G., Shur, D.H., Villamizar, C., IP over ATM: A Framework Document, Internet-Draft, October, 1995

[4] Laubach, M., Classical IP and ARP over ATM, Network Working Group, Request for Comments: 1577, Category: Standards Track, January, 1994

[5] Deering, S., Host Extensions for IP Multicasting, IETF, RFC 1112, August, 1989

[6] Deering, S., "Multicast Routing in a Datagram Internetwork", Ph.D. Thesis, Stanford University, December, 1991.

[7] Postel, J., and J. Reynolds, "A Standard for the Transmission of IP Datagrams over IEEE 802 Networks", STD 43, RFC 1042, USC/Information Sciences Institute, February 1988.

[8] ATM User-Network Interface Specification, The ATM Forum, v.3.1, September, 1994

11. Author's Address

Michael Smirnov
GMD FOKUS
Hardenbergplatz 2, Berlin 10623
Phone: +49 30 25499113
Fax: +49 30 25499202
EMail: smirnow@fokus.gmd.de