INTERNET-DRAFT                                               Jiwoong Lee
Expires: March 2002                                                  KTF
                                                        September 3 2001

                    Explicit Multicast over Ethernet

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   All remarks may be forwarded to the author's email address or to
   the Xcast IG (Incubation Group) mailing list:
   xcast@public.alcatel.com

CHANGES

   This section briefly lists some of the major changes in this draft
   relative to the previous version of this same draft,
   draft-lee-xcast-ethernet-00.txt:

   - Changed the title of the document.

   - Detailed the introduction section.

   - Allowed single packet transmission in case plural destination IP
     addresses are associated with a single Ethernet address.

   - Added new discussion on Proxy ARP and Proxy Neighbor
     Advertisements in Appendix.

Jiwoong Lee                  Expire Mar 2002                    [Page 1]

INTERNET-DRAFT             Xcast over Ethernet                  Sep 2001

   - Changed expressions slightly.

1. Introduction

   Explicit multicast (Xcast) [XCST] is a new kind of Internet
   multicast that complements the Host Group Model multicast in terms
   of multicast benefits. Xcast encodes the addresses of its
   destinations within the datagram and requires neither routing
   information exchange nor state management.
   Xcast routing utilizes the legacy unicast routing information
   managed at the node and is expected to perform well enough for
   small multicast groups. Since Xcast was originally designed on the
   assumption that the delivery path is made of point-to-point links,
   it suffers from an erroneous delivery problem over conventional
   broadcast media, including Ethernet. This document specifies how to
   deliver an Xcast datagram over Ethernet while eliminating the
   erroneous delivery problem. The problem and the candidate solutions
   are explained in detail in the Appendix.

1.1 Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

   In addition, this document frequently uses the following terms:

   Ethernet
      A link layer technology including CSMA/CD and full-duplex
      subnetworks based on ISO/IEC 8802-3, with various data rates

   Xcast
      Explicit Multicast

   X2M
      Conversion of Xcast to Host Group Model Multicast

   X2U
      Conversion of Xcast to Unicast

   Xcast datagram
      IP datagram which carries a legitimate Xcast header structure

   The definitions of terms that are not defined here can be found in
   the references at the end.

2. Address mapping

   The destination address of an Xcast datagram is a predefined
   link-local multicast address known as All_Xcast_Nodes [XCST]. When
   an Xcast node maps All_Xcast_Nodes into an Ethernet link layer
   address, it does NOT follow the address mapping rule defined in
   [1112][2464][0894]. That is, All_Xcast_Nodes SHOULD NOT be mapped
   into a multicast Ethernet address. Instead, the target link layer
   address MUST be the designated Ethernet address of the next routing
   hop on the Ethernet link on which the source node resides.

3.
Transmitting an Xcast datagram over Ethernet

   If the next hop toward which an Xcast router sends a processed
   Xcast datagram resides on an Ethernet link that the router shares,
   the Xcast router can send the datagram in one of two ways.

3.1 By using unicast Ethernet address

   The target link layer address MUST be the designated Ethernet
   address of the next hop. Note that the Xcast datagram itself does
   not include any information addressing the next hop. Therefore the
   Xcast routing module should have some way to transfer the
   appropriate information to the link layer module so that it can
   acquire the link layer address of the next hop and transmit the
   Ethernet frame to that link layer address.

   If several next hops have different IP addresses but share a single
   Ethernet address, the Xcast router MAY treat them as one next hop
   and transmit a single Xcast datagram. In this case, 'next hops' are
   distinguished not only by IP address but also by Ethernet address.
   This situation occurs when proxy ARP messages [MLAN] or proxy
   Neighbor Advertisements [NADV] are used. See Appendix C for further
   discussion.

3.2 By using IP encapsulation

   Before the Xcast router sends a processed Xcast datagram, it MAY
   perform IP-in-IP encapsulation [IPIP] [TUN6] on it. The destination
   address of the tunnel header MUST be the unicast IP address of the
   next hop. After the encapsulation, the Xcast router handles the
   datagram as a normal outgoing unicast datagram.

3.3 X2M

   When an Xcast router performs X2M on an Xcast datagram, it no
   longer produces an Xcast datagram but a Host Group Model multicast
   datagram. Handling Host Group Model multicast datagrams is outside
   the scope of this document.

4. Receiving an Xcast datagram over Ethernet

   If an Xcast node receives an Xcast datagram encapsulated in an
   Ethernet frame that is addressed to a multicast Ethernet address,
   it MUST discard this datagram.
   From the viewpoint of implementations, the entity that discards the
   incoming Xcast datagram is the network layer module rather than the
   link layer module. The network layer module will successfully
   discard this Xcast packet because there SHOULD be no entry for the
   predefined multicast address, All_Xcast_Nodes, in the Multicast
   Forwarding Information Base of any node.

   If an Xcast node receives an Xcast datagram encapsulated in an
   Ethernet frame that is addressed to a unicast Ethernet address,
   normal Xcast operations SHALL be performed as described in [XCST].

5. Security Considerations

   The addresses of the recipients carried in an Xcast datagram can be
   easily exposed to malicious attackers who are connected to the
   Ethernet links on the delivery path. There are no known means to
   conceal the recipient information.

   As happens in generic transmission of Internet datagrams, a
   malicious node can send false proxy ARP messages or false Neighbor
   Advertisements, which result in mistaken delivery of Xcast packets.

Appendix

   In this section, the rationale for sections 2 and 3 is given.
   Section A describes the routing problem that arises when Xcast is
   applied to an Ethernet link without a specialized delivery
   mechanism. Section B lists the conceivable solutions to the problem
   in A and examines each of them. Section C discusses the effect of
   proxy ARP/proxy Neighbor Advertisements when the protocol of this
   document is applied.

A. Discussion

   The current version of the Xcast basic specification [XCST] assumes
   that all links en route are point-to-point. However, the general
   Internet architecture consists not only of point-to-point links but
   also of shared links. In particular, Ethernet has gained popularity
   and has been widely deployed as a WAN technology as well as a LAN
   technology.
   Some Ethernet technologies that include only two nodes in a segment
   are no different from a normal point-to-point link. One example of
   this kind of configuration is 10 Gigabit Ethernet.

   This document does not deal with the following Xcast topics:

   - Xcast-awareness test of a node on the shared link

   - How to elect a designated Xcast router

   - Interoperability between Xcast and Host Group Model multicast

A.1 Why we need to newly invent how to transmit over Ethernet

   Assume the Xcast basic specification [XCST] is applied to the
   Ethernet link of (Figure A-1) while the conventional kernel
   supports only the Host Group Model multicast. Sender creates an
   Xcast datagram whose List of Addresses includes the addresses of E,
   F and G. When A receives the datagram, it performs Xcast routing as
   specified in [XCST] and generates two Xcast datagrams, #1 and #2.
   #1 is for E and F, and #2 is for G, if no premature X2U was
   applied. Both are addressed to All_Xcast_Nodes. Since they will be
   encapsulated in multicast-addressed Ethernet frames, they are
   intended for B and C respectively but are heard by every node on
   Link X.

   Note that Xcast routers do not perform any multicast tree
   construction or state management. When B receives #2, B does not
   perform a Reverse Path Forwarding check and believes that this
   datagram should be forwarded to C. Therefore B will toss #2 to C
   over Link X. The 'tossed #2' will also be heard by A, and A's
   routing module will start to process it since A cannot distinguish
   the 'tossed #2' from the '#2' that it previously sent. A sends
   another #2 to Link X. B hears 'another #2', and so on. Therefore
   Link X will be flooded with #1, #2 and their descendants until
   their lifetime expires. Datagrams received in duplicate will not be
   discarded en route (not even at end hosts) if no redundancy check
   is applied. For this reason, a new solution must be provided to
   deliver Xcast datagrams over Ethernet.
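
   The ping-pong between A and B described above can be illustrated
   with a toy simulation. The sketch below is not taken from [XCST] or
   from this document: the function simulate(), the next-hop table,
   and the hop limit of 8 are assumptions chosen for illustration. It
   models only datagram #2 (destined to G) on Link X of (Figure A-1),
   where A and B both route toward G via C, and it compares multicast
   framing with unicast framing to the next hop.

```python
from collections import deque

MULTICAST = "multicast"  # stands in for a multicast Ethernet address

# Hypothetical next hops toward G for the nodes attached to Link X.
# A and B both route toward G via C; C reaches G on its other link.
NEXT_HOP_FOR_G = {"A": "C", "B": "C", "C": None}


def simulate(use_unicast, hop_limit=8):
    """Return (frames seen on Link X, copies forwarded toward G)."""
    frames_on_link = 0
    copies_to_g = 0
    # Each queued item is (transmitting node, hop limit of the datagram).
    queue = deque([("A", hop_limit)])
    while queue:
        sender, hops = queue.popleft()
        frames_on_link += 1
        frame_dst = NEXT_HOP_FOR_G[sender] if use_unicast else MULTICAST
        for node in NEXT_HOP_FOR_G:
            if node == sender:
                continue
            if frame_dst != MULTICAST and frame_dst != node:
                continue  # frames for another unicast address are ignored
            if NEXT_HOP_FOR_G[node] is None:
                copies_to_g += 1  # C forwards toward G off Link X
            elif hops - 1 > 0:
                # No Reverse Path Forwarding check: the node routes the
                # datagram again and puts it back on Link X.
                queue.append((node, hops - 1))
    return frames_on_link, copies_to_g


print("multicast framing:", simulate(use_unicast=False))
print("unicast framing:  ", simulate(use_unicast=True))
```

   In this model the multicast-framed datagram is retransmitted on
   Link X until its hop limit runs out, and G is sent one duplicate
   per retransmission, whereas the unicast-framed datagram crosses
   Link X exactly once.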
                   Link X         /------E
                      |---B------D------F
                   #1 |
   Sender-----A-------|
                   #2 |
                      |---C------G

             (Figure A-1) Example network topology

B. Solution candidates

   Various solution candidates have been considered for realizing
   error-free delivery of Xcast datagrams over an Ethernet link. They
   may all work logically in a technical sense. They are:

   a) Building a multiplexed form of Xcast datagram, abiding by the
      RFC1112 rule.

   b) Transmitting an Xcast datagram with one-hop tunneling. The
      tunnel header is unicast-addressed to the next hop.

   c) Building a variant Ethernet frame which carries multiple target
      link layer addresses.

   d) Utilizing the IEEE 802.1Q VLAN tag header to indicate legitimate
      next hops.

   e) Transmitting an Xcast datagram in a unicast Ethernet frame,
      which is destined to the next hop.

B.1 Challenge and conquer

   a) Building the multiplexed form of Xcast datagram, abiding by the
      [1112] rule.

   On an Ethernet link, the source Xcast router does not replicate the
   datagram per next hop. Rather, it inserts a field into the Xcast
   datagram to indicate which nodes are the legitimate next hops and
   which receiver addresses each of them should process. One
   convenient form of this particular datagram is

      [ S:source, D:All_Xcast_Nodes,
        NextHops: N1 for L1 / N2 for L2, List:L1+L2 ]

   where S stands for the Source address field, D for the Destination
   address field, NextHops for an extended version of the List of
   Addresses field, N# for the address of a next hop, and L# for the
   list of addresses that N# should process.

   The link layer module maps the destination address,
   All_Xcast_Nodes, into a multicast link layer address as described
   in [1112], and transmits this frame over Ethernet. The advantage of
   this form is that Xcast can obtain bandwidth utilization at least
   as high as that of the Host Group Model multicast over an Ethernet
   link.
   One obvious obstacle is that this requires modifications to the
   encoding scheme of Xcast, since it inserts a new kind of field:
   NextHops. Worse, the length of this new field can vary hop by hop.
   In IPv6, one 'next hop' requires 16 bytes for the address of the
   next hop and at least one more byte for the list index, which
   increases the overhead. For example, if there are four next hops on
   one Ethernet link, the required additional overhead will be
   ( 16 + 1 ) * 4 = 68 bytes.

   b) Transmitting the Xcast datagram with one-hop tunneling. The
      tunnel header is unicast-addressed to the next hop.

   This seems the easiest solution to implement. The Ethernet modules
   of Xcast nodes do not have to be modified at all. One disadvantage
   is that the multicast bandwidth gain of this scheme is inferior to
   that of the Host Group Model multicast, since the link sees as many
   Ethernet frames as there are next hops. (Note, however, that the
   number of next hops can be much smaller than the number of
   recipients. Therefore the bandwidth gain of this scheme is not the
   same as that of unicasting.) Another possible drawback is the
   performance degradation at Xcast routers due to the repeated IP
   encapsulation / decapsulation.

   c) Building a variant Ethernet frame which carries multiple target
      link layer addresses.

   There are at least (or approximately) 900 independent Ethernet
   vendors in this world [VEND] and this technology has been deployed
   since 1980. Ethernet is already too widespread to be modified.

   d) Utilizing the IEEE 802.1Q VLAN tag header to indicate legitimate
      next hops.

   In 1998 IEEE created the 4-byte VLAN tag header and added it prior
   to the Length/Type field of the Ethernet frame. Since only 4 bytes
   are available, it is not possible to include any Ethernet address
   in this field.
   Rather, this field can be used as an indication bitmap once
   link-layer signaling has shared the information about which nodes
   are the Xcast nodes in that Ethernet segment. Note that this still
   requires another kind of signaling between Ethernet nodes, and
   additional modifications of IEEE standards are required.

   e) Transmitting an Xcast datagram in a unicast Ethernet frame,
      which is destined to the next hop.

   Since the Ethernet frame is destined to a unicast address, the
   frame will be delivered to the intended destination without
   erroneous frame congestion. Since the address mapping process
   belongs to the link layer, it is performed after the Xcast routing
   process defined in [XCST] at transmission. The advantage of this
   scheme is that it requires no extra signaling or state management.
   One major obstacle is that the link layer module cannot deduce the
   link layer address of the next hop from the network layer datagram,
   because every Xcast datagram carries a link-local multicast
   address, All_Xcast_Nodes, in its destination field and carries no
   information regarding the next hop. This implies that an
   implementation of 'Xcast over Ethernet' should have some way to
   deliver the next hop information from the network layer module to
   the link layer module.

C. Proxy ARP and Proxy Neighbor Advertisement

   (Figure C-1) shows an example of a last-mile network. Assume Sender
   sends an Xcast packet addressed to two IP addresses, MN1 and MN2.
   An Xcast router XR, which is the gateway router of Link Y, makes
   two copies of the incoming Xcast datagram, generating Ethernet
   frames #1 and #2. #1 is addressed to _MN1, which is the Ethernet
   address of MN1. #2 is addressed to _MN2, which is the Ethernet
   address of MN2. The Ethernet frames carry the Xcast datagrams
   addressed to MN1 and MN2 respectively.
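
   The per-next-hop framing described above can be sketched as
   follows. This is a minimal illustration under stated assumptions,
   not an implementation of [XCST]: the names EthernetFrame and
   frames_for(), the string placeholders for addresses, and the two
   lookup tables are hypothetical. The point it shows is that the
   routing step passes the next hop to the link layer explicitly,
   since the IP destination field always holds All_Xcast_Nodes.

```python
from dataclasses import dataclass

# Placeholder name; the actual address value is defined in [XCST].
ALL_XCAST_NODES = "All_Xcast_Nodes"


@dataclass(frozen=True)
class EthernetFrame:
    eth_dst: str      # unicast Ethernet address of the next hop
    ip_dst: str       # IP destination field, always All_Xcast_Nodes
    dest_list: tuple  # Xcast List of Addresses carried in this copy


def frames_for(dest_to_next_hop, neighbor_cache):
    """Build one unicast-addressed frame per next hop.

    dest_to_next_hop: destination IP -> next-hop IP (unicast routing)
    neighbor_cache:   next-hop IP -> Ethernet address (ARP or ND)
    """
    grouped = {}
    for dest, next_hop in dest_to_next_hop.items():
        grouped.setdefault(next_hop, []).append(dest)
    # The next hop is handed to the link layer explicitly, because the
    # IP destination field alone cannot identify it.
    return [
        EthernetFrame(neighbor_cache[next_hop], ALL_XCAST_NODES, tuple(dests))
        for next_hop, dests in grouped.items()
    ]


# Figure C-1: MN1 and MN2 are on-link, so each is its own next hop.
frames = frames_for(
    {"MN1": "MN1", "MN2": "MN2"},
    {"MN1": "_MN1", "MN2": "_MN2"},
)
for frame in frames:
    print(frame.eth_dst, frame.ip_dst, frame.dest_list)
```

   Grouping by next-hop IP address keeps the sketch aligned with
   ordinary unicast routing; the link layer address is looked up only
   after the copies have been formed.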
                   Link Y
                      |---HA
                      |
   Sender-----XR------|
                #1 #2 |---MN1
                      |---MN2

           (Figure C-1) While mobile nodes are at home

   If MN1 and MN2 are mobile nodes, they can leave their home link Y
   and register themselves with their home agent, HA (Figure C-2). On
   behalf of MN1 and MN2, HA becomes a proxy node that receives
   incoming packets addressed to MN1 and MN2. To this end, HA sends
   proxy ARP messages in IPv4 or proxy Neighbor Advertisements in IPv6
   to link Y. In response to this link layer information update
   message, XR associates MN1 and MN2 with _HA, the Ethernet address
   of HA. When XR receives an Xcast datagram addressed to MN1 and MN2
   from Sender, it generates two Xcast datagrams and encapsulates each
   of them in an Ethernet frame addressed to _HA. Even though there is
   only one real next hop on the delivery path, the routing module of
   XR generates as many Xcast datagrams as there are nodes for which
   HA acts as proxy.

                   Link Y
                      |---HA
                #1 #2 |
   Sender-----XR------|
                      |
                      |

        (Figure C-2) While mobile nodes are away from home

   This is believed to be fine since the number of datagrams XR sends
   into link Y while the mobile nodes are away from home is equal to
   that while the mobile nodes are at home.

   NOTE: If HA is capable of Xcast routing, the preferable scenario is
   that XR transmits a single Xcast datagram to HA rather than
   multiple ones, because they have the same 'next hop'. However,
   doing so requires that XR route the incoming Xcast datagram based
   on link layer information, namely the mappings of MN1 and MN2 to
   _HA.

References

   [XCST] R. Boivie, Y. Imai, W. Livens, D. Ooms, and O. Paridaens,
          Explicit Multicast Basic Specification, IETF
          draft-ooms-xcast-basic-spec-01.txt, March 2001

   [1112] S. Deering, Host Extensions for IP Multicasting, IETF RFC
          1112, August 1989

   [TUN6] A. Conta and S. Deering, Generic Packet Tunneling in IPv6
          Specification, IETF RFC 2473, December 1998

   [IPIP] C. Perkins, IP Encapsulation within IP, IETF RFC 2003,
          October 1996

   [ETHR] C.
          Spurgeon, Ethernet: The Definitive Guide, O'Reilly &
          Associates, Inc., February 2000

   [0894] C. Hornig, A Standard for the Transmission of IP Datagrams
          over Ethernet Networks, IETF RFC 894, Symbolics Cambridge
          Research Center, April 1984

   [0826] D. Plummer, An Ethernet Address Resolution Protocol, IETF
          RFC 826, Symbolics Cambridge Research Center, November 1982

   [2464] M. Crawford, Transmission of IPv6 Packets over Ethernet
          Networks, IETF RFC 2464, December 1998

   [VLAN] IEEE Std 802.1Q-1998, Virtual Bridged Local Area Networks

   [MLAN] J. Postel, Multi-LAN Address Resolution, IETF RFC 925,
          October 1984

   [NADV] T. Narten, E. Nordmark, and W. Simpson, Neighbor Discovery
          for IP Version 6 (IPv6), IETF RFC 2461, December 1998

   [VEND] http://standards.ieee.org/regauth/oui/oui.txt

Author Address

   Jiwoong Lee
   KTF Advanced Lab
   1321-11 Seocho-Dong Seocho-Ku
   Seoul
   Korea, Republic of

   Phone : +82-2-3488-0416
   Email : porce@ktf.com