INTERNET-DRAFT "Internet Protocol Five Fields - Link Address Resolution Algorithm", Alexey Eromenko, 2016-02-07, expiration date: 2016-08-07 Intended status: Experimental A.Eromenko February 2016 Link Address Resolution Algorithm for Internet Protocol "Five Fields" ===================================== Specification Draft Abstract This document describes how-to correlate between IEEE 802.3 Ethernet addresses and Internet Protocol "Five Fields" (IP-FF) addresses, including unicasts, multicasts and duplicate address detection mechanism, and the procedure to boot-up IPFF stack on Ethernet links. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." Copyright Notice Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction 2. Generating Logical MAC addresses 3. Duplicate Address Detection (DAD) 3.1. FastDAD - Unicast Duplicate Address Detection 3.1.1. FastDAD request example 3.1.2. FastDAD reply example 3.1.3. Bonus: Duplicate MAC address detection 3.2. MultiDAD - Multicast Duplicate Address Detection 3.2.1. MultiDAD request example 3.2.2. MultiDAD reply example 3.3 PowerDAD - Broadcast Duplicate Address Detection 3.3.1. PowerDAD request example 3.3.2. PowerDAD reply example 3.4. SleepyDAD - Passive Duplicate Address Detection 4. Limitation: What we do not detect ? 4.1. Other limitation: First Field Overlapping 4.2. Dot1Q VLANs limitation 5. Booting IPFF stack 5.1. Broadcast Advertisement 6. Dual-Stack operation hints 7. Private MAC addresses on Ethernet networks 8. Mapping of Multicast addresses 9. Theoretical Efficiency vs ARP/NDP Authors' Contacts 1. Introduction This document describes specification for a resolution between IPFF addresses and IEEE 802.3 Ethernet addresses, as well as a Duplicate Address Detection protocol. Other data-link layers may require separate specifications. Minimum Prefix: The minimum prefix on Ethernet interfaces is /10. On Ethernet, instead of resolving a logical IP address to a physical MAC address, we simply generate our own logical MAC addresses as a one-way function from logical IP addresses. This "trick" can be done because physical MAC addresses are actually stored in RAM or a re-writable register in the Ethernet controller driver, and therefore changeable (until the next reboot). 2. Generating Logical MAC addresses One-way function to generate MAC address is to put last 40-bits (four fields) of an IPFF address into a logical MAC container, beginning with 0A:xx:xx: xx:xx:xx So let's take an IP address like this: 382.201.769.25.133 IP: 2nd Field - 3rd Field - 4th field - 5th field 00 1100 1001 - 11 0000 0001 - 00 0001 1001 - 00 1000 0101 Logical MAC address: 2nd Octet - 3rd Octet - 4th Octet - 5th Octet - 6th Octet 0011 0010 - 0111 0000 - 0001 0000 - 0110 0100 - 1000 0101 (binary) 0x 32 - 0x 70 - 0x 10 - 0x 64 - 0x 85 (hexa) So our final logical MAC address will look like: 0A:32:70:10:64:85 This is a one-way function. You can convert any IPFF address to a logical MAC, but not the other way around, which is why minimal IP prefix is /10. 3. Duplicate Address Detection (DAD) Fast Duplicate Address Detection (FastDAD) means by Unicast frames. If a request comes with broadcast layer 2 address, so will reply. This technique somewhat similar to Gratuitous ARP, but not exactly. ICMP packet: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4| CRC32 Checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 8| Type | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + 12| Source MAC address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 16| Code | Reserved | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + 20| Requested IP address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 24| | + IPFF Session ID + 28| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ (bytes) IP Fields: HTL = 1 (Hops-to-Live) Source IP Address = Destination IP Address ICMP Fields: Type: (16-bit) (proposed values; to be assigned by IANA) 16 = DAD request 17 = DAD reply Code: (8-bit) 0 = FastDAD (Unicast) 1 = MultiDAD (Multicast) 2 = PowerDAD (Broadcast) In case of a reply, this code must match a code from the request. Source MAC address (48-bit) - Physical MAC of the computer sending this packet. Requested IP address (50-bit) - the IP address a node wants to assign itself. IPFF Session ID (64-bit) - an additional unique identifier to detect duplicate MAC addresses. Randomly generated value during init. Does not change until stack reboot. Unique per host/VRF, not per interface. Reserved - Initialized to zero on transmission; ignored by receiver. 3.1. FastDAD - Unicast Duplicate Address Detection This is the fastest method, that catches only a subset of the DAD problems. It uses Unicast, and is the most CPU and network efficient. 3.1.1. FastDAD request example Assume PC-"A" has a MAC address: B8:6B:23:7B:C3:79 Assume PC-"B" has a MAC address: B8:6B:23:7B:C3:80 Assume PC-"A" wants to get IP 382.201.769.25.133 Ethernet layer: dst.MAC: 0A:32:70:10:64:85 (Unicast to Logical MAC of target IP;) Derived from 382.201.769.25.133 earlier. src.MAC: B8:6B:23:7B:C3:79 (physical MAC of PC-"A") IP layer: src.IP: 382.201.769.25.133 (same as destination) dst.IP: 382.201.769.25.133 HTL = 1, ToS = 0, Protocol = 1 (ICMP) ICMP layer: Type: 16 = DAD request Code: 0 (FastDAD) IP-FF session ID= AAA src.MAC: B8:6B:23:7B:C3:79 (same as on Ethernet layer) Requested IP: 382.201.769.25.133 3.1.2. FastDAD reply example Ethernet layer: dst.MAC: B8:6B:23:7B:C3:79 (physical MAC of PC-"A") src.MAC: B8:6B:23:7B:C3:80 (physical MAC of PC-"B") IP layer: src.IP: 382.201.769.25.133 (same as destination) dst.IP: 382.201.769.25.133 ICMP layer: Type: 17 = DAD reply Code: 0 (FastDAD) IP-FF session ID= BBB src.MAC: B8:6B:23:7B:C3:80 (same as in Ethernet layer) Requested IP: 382.201.769.25.133 (same as in request) Duplicate IP detected ! (no reply = all good) 3.1.3. Bonus: Duplicate MAC address detection This scenario can be a result from careless clone-deployment of virtual machines. If receiving station has the same physical MAC as received DAD request, we compare IPFF session ID. If it is different, we send a reply: FDAD reply #2 Ethernet layer: dst.MAC: FF:FF:FF:FF:FF:FF (reply sent to Layer 2 broadcast, to make sure it traverses ALL switches) src.MAC: B8:6B:23:7B:C3:79 (physical MAC of PC-"C" & "A") IP layer: src.IP: 382.201.769.25.133 (same as destination) dst.IP: 382.201.769.25.133 HTL = 1, ToS = 0, Protocol = 1 (ICMP) ICMP layer: Type: 17 = DAD reply Code: 0 (FastDAD, due to request being FastDAD code 0) IP-FF session ID= CCC src.MAC: B8:6B:23:7B:C3:79 (same as on Ethernet layer) Requested IP: 382.201.769.25.133 (same as in request) Duplicate IP & physical MAC detected ! (no reply = all good) 3.2. MultiDAD - Multicast Duplicate Address Detection A Multicats Duplicate Address Detection (MultiDAD) is similar to FastDAD, but using multicast addresses, with least significant 10-bits (last field) moved to the Multicast MAC address. The request is always sent to 99.9.0.1.x/40; where "x" is the last dectet of an IP address we are probing for... Reply must be answered to the same address. NOTE: This address mapping is somewhat similar to IPv6 "Solicited-node Multicast Address". In IP-FF this is called "Node Multicast Address". It also allows to catch "First Field Overlap" (FFlap), problems, things like 10.168.1.1.1 and 192.168.1.1.1 sitting on the same Broadcast Domain. This is the recommended mode, that will catch most problems, at a good performance. 3.2.1. MultiDAD request example A classic First Field Overlap problem (FFlap) Assume PC-"A" has a MAC address: B8:6B:23:7B:C3:79 and no IP. Assume PC-"B" has a MAC address: B8:6B:23:7B:C3:80 and IP 192.168.1.2.701 Assume PC-"A" wants to get IP 10.168.1.2.701 Ethernet layer: dst.MAC: 55:55:00:00:06:BD (Node Multicast address) src.MAC: B8:6B:23:7B:C3:79 (physical MAC of PC-"A") IP layer: src.IP: 0.0.0.0.0 (Unspecified) dst.IP: 99.9.0.1.701 (Node Multicast address) HTL = 1, ToS = 0, Protocol = 1 (ICMP) ICMP layer: Type: 16 = DAD request Code: 1 (MultiDAD) IP-FF session ID= AAA src.MAC: B8:6B:23:7B:C3:79 (same as on Ethernet layer) Requested IP: 10.168.1.2.701 3.2.2. MultiDAD reply example Ethernet layer: dst.MAC: 55:55:00:00:06:BD (Node Multicast address) src.MAC: B8:6B:23:7B:C3:80 (physical MAC of PC-"B") IP layer: src.IP: 192.168.1.2.701 (IP of PC-"B") dst.IP: 99.9.0.1.701 (Node Multicast address) HTL = 1, ToS = 0, Protocol = 1 (ICMP) ICMP layer: Type: 17 = DAD reply Code: 1 (MultiDAD) IP-FF session ID= BBB src.MAC: B8:6B:23:7B:C3:80 (same as on Ethernet layer) Requested IP: 10.168.1.2.701 (same as in request) Because the 40-least-significant bits match, a reply MUST be sent. First Field Overlap problem detected ! (no reply = all good) 3.3. PowerDAD - Broadcast Duplicate Address Detection A Powerful Duplicate Address Detection (PowerDAD) is similar to MultiDAD, but using broadcast addresses. PowerDAD allows to catch even more problems, compared to MultiDAD; For example, it may catch a problem of a duplicate physical MAC address, even when computers are using two completely different IP-FF addresses, which may be a result of cloned virtual machines. This mode has CPU performance consequences, and therefore not recommended for general use, except for network debugging. 3.3.1. PowerDAD request example A classic First Field Overlap problem (FFlap) Assume virtual-PC-"A" has a MAC address: B8:6B:23:7B:C3:79 and no IP. Assume virtual-PC-"B" has a MAC address: B8:6B:23:7B:C3:79 and IP 192.168.0.0.1; and this is a clone of virtual PC-"A". Assume PC-"A" wants to get IP 192.168.0.1.2 Ethernet layer: dst.MAC: FF:FF:FF:FF:FF:FF (Broadcast) src.MAC: B8:6B:23:7B:C3:79 (physical MAC of virtual PC-"A") IP layer: src.IP: 0.0.0.0.0 (Unspecified) dst.IP: 99.9.0.0.1 (Broadcast) HTL = 1, ToS = 0, Protocol = 1 (ICMP) ICMP layer: Type: 16 = DAD request Code: 2 (PowerDAD) IP-FF session ID= AAA src.MAC: B8:6B:23:7B:C3:79 (same as on Ethernet layer) Requested IP: 10.168.0.1.2 3.3.2. PowerDAD reply example Ethernet layer: dst.MAC: FF:FF:FF:FF:FF:FF (Broadcast) src.MAC: B8:6B:23:7B:C3:79 (physical MAC of virtual PC-"B") IP layer: src.IP: 192.168.0.0.1 (IP of PC-"B") dst.IP: 99.9.0.0.1 (Broadcast) HTL = 1, ToS = 0, Protocol = 1 (ICMP) ICMP layer: Type: 17 = DAD reply Code: 2 (PowerDAD) IP-FF session ID= BBB src.MAC: B8:6B:23:7B:C3:79 (same as on Ethernet layer) Requested IP: 192.168.0.1.2 (same as in request) Since the source MAC address is an exact duplicate, a reply MUST be sent. Duplicate MAC address detected ! This is despite different IP address, and different Node Multicast address. (no reply = all good) 3.4. SleepyDAD - Passive Duplicate Address Detection In addition to active probing, passive probing may take place with hints from various protocols, such as TCP and ICMP. If a node get multiple packets destined to it, to its logical MAC address, and its IP address, but not originated from it, it is a negative hint of a potential duplicate address. It should start the procedure of PowerDAD. This part is optional. 4. Limitation: What we do not detect ? If there is a duplicate (logical or physical) MAC address, not running IPFF it will go undetected by this algorithm. One idea to address this limitation, is to run IPv4 ARP listener as well as IPv6 Neighbor Discovery Advertisement listener, in parallel to IPFF, in Multi-stack operation. But Multi-stack listener is outside the scope of this spec. Additionally, if after IPFF stack is up-n-running, two broadcast domains with duplicate addresses were subsequently joined, we can't detect it in real-time. Duplicate physical MAC but different IP. NOTE: We don't care, since IPFF uses logical MAC addresses for traffic. We do not detect this scenario. 4.1. Other limitation: First Field Overlapping Because of one-to-one mapping, you cannot have overlapping the 1st field in the same broadcast domain. This is the "FFlap" problem. So something like this will fail badly due to logical MAC collision: 10.168.x.x.x/20 and 192.168.x.x.x/20 networks -or- 10.0.0.0.1 and 11.0.0.0.1 They will map to the same logical MAC address. Connected to the same broadcast domain (read: one switch). You have to be careful and separate such networks by a separate switch, or VLANs, or router ports. This can work with IPv4, but was considered a bad practice there. Won't work with IP-FF. Just keep this in mind, when designing IP-FF networks. 4.2. Dot1Q VLANs limitation A very fundamental limitation of LARA proposal, is that it partially merges Layer 2 (Ethernet) and Layer 3 (IP), and one big limitation resulting from it is that LARA is incompatible with the industry- standard IEEE 802.1q VLANs and IP address overlapping. (which can happen due to NAT or IP-VRF reasons) Worse yet, there is no way to detect such overlapping, because Duplicate Address Detection (DAD) cannot traverse VLANs. Dot1q VLAN Switches have one shared MAC address table, where a possible solution is to have per-VRF MAC address table, plus first field (10-bits). Similar issue applies to IP-VRF extension. Either a redesign of Ethernet VLANs or going to be required, -or- going back to more traditional methods, such as ARP. In it's current form, LARA is incompatible with Ethernet VLANs. 5. Booting IPFF stack When booting an IPFF stack, it must be put into "tentative mode", until DAD procedure is complete. Additionally, IPFF stack must Randomly-generate an IPFF Session ID, and "remember" it during an entire session, as well as its "physical MAC address", to answer DAD requests. Changing an IP address, either statically, via DHCP, or otherwise requires a new DAD procedure. Changing link up/down state also requires a new DAD procedure. It is recommended to periodically restart DAD procedure, say once every 120-240 minutes. Configurable by OS vendor. This partially solves the 2nd limitation. What to do when there is a duplicate address ? If a duplicate address detected during IPFF stack bootup, and address was manually configured, it SHOULD be shutdown, and error MUST reported to the user. (via log, GUI dialog, console, SNMP, syslog, or otherwise) If address was configured via DHCP, a new DHCP request needs to be sent after random delay, asking for the next IP address. If a duplicate address detected after IPFF stack bootup, it MUST be kept running, and error reported to the user. 5.1. Broadcast Advertisement 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4| CRC32 Checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 8| Type | Code | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Type: 20 = Broadcast advertisement. (will be assigned by IANA) Code: 0 If a node has re-connected to a broadcast network (like Ethernet), using the same IP subnet, it MAY send a broadcast advertising itself. This will force all Ethernet switches to learn it's logical MAC address, in case it has moved to a different segment (collision domain), so that all nodes can reach it. This is not needed on cold-boot devices. Source MAC = (logical MAC, derived from IP) Destination MAC = FF:FF:FF:FF:FF:FF (broadcast) Source IP (node's IP address) Destination IP = 0.0.0.0.0 (unspecified address, since this command is directed at Ethernet switches, rather than hosts) 6. Dual-Stack operation hints For working with both IPFF and IPv4/v6, the operating system may either use "promiscuous mode" of an Ethernet controller, to listen to multiple unicast MAC addresses, or move IPv4/v6 to a new logical MAC address, used by IPFF. If an Ethernet controller supports "standard mode" listening to multiple unicast MAC addresses, it should be chosen. 7. Private MAC addresses on Ethernet networks According to various sources on the Internet, the following MAC address ranges are considered private. Also known as "Locally Administered Address Ranges": x2-xx-xx-xx-xx-xx x6-xx-xx-xx-xx-xx xA-xx-xx-xx-xx-xx xE-xx-xx-xx-xx-xx Additionally all MAC addresses beginning with odd number in the first octet, are Multicast or Broadcast MAC addresses, and are all private. 8. Mapping of Multicast addresses Multicasts in IPFF begin with 99.9.x.x.x/20 Multicast MAC addresses must have first octet number odd. MAC addresses in IPFF will get 55:55:xx:xx:xx:xx (32 bits for nodes) Only 30 least significant bits will be mapped directly, and first 20 bits ignored. This is called a "Link Multicast Group"; LMG for short. Example: 99.9.0.0.4 (DHCP clients; our "Silent Multicast Listeners") -- all will get a "Link Multicast Group" MAC address of: 55:55:00:00:00:04 Because IGMP advertisement is not used for "Silent listeners", smart switches cannot do IGMP snooping, and will have to flood such packets on all ports, like broadcast. But a node's Ethernet controller, in "standard mode", can filter unnecessary traffic, without interrupting the CPU, gaining efficiency. 9. Theoretical Efficiency vs ARP/NDP LARA (Link Address Resolution Algorithm) is much more efficient than IPv4 ARP or IPv6 NDP; This is because efficiency of LARA has overhead of (O) [source nodes] while ARP/NDP have overhead of (O)^2. [source-destination pairs] O = amount of Objects (nodes) in the Ethernet broadcast domain. And the packets sent are due to DAD algorithm, not discovery per-se. Authors' Contacts Alexey Eromenko Israel Skype: Fenix_NBK_ EMail: al4321@gmail.com Facebook: https://www.facebook.com/technologov INTERNET-DRAFT Alexey expiration date: 2016-08-07