Network Working Group                                  E. Levy-Abegnoli
Internet-Draft                                             Cisco Systems
Intended status: Standards Track                            June 2, 2009
Expires: December 4, 2009


                 Preference Level based Binding Table

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 4, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document.

Abstract

   A trusted database, located on the first switch and storing the
   bindings between end-nodes' link-layer addresses (LLA) and their
   IPv6 addresses, would be an essential part of source address
   validation.  To build such a database, one must:

   1.  Identify the sources of information.

   2.  Specify how the information is maintained.

   3.  Specify how collisions are resolved.

   Solutions may differ in one or more of these elements.  While also
   getting its binding data from NDP, this draft proposes an
   alternative to the "first-come, first-served" approach [fcfs] by
   specifying a preference algorithm to deal with collisions.  Instead
   of the simplistic first-come, first-served collision handling, the
   proposed algorithm relies on the following criteria to choose
   between two colliding entries:

   o  Where the entries were learnt from (access port, trunk port,
      etc.)

   o  Credentials carried by the entries (CGA proof, certificate,
      MAC/LLA match, etc.)

   o  State of the current entry

   o  Age of the entry

   Since the state of an entry is one of the elements of the algorithm,
   this draft also describes a tracking mechanism to maintain entries
   in states where the preference algorithm can enable end-node
   movement.

Table of Contents

   1.  Introduction
   2.  Goals and assumptions
     2.1.  Definitions and Terminology
     2.2.  Scenarios considered
   3.  Source of information
   4.  Binding table
     4.1.  Data model
     4.2.  Entry preference algorithm
       4.2.1.  Preference Level
       4.2.2.  Entry update algorithm
       4.2.3.  Enabling slow movement
     4.3.  Binding entry tracking
     4.4.  Binding table state machine
   5.  Configuration
     5.1.  Switch port configuration
     5.2.  Binding table configuration
   6.  Bridging NDP traffic
     6.1.  Bridging DAD NS
     6.2.  Bridging other NDP messages
   7.  Normative References
   Appendix A.  Contributors and Acknowledgments
   Author's Address
1.  Introduction

   To populate the first-switch binding table, this document proposes a
   scheme based on NDP snooping, and introduces a preference level
   algorithm to deal with collisions.  It is organized as follows:

   o  Section 3 describes the source of information, and Section 4.1
      the binding table data model.  While the proposed approach would
      fit multiple sources of bindings, such as Neighbor Discovery
      Protocol (NDP) snooping, DHCP (snooping), MLD (snooping) and
      static entries, this document focuses on NDP.  Section 6 details
      how an L2-switch can leverage NDP DAD messages to populate its
      table.

   o  Entries' lifecycle must be fully controlled by the switch.
      Section 4.4 details how this is achieved.  This includes a
      mechanism to test entries' reachability, described in
      Section 4.3.

   o  The resolution of collisions is detailed in Section 4.2.

2.  Goals and assumptions

   The primary goal of the proposed approach is for the layer-2 switch
   to maintain an accurate view of the nodes attached to it, directly
   or via another layer-2 switch.  This view is referred to as the
   switch "binding table".  The following goals are also pursued:

   o  Enable (slow) node movement

   o  Prevent binding address spoofing

   The binding table includes the nodes' IPv6 address, link-layer
   address, and the switch port they were learnt from, whether an
   access port or a trunk port (port to another switch).  This binding
   table is the keystone to detect and arbitrate collisions.  It also
   brings a couple of interesting by-products: it can provide some
   address spoofing mitigation, and it can be used to limit multicast
   traffic forwarding.

2.1.  Definitions and Terminology

   The following terminology is used:

   plb-switch:  A switch that implements the algorithms described in
      this draft.

2.2.  Scenarios considered

   Three main scenarios are considered in this document:

   1.  Scenario A: a plb-switch connected to a set of L3-nodes, whether
       hosts or routers.

        +------+
        |HostA +-----------------+
        +------+                 |
                                 |
                           +-----+------+
        +------+           |            |
        |HostB +-----------+  SWITCH_A  |
        +------+           |            |
                           +-----+------+
                                 |
        +------+                 |
        |HostC +-----------------+
        +------+

   2.  Scenario B: a plb-switch SWITCH_A connected to L3-nodes and to
       another plb-switch SWITCH_B.

        +------+                                             +------+
        |HostA +-----------------+             +-------------+HostD |
        +------+                 |             |             +------+
                                 |             |
                           +-----+------+   +--+---------+
        +------+           |            |   |            |  +------+
        |HostB +-----------+  SWITCH_A  +---+  SWITCH_B  +--+HostE |
        +------+           |            |   |            |  +------+
                           +-----+------+   +--+---------+
                                 |             |
        +------+                 |             |             +------+
        |HostC +-----------------+             +-------------+HostF |
        +------+                                             +------+

   3.  Scenario C: a plb-switch SWITCH_A connected to L3-nodes and to a
       non plb-switch SWITCH_B.

3.  Source of information

   Basically, the following sources of data can fill the table:

   o  Neighbor Discovery, address initialization: when a node performs
      address initialization, it sends a DAD (Duplicate Address
      Detection) Neighbor Solicitation (NS) message.  The procedure is
      described in RFC 4861 [RFC4861] and RFC 4862 [RFC4862].  This
      message does not contain a link-layer address option.  However,
      the switch can find out the MAC address the DAD NS was sent from,
      as well as the port it was received on, and use that to create an
      entry in the binding table.  It can also issue its own DAD NS to
      the sender to trigger it to send a Neighbor Advertisement (NA)
      carrying the binding information needed.  Note that even nodes
      which get their address from DHCPv6 should perform DAD to
      validate it.  Quite commonly, upon finishing address
      initialization, a node will send an unsolicited NA (to all-nodes)
      to announce the address.  The switch can learn the binding from
      this message as well.

   o  Neighbor Discovery, address resolution: during the address
      resolution exchange, the owner of an address is going to announce
      the binding with its link-layer address.  This information can be
      seen by the switch, and used to fill the binding table.

   o  Neighbor Discovery, other messages: many NDP messages carry the
      binding between the IPv6 layer-3 address and the link-layer
      address in a Source Link-Layer Address (SLLA) option.  These
      messages can also be snooped by the layer-2 switch to learn the
      binding.

   Note that the binding information can also be learnt from other
   protocol sources such as DHCP, or even be configured statically on
   the switch.  It is outside the scope of this document to detail how
   this would be performed.  However, binding table entries learnt by
   non-NDP methods might collide with entries learnt via NDP snooping,
   and Section 4.2 describes how to prefer one entry over another.
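   As a non-normative illustration, the following Python sketch shows
   how a switch might derive a candidate binding from a snooped NDP
   message.  The message fields and helper names are hypothetical, not
   taken from any specification:

      # Illustrative sketch only: 'msg' fields are assumed to be
      # pre-parsed by a hypothetical NDP decoder.

      UNSPECIFIED = "::"

      def candidate_binding(msg, port, vlanid):
          """Derive a candidate (v6addr, lla, port, vlanid) binding
          from a snooped NDP message, or None."""
          if msg.icmp_type == "NS" and msg.src == UNSPECIFIED:
              # DAD NS: the claimed address is in the target field and
              # the message carries no SLLA option, so fall back to
              # the source MAC of the frame.
              return (msg.target, msg.smac, port, vlanid)
          if msg.slla is not None:
              # Any other NDP message carrying an SLLA option binds
              # its source address to the advertised link-layer
              # address.
              return (msg.src, msg.slla, port, vlanid)
          # No binding information to learn from this message.
          return None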
4.  Binding table

   A table is maintained on the switch(es) that binds the layer-3
   (IPv6) address to the link-layer address (MAC).

4.1.  Data model

   A record of the binding table should contain the following
   information:

   o  v6addr: layer-3 address

   o  zoneid: the zone identifier

   o  port: layer-2 interface from which the entry was learnt

   o  vlanid: VLAN identifier the address belongs to

   o  lla: link-layer address (MAC)

   o  preflevel: preference level for this entry

   o  state: entry state

   o  lifetime: lifetime of the entry

   o  timestamp: time of last update

   A global scope address should be unique across ports and VLANs.  A
   link-local scope address is unique within a VLAN.  Therefore, the
   database is a collection of l3-entries, keyed by ipv6-address and
   zoneid.  A zoneid is a function of the scope of the address (LINK-
   LOCAL, GLOBAL) and the vlanid:

   o  for scope GLOBAL, zoneid = 0

   o  for scope LINK-LOCAL, zoneid = vlanid

   A collision between an existing entry and a candidate entry occurs
   when the two entries have the same v6addr and zoneid.  These fields
   are referred to as the "key".  The fields of an entry other than the
   key (port, vlanid, lla, etc.) are referred to as attributes.
   Changing the attributes of an entry requires complying with the
   entry update algorithm described in Section 4.2.
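   A minimal sketch of the keying rules above, assuming a plain Python
   dictionary as the table (names are illustrative only):

      SCOPE_GLOBAL = "GLOBAL"
      SCOPE_LINK_LOCAL = "LINK-LOCAL"

      def zoneid(scope, vlanid):
          # A global address is unique across VLANs (zoneid = 0); a
          # link-local address is only unique within its VLAN.
          return 0 if scope == SCOPE_GLOBAL else vlanid

      binding_table = {}   # (v6addr, zoneid) -> entry attributes

      def lookup(v6addr, scope, vlanid):
          return binding_table.get((v6addr, zoneid(scope, vlanid)))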
4.2.  Entry preference algorithm

4.2.1.  Preference Level

   The preference level (preflevel) is an attribute of an entry in the
   binding table.  It is set when the entry is learnt, based on where
   the entry was learnt from, the credentials associated with it, and
   other criteria to be defined.  The preflevel is used to arbitrate
   between two candidate entries (with identical keys) in the binding
   table.  The higher the preference level, the more preferred the
   entry.

   One of the key elements of the preflevel associated with an entry is
   the port it was learnt from.  For example, an entry would have
   different preflevels if it is learnt from:

   o  An access port: it typically attaches to end-nodes

   o  A trunk port: it attaches to a non plb-switch

   o  A trusted access port: it attaches to trusted end-nodes

   o  A trusted trunk: it attaches to another plb-switch

   Another important element is the credentials associated with this
   learning.  An entry could be associated with a cryptographic proof
   (CGA), and/or the LLA learnt could match the source MAC of the frame
   from which it was learnt.  The following preflevel values have been
   identified (from lowest to highest):

   o  LLA_MAC_MATCH: the LLA (found in the NDP option) and the MAC
      (found at layer 2) are identical

   o  TRUNK_PORT: the entry was learnt from a trunk port (connected to
      another switch)

   o  ACCESS_PORT: the entry was learnt from an access port (connected
      to a host)

   o  TRUSTED_ACCESS: the entry was learnt from a trusted port

   o  TRUSTED_TRUNK: the entry was learnt from a trusted trunk

   o  DHCP_ASSIGNED: the entry is assigned by DHCP

   o  CGA_AUTHENTICATED: the entry is CGA authenticated, per [RFC3972]

   o  CERT_AUTHENTICATED: the entry is authenticated with a certificate

   o  STATIC: this is a statically configured entry, per [RFC3971]

   An entry can sum up preference values; for instance, it could be
   TRUNK_PORT + LLA_MAC_MATCH.  However, the preference level values
   should be encoded in such a way that the sum of preferences 1 to N-1
   is smaller than preference N.  For example:

   o  An entry learnt from a trunk port with matching LLA/MAC would
      have a bigger preflevel than one simply matching LLA/MAC.

   o  However, an entry learnt from an access port with matching
      MAC/LLA would have a smaller preflevel than an entry learnt from
      a trusted port.

4.2.2.  Entry update algorithm

   Once an entry is installed in the binding table, its attributes
   cannot be changed without complying with this "entry update
   algorithm".  The algorithm proceeds from rule 1 to rule 5, in that
   order, until one rule is satisfied:

   1.  Updating an entry is allowed if the preflevel carried by the
       change is bigger than the preflevel stored in the entry.

   2.  Updating an entry is denied if the preflevel carried by the
       change is smaller than the preflevel stored in the entry.

   3.  Updating an entry in state INCOMPLETE is denied if the change is
       not associated with the port this entry was first learnt from.

   4.  Updating an entry is denied if the preflevel carried by the
       change is equal to the preflevel stored in the entry, and the
       entry is in state REACHABLE or VERIFY (see Section 4.4).

   5.  Updating an entry is allowed otherwise.
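   The encoding requirement of Section 4.2.1 (the sum of all lower
   preference values stays below the next one) can be satisfied with
   power-of-two values, as in this non-normative Python sketch; the
   rule numbering follows the list above, and the entry attributes are
   assumed for illustration:

      # Power-of-two encoding: sum of all levels below N < level N.
      LLA_MAC_MATCH      = 0x001
      TRUNK_PORT         = 0x002
      ACCESS_PORT        = 0x004
      TRUSTED_ACCESS     = 0x008
      TRUSTED_TRUNK      = 0x010
      DHCP_ASSIGNED      = 0x020
      CGA_AUTHENTICATED  = 0x040
      CERT_AUTHENTICATED = 0x080
      STATIC             = 0x100

      def update_allowed(entry, new_pref, new_port):
          if new_pref > entry.preflevel:                    # rule 1
              return True
          if new_pref < entry.preflevel:                    # rule 2
              return False
          if entry.state == "INCOMPLETE" and new_port != entry.port:
              return False                                  # rule 3
          if entry.state in ("REACHABLE", "VERIFY"):        # rule 4
              return False
          return True                                       # rule 5

   For example, TRUNK_PORT + LLA_MAC_MATCH (0x003) stays below
   ACCESS_PORT (0x004), and ACCESS_PORT + LLA_MAC_MATCH (0x005) stays
   below TRUSTED_ACCESS (0x008), matching the ordering examples given
   in Section 4.2.1.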
4.2.3.  Enabling slow movement

   It is quite a common scenario that an end-node moves from one port
   of the switch to another one, or to a different switch.  It is also
   possible that the end-node updates its hardware and starts using a
   different MAC address.  There are two paradoxical goals for the
   trusted binding table: ensuring entry ownership and enabling
   movement.  The former drives the locking of the address, MAC, and
   port altogether, and prevents updates other than on the basis of
   preference; it also works a lot better when the entry lifetime is
   very long or infinite.  The latter requires that a node can easily
   move from one port to another, or from one MAC to another.
   Enforcing address ownership will tend to lead to rejecting any
   movement and classifying it as an attack.

   The algorithm described in Section 4.2.2, combined with the
   capability to manage entry states reviewed in Section 4.4, enables
   end-nodes to move from one switch port to another (or from one MAC
   to another) under three scenarios:

   1.  The node disconnects from its original port at least for T1 (T1
       is configurable, as described in Section 5), and the move does
       not lead to a less preferred entry.

   2.  The node disconnects at least for T3 (T3 is also configurable).

   3.  The entry seen after the node moves is preferred, for instance
       because the node moved from an ACCESS_PORT to a TRUSTED_PORT.

   Note that movement driven by T1 is tied to the accuracy of the
   REACHABLE state.  Maintaining this state with the entry tracking
   mechanism described in Section 4.3 is going to make it a lot more
   efficient.

4.3.  Binding entry tracking

   In order to maintain an accurate view of the devices' location and
   state, which is a key element of the binding table entry preference
   algorithm, an entry tracking mechanism can be enabled.  The tracking
   of entries is performed on a per-port, per-IPv6-address basis, by
   "layer-2 unicasting" a DAD NS on the port the address was first
   learnt from, to the destination MAC (DMAC) known to be bound to that
   address.  The DMAC can be learnt from the LLA option carried in some
   NDP messages, configured statically, or, as a last resort, taken
   from the source MAC (SMAC) address of NDP messages referring to that
   address.  For NDP messages not sourced with the UNSPECIFIED address,
   the address referred to is the source address of the message; for a
   DAD NS, it is the target address.
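   The DMAC selection described above could be rendered as follows;
   this is a non-normative sketch, and both the entry attributes and
   the send_dad_ns() callback are hypothetical:

      def probe_dmac(entry):
          """Pick the destination MAC for the layer-2 unicast DAD NS
          probe, in decreasing order of preference."""
          if entry.lla_from_option is not None:
              return entry.lla_from_option  # LLA option in an NDP msg
          if entry.static_mac is not None:
              return entry.static_mac       # statically configured
          return entry.last_smac            # last resort: frame SMAC

      def track(entry, send_dad_ns):
          # send_dad_ns(port, dmac, target) is assumed to build a DAD
          # NS for 'target' and unicast it at layer 2 on that port.
          send_dad_ns(entry.port, probe_dmac(entry), entry.v6addr)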
4.4.  Binding table state machine

   The entry lifecycle is driven by the switch, not by NDP.  This is
   especially important to ensure that entries are kept in the table as
   long as needed, rather than following the rules of the ND cache,
   which are dictated by other requirements.  Typically, an entry is
   created INCOMPLETE, moves to REACHABLE when the binding is known,
   moves back and forth between REACHABLE and VERIFY if tracking is
   enabled, and at some point moves to STALE when the device (the
   address owner) stops talking on the link.  The entry can stay in
   that state for a very long time, sometimes forever, depending on the
   configuration (see Section 5).

   Four states are defined:

   1.  INCOMPLETE: an entry is set in this state when it does not have
       the L3/L2 binding yet.  This happens when an entry is created
       without the LLA.  Typically, such an entry is created when an
       end-node coming up sends a DAD NS to verify address uniqueness
       (a DAD NS does not carry an SLLA option).  Creating an entry in
       that state still requires an L3 address, found in the target
       field of the DAD NS, or in the source field for any other
       message.  While the entry is created INCOMPLETE, the switch
       waits T0 to avoid collision.  Then it unicasts a DAD NS on the
       port where the first message was seen, to the SMAC address found
       in the received frame.  In the absence of a response, the DAD NS
       is retried every T0, up to R0 times.  There are three ways to
       get out of that state:

       *  After R0 retries without seeing a response, the entry is
          deleted.

       *  A response is received, carrying an SLLA option.  The entry
          moves into REACHABLE.

       *  The LLA is received in any other message seen on that port.
          The entry moves into REACHABLE.

   2.  REACHABLE: as soon as the LLA is learnt, the entry moves to
       REACHABLE and, if tracking is enabled, a timer T1 is started
       (see Section 5).  Upon T1 expiration, the entry moves into the
       VERIFY state.  If tracking is not enabled, the entry remains at
       most T1 in that state without any reachability hint (obtained
       via NDP inspection or other features) before moving to STALE.

   3.  VERIFY: in this state, a binding (L3/L2) is known but must be
       verified.  A DAD NS is unicast to the L3/L2 destinations and a
       timer T2 is started.  There are two ways to get out of that
       state:

       *  T2 expires: the entry is moved to STALE after R2 retries.

       *  An NA is received: the entry can move back to REACHABLE.

   4.  STALE: when getting into that state, a timer T3 is started,
       based on the configuration (see Section 5).  Upon expiry, the
       entry is deleted.

   The binding table state machine can be summarized by the following
   transitions:

   State        Event  Action                     New state
   -----------  -----  -------------------------  ----------
   INCOMPLETE   T0     send DAD NS, increment r0  INCOMPLETE
   INCOMPLETE   R0     delete entry               -
   INCOMPLETE   E1     -                          REACHABLE
   REACHABLE    T1     -                          VERIFY
   VERIFY       T2     send DAD NS, increment r2  VERIFY
   VERIFY       E1     -                          REACHABLE
   VERIFY       R2     -                          STALE
   STALE        E1     -                          REACHABLE
   STALE        T3     delete entry               -

   The following events drive the state transitions:

   o  E1: a link-layer address (LLA) was received for the L3 address.

   o  T0: timer expired.  Time an entry waits for any binding message
      (NA, etc.) in INCOMPLETE state before another NS is sent, up to
      INCOMPLETE_MAX_RETRIES times.

   o  T1: timer expired.  Time an entry stays in REACHABLE state before
      it starts being verified (polled) or moves to STALE.

   o  T2: timer expired.  Time an entry waits for any binding message
      (NA, etc.) in VERIFY state before another NS is sent, up to
      VERIFY_MAX_RETRIES times.

   o  T3: timer expired.  Time an entry is left in STALE state until it
      is deleted or a binding message is received.

   o  R0: exhaustion of INCOMPLETE_MAX_RETRIES.

   o  R2: exhaustion of VERIFY_MAX_RETRIES.

   Default values are as follows:

   o  T0: 3 seconds

   o  T1: 300 seconds

   o  T2: 10 seconds

   o  T3: 24 hours

   o  INCOMPLETE_MAX_RETRIES: 3

   o  VERIFY_MAX_RETRIES: 3

   It should be possible to override all default values by
   configuration.
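   One non-normative way to render the timer-driven part of this state
   machine in Python, with timer scheduling and packet transmission
   abstracted away behind the returned action strings:

      def on_timer(entry, timer, retries, limits):
          """Handle a timer expiry; return the action to perform."""
          if entry.state == "INCOMPLETE" and timer == "T0":
              if retries >= limits["INCOMPLETE_MAX_RETRIES"]:   # R0
                  return "DELETE"
              return "SEND_DAD_NS"        # retry, stay INCOMPLETE
          if entry.state == "REACHABLE" and timer == "T1":
              entry.state = "VERIFY"      # tracking on: start probing
              return "SEND_DAD_NS"
          if entry.state == "VERIFY" and timer == "T2":
              if retries >= limits["VERIFY_MAX_RETRIES"]:       # R2
                  entry.state = "STALE"
                  return None
              return "SEND_DAD_NS"        # retry, stay in VERIFY
          if entry.state == "STALE" and timer == "T3":
              return "DELETE"
          return None

      def on_lla_learnt(entry):
          # Event E1: a binding message carrying the LLA was received.
          entry.state = "REACHABLE"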
5.  Configuration

5.1.  Switch port configuration

   Qualifying a port of the switch is of primary importance to
   influence the entry update algorithm (see Section 4.2).  The switch
   configuration should allow the following values to be configured on
   a per-port basis:

   o  TRUNK_PORT: the port of the switch is connected to another switch
      port that is not a plb-switch.

   o  ACCESS_PORT: the port of the switch is connected to an end-node.

   o  TRUSTED_PORT: the port of the switch is connected to a trusted
      end-node.

   o  TRUSTED_TRUNK: the port of the switch is connected to another
      plb-switch.

5.2.  Binding table configuration

   The following elements, acting on the binding table behavior, should
   be configurable, globally or on a per-port basis:

   1.  T0: (global) the interval at which the switch unicasts DAD NS
       messages to obtain an INCOMPLETE entry's link-layer address.
       Default is 3 seconds.  Associated configuration element:

       *  INCOMPLETE_MAX_RETRIES (R0), the maximum number of NS
          messages sent by the switch before deleting the entry.
          Default is 3.

   2.  T1: (per-port) the maximum reachable lifetime, i.e., the time an
       entry is kept in REACHABLE without sign of activity before
       transitioning to VERIFY (if tracking is on) or STALE otherwise.
       T1 may be set to "infinite".  Default value is 300 seconds.

   3.  Tracking on/off: (per-port) when turned on, it enables the
       tracking of entries in the binding table.  Reachability of
       entries is then tested every T1 by unicasting (at layer 2) a DAD
       NS (unless reachability is established indirectly by NDP
       inspection).  Associated configuration elements:

       *  T2: (global) the verify-interval, i.e., the waiting time
          between re-sending the DAD NS, up to R2 times.  Default value
          for T2 is 10 seconds.

       *  VERIFY_MAX_RETRIES (R2), the maximum number of DAD NS
          messages the switch will unicast to the entry owner before
          moving the entry to STALE.  Default value for R2 is 3.

   4.  T3: (per-port) the maximum stale lifetime, i.e., the time an
       entry is kept in STALE without sign of activity before being
       deleted from the binding table.  T3 may be set to "infinite".
       Default value is 24 hours.
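   The configuration elements above could be modeled as follows; this
   is a non-normative sketch, with None standing for "infinite":

      from dataclasses import dataclass
      from typing import Optional

      @dataclass
      class BindingTableConfig:
          # Global timers and retry limits
          t0_seconds: int = 3              # DAD NS interval, INCOMPLETE
          incomplete_max_retries: int = 3  # R0
          t2_seconds: int = 10             # verify-interval
          verify_max_retries: int = 3      # R2

      @dataclass
      class PortConfig:
          # Per-port elements; None models "infinite"
          t1_seconds: Optional[int] = 300        # max REACHABLE lifetime
          t3_seconds: Optional[int] = 24 * 3600  # max STALE lifetime
          tracking: bool = False
          port_type: str = "ACCESS_PORT"   # TRUNK_PORT, TRUSTED_*, ...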
6.  Bridging NDP traffic

   One important aspect of an "NDP-aware" switch is to efficiently
   bridge NDP traffic to its destinations.  In some areas, the switch
   might behave differently from a regular non plb-switch:

   1.  When intercepting an NDP message carrying binding information,
       the switch can look up its binding table, decide the message is
       not worth bridging, and drop it.  This may be the case when a
       binding entry already exists and is not consistent with the one
       being received.

   2.  When the received message is a DAD NS for a target for which the
       switch has a pending (INCOMPLETE) entry, received from a
       different port, the switch may decide to drop it.  If it came
       "second", in the (small) window during which the switch is
       attempting to track the entry, this suggests an attack.

   3.  When intercepting a multicast NDP message, such as a DAD NS, for
       which it already has an entry in its binding table, the switch
       may decide to forward it only to the target owner.

   4.  When receiving a DAD NS or other multicast NDP messages, a
       switch enabled for MLD snooping might decide to prevent the
       bridging of the message on trunk ports to other switches (based
       on the MLD reports received on these ports).  The plb-switch
       however may decide to force a copy of these messages onto these
       trunks, to ensure the other switch is able to populate its own
       binding table.  This behavior should be configurable on a per-
       port basis.

   The general bridging algorithm is as follows.  When an NDP message
   is received by the layer-2 switch, the switch extracts the link-
   layer information, if any.  If no LLA is provided, the switch should
   bridge the message normally to its destination.  If an LLA is
   provided, the switch can look up the corresponding entry in its
   binding table.  If no entry is found, it creates one, and bridges
   the message normally.  If an entry is found with attributes
   consistent with the ones received (port, zoneid, etc.), it should
   bridge the message normally.  If the attributes are not consistent,
   and a change is allowed (see Section 4.2), it should update the
   attributes and bridge the message.  If the change is disallowed, it
   should drop the message.
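   A non-normative sketch of this bridging decision, reusing the
   helpers sketched in Section 3 and Section 4.2.2 and assuming a
   'table' object with hypothetical lookup/create/update methods:

      def bridge_decision(table, msg, port, vlanid):
          """Return "BRIDGE", "UPDATE_AND_BRIDGE" or "DROP"."""
          binding = candidate_binding(msg, port, vlanid)  # Section 3
          if binding is None:
              return "BRIDGE"             # no LLA: bridge normally
          entry = table.lookup(binding)   # keyed by (v6addr, zoneid)
          if entry is None:
              table.create(binding)       # learn, then bridge normally
              return "BRIDGE"
          if table.consistent(entry, binding):
              return "BRIDGE"             # same port, zoneid, lla
          if update_allowed(entry, table.preflevel(binding),
                            binding[2]):  # port is the third field
              table.update(entry, binding)       # per Section 4.2.2
              return "UPDATE_AND_BRIDGE"
          return "DROP"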
6.1.  Bridging DAD NS

   Bridging DAD NS messages is critical to both security and binding
   table distribution.  The flows below illustrate some relevant cases.

   In scenario A, the switch SWITCH_A has only end-nodes connected to
   it: host 1, host 2 and host 3.  The message flow, with host 2
   initializing address X, is as follows:

   Scenario A:

   1.  SWITCH_A comes up.

   2.  host 2 -> SWITCH_A: DAD NS, target=X.  No hit: X is stored,
       pref=ACCESS.

   3.  SWITCH_A -> host 1, host 3: DAD NS, target=X, conditional
       forward (1).

   4.  host 1 -> SWITCH_A: NA.  Hit, newpref=ACCESS: do not replace,
       drop.

   5.  Later, host 3 -> SWITCH_A: DAD NS, target=X.  Hit,
       newpref=ACCESS: forward to the owner (host 2) only.

   6.  host 2 -> SWITCH_A: NA, defending the address; the NA is
       bridged back.

   When the nodes come up, the switch is assumed to be already up.  As
   a result, since the switch stores entries for all addresses it
   snoops, it is going to have a fairly accurate view of the nodes
   (addresses).  Host 2 comes up, and sends a DAD NS for target X,
   intercepted by the switch.  SWITCH_A does not have X in its binding
   table, stores it (INCOMPLETE), and bridges the NS to the other
   nodes, host 1 and host 3.  If MLD snooping is in effect, the switch
   might decide not to forward it at all (no other known group listener
   for the solicited-node multicast group), or only to a few hosts.
   Regardless of MLD snooping, flow (1) is not strictly useful and
   could even be harmful.  If we assume the switch knows all addresses
   of the link/VLAN, then it knows that nobody owns this address yet.
   In that case, sending the NS to other hosts would be an invitation
   for an attack.

   There is a tradeoff between two issues which are not equally
   probable: the risk to break DAD, and the risk to be vulnerable to a
   DoS on address resolution.  The latter is well understood: should
   the switch broadcast the DAD NS, an attacker can immediately claim
   ownership with an NA.  As for the former, it would happen if the
   following conditions are met:

   1.  The initial DAD NS for X, and any subsequent NDP packets (NA to
       all-nodes, etc.) were missed by the switch.

   2.  In addition:

       *  the newly received NS carries a duplicate address,

       *  or host 2 is the attacker; however, it could not have seen X
          yet, since the switch has not, so it would have to know it by
          non-trivial means.

   In scenario B, SWITCH_A is also connected to a second switch,
   SWITCH_B, which runs the same logic to populate its own binding
   table.  The message flow is as follows:

   Scenario B:

   1.   SWITCH_B comes up.

   2.   host 2 -> SWITCH_B: DAD NS, target=X.  No hit, no trunk up: X
        is stored in the binding table, pref=ACCESS.

   3.   SWITCH_A comes up.

   4.   host 1 -> SWITCH_A: DAD NS, target=X.  No hit: X is stored,
        pref=ACCESS.

   5.   SWITCH_A -> SWITCH_B: DAD NS, target=X, forward on trunk (2).

   6.   SWITCH_B: hit (host 2), forward to owner.

   7.   SWITCH_B -> host 2: DAD NS, target=X.

   8.   host 2 -> SWITCH_B: NA.

   9.   SWITCH_B: hit, owner; the NA is forwarded on the trunk.

   10.  SWITCH_A: hit, newpref=TRUSTED_TRUNK: replace.

   11.  SWITCH_A -> host 1: NA.

   When SWITCH_A comes up, it may come up after SWITCH_B.  In this
   case, it is unaware of the end-nodes attached to SWITCH_B.  SWITCH_B
   however knows all of them, under the same assumptions as in scenario
   A.  Upon receiving a DAD NS for target X, and in the absence of a
   hit, SWITCH_A creates an INCOMPLETE entry and forwards the NS to
   SWITCH_B.

   1.  If SWITCH_B has X in its table, it can forward the NS only on
       the interface of X's owner (host 2).  Host 2 responds, and the
       response reaches SWITCH_A.  SWITCH_A already has an entry for X,
       associated with the interface to host 1, while this one is
       received from the trunk.  The trunk is a TRUSTED_TRUNK, hence
       entries received over it are preferred.  SWITCH_A updates its
       binding table and propagates the NA to host 1.  This is the case
       of a valid address duplication.

   2.  If SWITCH_B, receiving the DAD NS over the trunk, does not have
       X in its table, it can drop the NS while creating an INCOMPLETE
       entry for X.  Or it can broadcast it locally (with the same
       reasoning as for the previous scenario).

   Scenario C connects SWITCH_A to a SWITCH_B that does not run the
   same binding table algorithm (referred to as a non plb-switch).  In
   this scenario, SWITCH_A forwards a DAD NS for target X on the trunk.
   Configuration should tell whether any response coming from SWITCH_B
   is to be trusted (in the absence of better credentials such as a
   CGA/RSA proof).  If SWITCH_B is fully trusted, then the trunk is
   configured as TRUSTED_TRUNK and scenario B applies.  Otherwise, the
   trunk is configured as TRUNK and the response is ignored:

   Scenario C:

   1.  SWITCH_B comes up.

   2.  host 2 -> SWITCH_B: DAD NS, target=X.

   3.  SWITCH_A comes up.

   4.  host 1 -> SWITCH_A: DAD NS, target=X.  No hit: X is stored,
       pref=ACCESS.

   5.  SWITCH_A -> SWITCH_B: DAD NS, target=X, forwarded on the trunk.

   6.  SWITCH_B -> host 2: DAD NS, target=X, forwarded to the
       solicited-node group.

   7.  host 2 -> SWITCH_B: NA.

   8.  SWITCH_B -> SWITCH_A: NA.

   9.  SWITCH_A: hit, newpref=TRUNK: do not replace, drop the NA.
6.2.  Bridging other NDP messages

   When running the proposed binding table population algorithm,
   switches are expected to have an accurate view of the end-nodes
   attached to them.  While scenario C is problematic, scenarios A and
   B are clearer.  If a switch has an entry in its table that conflicts
   with the binding observed in an NDP message just received, it should
   drop the message (if the new data has a smaller preflevel) or update
   its entry and bridge the message.  If the switch does not have such
   an entry, it should create the entry and bridge the message,
   including to trunks.  In the case of multicast messages, it should
   bridge them on trunks regardless of group registration, to give
   other switches a chance to build up a more accurate binding table.

7.  Normative References

   [RFC3971]  Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure
              Neighbor Discovery (SEND)", RFC 3971, March 2005.

   [RFC3972]  Aura, T., "Cryptographically Generated Addresses (CGA)",
              RFC 3972, March 2005.

   [RFC4861]  Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
              "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861,
              September 2007.

   [RFC4862]  Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless
              Address Autoconfiguration", RFC 4862, September 2007.

   [fcfs]     Nordmark, E. and M. Bagnulo, "First-Come First-Serve
              Source-Address Validation Implementation",
              draft-ietf-savi-fcfs-01 (work in progress), March 2009.

Appendix A.  Contributors and Acknowledgments

   This draft benefited from input from Pascal Thubert.

Author's Address

   Eric Levy-Abegnoli
   Cisco Systems
   Village d'Entreprises Green Side -
   400, Avenue Roumanille
   Biot-Sophia Antipolis - 06410
   France

   Email: elevyabe@cisco.com