Network Working Group	E. Levy-Abegnoli
Internet-Draft	Cisco Systems
Intended status: Standards Track	August 04, 2009
Expires: February 5, 2010

Preference Level based Binding Table

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on February 5, 2010.

Copyright Notice

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document.

Abstract

[fcfs] proposes a simple preference scheme to resolve binding entry collisions (same l3 address, different anchors): it keeps the first entry and rejects any others. However, there are cases where keeping the first entry is not the best choice, and others cases where it is bogus. This draft analyses what are these cases, and proposes a different algorithm (preference based) to fix the problem.

1. Problem Statement
2. Binding entry completion
3. Link-Layer Address anchor
4. Entry preference algorithm
    4.1. Preference Level
    4.2. Entry update algorithm
    4.3. Switch port configuration
5. Normative References
Appendix A. Contributors and Acknowledgments
§ Author's Address

TOC

1. Problem Statement

In the general case, when two nodes A and B compete for the same L3 address, the first one (A) to perform DAD sucessfully is the one that should "own" it. [RFC4862] enforces that behavior by enabling A to defend its address through the DAD procedure. Once A has succesfully completed DAD, if B wishes to use A's address, it is expected to run DAD and it will get a DAD response from A. At this stage, it should give up on the address. If B does not give up, it is faulty, and a third party device such as a savi-switch is entitled to drop its traffic. From there, [fcfs] proposes to install A's binding (L3 address/A's anchor) into the binding table, and drop traffic sent by B, sourced with the same L3 address.

While this is working in the general case, there are some scenarios where the first node to present a binding on the link is not the one that should ultimately owns it. A simple example is the one of SeND nodes.

Consider the case of a link with a mix of ND (RFC4861] and SeND (RFC3971] nodes. Consider node A running ND, node B and node C running SeND. Node A is malicious and knows about B's CGA address "b". It can also easily compute "b'", which is B secondary address, derived from the same RSA public key, with a colision count of 1. Assuming b and b' has not made it into the switch binding table yet (nodes are down), node A could advertizes b and b', without any CGA credentials, either by sourcing data packets or by DAD'ing them. Node A could also listen and respond to DAD messages sent by B. In either scenarios, According to [RFC3971], B is the legitimate owner of b', and C, as a SeND node, must consider it that way and bind b' to B's link-layer address.

However, a savi-switch implementing [fcfs] will pick up A as the owner of b and b'. B's packets will be filtered out, and C will only see A's claim and trust A as the legitimate owner, without a chance to make a different better choice. In the absence of a savi-switch, a better more secure situation would be established with B and C ignoring A's claims on b' if not on b.

It appears that on a link with a mix of nodes supporting ND and nodes supporting SeND, a switch using a first-come-first-serve algorithm will break SeND and endup decreasing the security rather than augmenting it. SeND mandates a different algorithm that first-come-first-serve, so should do the savi-switch.

It is possible to generalize and consider that it is sometimes desirable for the switch to pick up an address owner which is not the first one presenting the binding. It can be because it is a CGA address, or the address is one of a trusted server, or the address was assigned by DHCP. This draft proposes an algorithm based on the computation of a preference level, to choose between two competing binding entries.

TOC

2. Binding entry completion

In order to learn about CGA credentials associated with a binding candidate, the switch needs to read these credential out of an NDP message. Therefore, a data packet should not be used to complete the binding entry, but only as a hint that it should seek for completion. Upon receiving a data packet carrying a source address not seen before, the switch should issue a DAD packet on the link (all ports of the vlan), including the one from which the data packet was received. The address owner is expected to respond with an NA, carrying CGA credentials if any. Upon receiving this response, the switch can complete the binding entry and start forwarding traffic from the source.

TOC

3. Link-Layer Address anchor

So far, there has been several possible binding anchors listed, and one of them was the MAC address. However, it was not specified whether the MAC address was the one found at layer 2 (SMAC), or the link-layer-address (LLA) announced by NDP. It is important to note that these two are not always the same, and some tools uses that difference to play some game on the link. For instance, a node B could act as a binding-server, on behalf of nodes (S1, S2) within a server-farm, vis-as-vis clients on the link. B would announce, with its own SMAC and source, S1 or S2 link-layer address, associated with the anycast server-fam address, based on server load or other criterias.

Another scenario is the one of an attacker A, poisoning nodes neighbor cache, by sending NDP NA messages with SMAC=MAC_A, L3-src=A, target=B and LLA option=MAC_A. Such message carries two bindings, first one [A, MAC_A], and a second one [B, MAC_A]. While the first one may be legitimate, the second may not be (an entry [B, MAC_B] could exist in the switch binding table. Such message could easily redirect all traffic (for B) to A.

In the two scenarios considered above, the savi switch would have some need to analyse the content of the NDP message, up to the link-layer-address option, rather than only look at the message envelope (SMAC/IP-src). Only some NDP messages are carrying this option: NS (but not DAD), NA, RS, RA. Note that neither DAD packets nor data packets have the information needed. A scheme where the switch would issue a DAD NS upon receiving a data packet or a DAD message (with a delay to avoid DAD failure) would be useful to obtain an NA from the address owner, with the LLA option. This is yet another reason to "complete" the binding on NDP messages rather than on data packet.

TOC

4. Entry preference algorithm

TOC

4.1. Preference Level

The preference level (preflevel) is an attribute of an entry in the binding table. It is setup when the entry is learnt, based on where it is learnt from, the credentials associated with it and other criterias to-be-defined. The preflevel is used to arbitrage between two candidate entries (same l3 address) in the binding table. The higher the preference level is, the more preferred the entry.

One of the key factor of an entry preflevel is the port the binding was learnt from. For example, an entry would have different preflevels if it is learnt from:

An access port: it typically attaches to end-nodes
A trunk port: it attaches to a non savi-switch
A trusted access port: it attaches to trusted end-nodes
A trusted trunk: it attaches to another savi-switch

A second factor of the preflevel is the credentials associated with this learning. An entry associated with cryptographic proof (CGA) should be preferred over the same entry without this proof.

In Section 3 (Link-Layer Address anchor), it was highlighted that the source mac (SMAC) and the link-layer address (LLA) found in the same NDP message could be different. It maybe be useful to prefer binding entries carried in messages where the SMAC and the LLA are identical.

Binding entries can be learnt from various procotol, whether NDP, DHCP or statically assigned. It is useful to define a preference order within thse various source of binding.

Each factor of the binding preference level is given a value and referred to as preference value. For instance, a binding associated with CGA credential would be given a value "CGA_AUTHENTICATED". A binding received from a trusted access port would be given "TRUSTED_ACCESS".

The different preference values are not all exclusive (some are). For instance, an entry could be associated with CGA credentials, and received from a trunk port at the same time. It could -or not- verify SMAC=LLA. So we define the preference level of an entry as a sum of preference values. Preference values are flags, encoded as 2**n where n in the no of the flag. The higher n, the more prevalent is the value. The preference level of an entry the sum of all preference values that apply to it. The goal of this encoding is to ensure that the sum of preferences values 1 to N-1 is smaller than preference N. And that the same time, if preference values P < Q, a preference level of P + N is bigger than one of Q + N.

The following preflevel values have been identified (from lowest to highest):

LLA_MAC_MATCH: LLA (found in NDP option) and MAC (found at layer2) are identical;
TRUNK_PORT: the entry was learnt from a trunk port (connected to another switch)
ACCESS_PORT: the entry was leant from an access port (connected to a host)
TRUSTED_ACCESS: The entry was learnt from a trusted port
TRUSTED_TRUNK: The entry was learnt from a trusted trunk
DHCP_ASSIGNED: the entry is assigned by DHCP
CGA_AUTHENTICATED: The entry is CGA authenticated, per [RFC3972] (Aura, T., “Cryptographically Generated Addresses (CGA),” March 2005.)
CERT_AUTHENTICATED: the entry is authenticated with a certificate
STATIC: this is a statically configured entry per [RFC3971] (Arkko, J., Kempf, J., Zill, B., and P. Nikander, “SEcure Neighbor Discovery (SEND),” March 2005.).

Here are some examples of preference levels:

An entry learnt from a trunk port with SMAC matching LLA would have a preflevel TRUNK_PORT+LLA_MAC_MATCH bigger than one simply matching lla/mac (LLA_MAC_MATCH).
However an entry learnt from an access port with matching mac/lla would have a smaller preflevel than an entry learnt from a trusted port.

TOC

4.2. Entry update algorithm

Once an entry is installed in the binding table, its attributes cannot be changed without complying with this “entry update algorithm”.

The algorithm is as follows, starting with rule_1, up to rule_5, in that order until one rule is satisfied: Updating an entry attribute is:

Allowed when the preflevel carried by the change is bigger than the preflevel stored in the entry.
Denied if the preflevel carried by the change is smaller than the preflevel stored in the entry
Allowed if preflevel >= TRUSTED_PORT
Denied is the entry is in state REACHABLE (and preflevel are equal)
Allowed otherwise

Entry state is a useful part of the preference algorithm. Three states have been identified: INCOMPLETE, REACHABLE and STALE. An entry is INCOMPLETE when the entry has not been verified thru an NDP (NS/NA) exchange. For instance, upon receiving a data packet, the switch will issue a DAD NS and wait for an NA. Before receiving the NA, the entry is IMCOMPLETE. Then it moves to REACHABLE. Then to STALE unless the binding is confirmed by more NDP traffic.

The switch may, by policy, deny binding completion for an entry in state INCOMPLETE if the change is not associated with the port where this entry was first learnt from. This will eventually break DAD, but will provide some mitigation against DAD attacks. By default, the switch should listen to DAD response coming from any port.

For movement, if an entry is in REACHABLE, it means it was seen recently in its known location, and any binding takeover is considered as an attack. However, If the binding takeover is received from a trusted port, it should be allowed, regardless of the entry state.

TOC

4.3. Switch port configuration

Qualifying a port of the switch is of primary importance to influence the “entry update algorithm” (see Section 4 (Entry preference algorithm)). The switch configuration should allow the following values to be configured on a per-port basis:

TRUNK_PORT: the port of the switch is connected to another switch port, that is not a plb-switch.
ACCESS_PORT: the port of the switch is connect to an end-node.
TRUSTED_PORT: the port of the switch is connected to a trusted end-node.
TRUSTED_TRUNK: the port of the switch is connected to another plb-switch.

TOC

5. Normative References

[RFC3971]	Arkko, J., Kempf, J., Zill, B., and P. Nikander, “SEcure Neighbor Discovery (SEND),” RFC 3971, March 2005 (TXT).
[RFC3972]	Aura, T., “Cryptographically Generated Addresses (CGA),” RFC 3972, March 2005 (TXT).
[RFC4861]	Narten, T., Nordmark, E., Simpson, W., and H. Soliman, “Neighbor Discovery for IP version 6 (IPv6),” RFC 4861, September 2007 (TXT).
[RFC4862]	Thomson, S., Narten, T., and T. Jinmei, “IPv6 Stateless Address Autoconfiguration,” RFC 4862, September 2007 (TXT).
[fcfs]	Nordmark, E. and M. Bagnulo, “First-Come First-Serve Source-Address Validation Implementation,” draft-ietf-savi-fcfs-01 I-D, March 2009.

TOC

Appendix A. Contributors and Acknowledgments

This draft benefited from the input from: Pascal Thubert.

TOC

Author's Address

	Eric Levy-Abegnoli
	Cisco Systems
	Village d'Entreprises Green Side - 400, Avenue Roumanille
	Biot-Sophia Antipolis - 06410
	France
Email:	elevyabe@cisco.com