TRILL Working Group                                        Radia Perlman
INTERNET-DRAFT                                          Sun Microsystems
Intended status: Proposed Standard                           Silvano Gai
                                                           Nuova Systems
                                                          Dinesh G. Dutt
                                                                   Cisco
                                                     Donald Eastlake 3rd
                                                   Motorola Laboratories
Expires: January 2008                                          July 2007

                 Rbridges: Base Protocol Specification
                 --------- ---- -------- -------------

Status of This Document

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Distribution of this document is unlimited. Comments should be sent to the TRILL working group mailing list.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html

R. Perlman, S. Gai, D. Dutt, D. Eastlake                        [Page 1]
INTERNET-DRAFT                                          RBridge Protocol

Abstract

RBridges allow for optimal pair-wise forwarding with zero configuration, safe forwarding even during periods of temporary loops, and multipathing for both unicast and multicast traffic. They achieve these goals using IS-IS routing and encapsulation of traffic with a header that includes a hop count.

RBridges are compatible with previous IEEE 802.1 bridges as well as current IPv4 and IPv6 routers and end nodes. They are as invisible to current IP routers as bridges are and, like routers, they terminate the bridge spanning tree protocol.
The design supports VLANs, and, optionally, optimization of the distribution of multi-destination frames based on VLAN and IP derived multicast groups. It also allows forwarding tables to be based on RBridge destinations (rather than end node destinations), which allows internal forwarding tables to be substantially smaller than in conventional bridge systems.

Acknowledgements

Many people have contributed to this design, including, in alphabetic order, Alia Atlas, Caitlin Bestler, Stewart Bryant, James Carlson, Dino Farinacci, Don Fedyk, Eric Gray, Erik Nordmark, Sanjay Sane, and Joe Touch.

We invite you to join the mailing list at http://www.postel.org/rbridge.

Table of Contents

   Status of This Document
   Abstract
   Acknowledgements
   Table of Contents

   1. Introduction
   1.1 Algorhyme V2, by Ray Perlner
   1.2 Conventions used in this document

   2. RBridges
   2.1. RBridge Architecture
   2.2 RBridges and VLANs
   2.3 Forwarding of Different Frame Types
   2.3.1 Known-Unicast
   2.3.2 Multi-destination

   3. Details of the TRILL Header
   3.1 TRILL Header Format
   3.2 Version / Pruning (V)
   3.3 Multi-destination (M)
   3.4 Hop Count
   3.5 RBridge Nicknames
   3.5.1 Egress RBridge Nickname
   3.5.2 Ingress RBridge Nickname
   3.5.3 RBridge Nickname Allocation
   3.6 TRILL Header Options

   4. Other RBridge Design Details
   4.1 Ethernet Data Encapsulation
   4.1.1 VLAN Tag Information
   4.1.2 Outer VLAN Info
   4.1.3 Inner VLAN Info
   4.1.4 Frame CheckSum (FCS)
   4.2 Link State Protocol (IS-IS)
   4.2.1 IS-IS RBridge Identity
   4.2.2 Distinguishing IS-IS Instances
   4.2.3 TRILL IS-IS Information
   4.2.3.1 Core IS-IS Information
   4.2.3.2 Optional Per-VLAN IS-IS Instance Information
   4.2.4 Designated RBridge
   4.3 Distribution Trees
   4.3.1 Distribution Tree Calculation and Checks
   4.3.2 Pruning the Distribution Tree
   4.3.3 Forwarding Using a Distribution Tree
   4.4 Forwarding Behavior
   4.4.1 Receipt of a Native Frame
   4.4.1.1 Native Unicast Case
   4.4.1.2 Native Multicast and Broadcast Frames
   4.4.2 Receipt of a Non-Native (TRILL) Frame
   4.4.2.1 TRILL IS-IS Frames
   4.4.2.2 TRILL Data Frames
   4.4.2.2.1 Unicast TRILL Data Frames
   4.4.2.2.2 Multi-Destination TRILL Data Frames
   4.4.3 Tree Distribution Optimization
   4.5 IGMP, MLD, and MRD Learning
   4.6 Learning End Station Addresses
   4.7 Shared VLAN Learning

   5. Pseudo Code
   5.1 802MUL Destination Frames
   5.1.1 Spanning Tree Protocol
   5.1.2 Media Multicast Frames
   5.1.3 802.1X Frames
   5.1.4 802.1AB Frames
   5.1.5 GARP, GMRP, and GVRP
   5.1.6 Other Bridge Frames
   5.2 Processing a Frame Received by an RBridge
   5.2.1 Further Dispatch for IP Frames
   5.2.2 Common Subroutines
   5.2.2.1 Learn Source MAC Address
   5.2.2.2 TRILL Data Frame Multi-destination Forwarding
   5.2.2.3 TRILL Data Frame Outer VLAN Tag
   5.2.3 TRILL Ethertype Frames
   5.2.3.1 TRILL IS-IS Frames
   5.2.3.2 TRILL Data Frames
   5.2.4 Native Frame Receipt
   5.2.5 IGMP and MLD Frames
   5.2.6 PIM and MRD Frames
   5.3 Frames Spontaneously Sourced
   5.3.1 IS-IS Frames Sourced
   5.3.1.1 Core IS-IS Frames
   5.3.1.2 Per-VLAN IS-IS Frames
   5.3.2 Other Frames Sourced

   6. Incremental Deployment Considerations
   6.1 Incremental Deployment
   6.2 Wiring Closet Topology
   6.2.1 The RBridge Solution
   6.2.2 The Spanning Tree Solution
   6.2.3 The VLAN Solution
   6.2.4 Comparison of Solutions

   7. RBridge Addresses, Parameters, and Constants
   8. Security Considerations
   9. Assignment Considerations
   9.1 IANA Considerations
   9.2 IEEE 802 Assignment Considerations
   10. Normative References
   11. Informative References

   Appendix A: Revision History
   Changes from -03 to -04
   Changes from -04 to -05

   Disclaimer
   Additional IPR Provisions
   Authors' Addresses
   Expiration and File Name

1. Introduction

In traditional IPv4 and IPv6 networks, each subnet has a unique prefix. Therefore, a node in multiple subnets has multiple IP addresses, typically one per interface. This also means that when an interface moves from one subnet to another, it changes its IP address. Administration of IP networks is complicated because IP routers require significant configuration and careful IP address management is required to avoid creating subnets that are sparsely populated and waste addresses. IEEE 802.1 bridges avoid these problems by transparently gluing many physical links into what appears to IP to be a single LAN [802.1D].

Bridge forwarding using the spanning tree protocol has some disadvantages:

   o The spanning tree protocol blocks ports, limiting the number of forwarding links, and therefore creates bottlenecks by concentrating traffic onto selected links.

   o The Ethernet header does not contain a hop count (or TTL) field and this is dangerous when there are temporary loops such as when spanning tree messages are lost or components such as repeaters are added.

   o Forwarding is not pair-wise shortest path, but is instead whatever path remains after the spanning tree eliminates redundant paths.

This document presents the design for RBridges (Routing Bridges), which combine the advantages of bridges and routers and which are poetically summarized below. While RBridge technology can be applied to a variety of link protocols, this specification concentrates on IEEE 802.3 links [802.3].

1.1 Algorhyme V2, by Ray Perlner

   I hope that we shall one day see
   A graph more lovely than a tree.

   A graph to boost efficiency
   While still configuration-free.

   A network where RBridges can
   Route packets to their target LAN.

   The paths they find, to our elation,
   Are least cost paths to destination!
   With packet hop count we now see,
   The network need not be loop-free!

   RBridges work transparently,
   Without a common spanning tree.

1.2 Conventions used in this document

In general, TRILL refers to the protocol specified herein while RBridge refers to the devices which implement that protocol. The second letter in Rbridge is case insensitive. Both Rbridge and RBridge are correct.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. RBridges

The main idea is to have RBridges run a link state protocol amongst themselves. This enables them to have enough information to compute pairwise optimal paths for unicast, and to calculate distribution trees for delivery of frames either to unknown MAC destinations, or to multicast/broadcast groups.

ECMP (Equal Cost MultiPath) may be supported, but it may introduce frame reordering. It is also possible, as it is with 802.1 bridges, for re-ordering to occur during changes in network topology.

To mitigate temporary loop issues, RBridges forward based on a header with a hop count. Although the hop count ensures that looping frames are eventually discarded, RBridges also specify the next hop RBridge when forwarding unicast frames across a shared-media link, which avoids spawning additional copies of frames during a temporary loop.

The first RBridge that a frame encounters in a campus, RB1, encapsulates the received frame with a TRILL header that specifies the last RBridge, RB2. RB1 is known as the "ingress RBridge" and RB2 is known as the "egress RBridge". To save room in the TRILL header, a dynamic nickname acquisition protocol is run among the RBridges to select a 2-byte nickname for each RBridge, unique within the campus, which is an abbreviation for the 6-byte IS-IS system ID of the RBridge.
The 2-byte nicknames are used to specify the ingress and egress RBridges in the TRILL header.

RBridges run the IS-IS election protocol to elect one RBridge per link to be the "Designated RBridge" (DRB). Only the DRB on a link is allowed to act as the ingress RBridge and encapsulate traffic received on that link, or to act as the egress RBridge and decapsulate traffic received from the campus and forward it onto the link.

If a link is actually a bridged LAN configured for VLANs, it is possible that the link might be partitioned with respect to some VLANs. The default is to run a single DRB election on a link, with the IS-IS Hellos sent either with no VLAN tag (the default) or with a VLAN tag specifying the default VLAN for the link. If the RBridge is configured to support a set of k VLANs on the link, then the RBridge runs the IS-IS DRB election up to k times, each instance tagged with one of the VLANs in that set of VLANs depending on its configuration. Therefore there might be multiple DRBs on the link, but at most one on that link per VLAN. By configuration, the DRB for some VLANs may be set by copying the DRB status in the relevant RBridges from a different VLAN rather than by election.

RBridges MUST learn the location of end nodes. The DRB on a link learns the location and layer 2 addresses of attached end nodes on that link from the source address of frames, as bridges do (for example, see section 8.7 of [802.1Q]). The DRB learns the layer 2 address of distant end nodes, and the corresponding RBridge to which they are attached, by looking at the ingress RBridge nickname in the TRILL header and the source address in the inner frame header of TRILL data frames that the DRB is decapsulating onto a link. (See Section 4.6.)

Additionally, a per-VLAN instance of IS-IS MAY be used by an RBridge which is DRB on a link to announce some or all of the attached end nodes on that link.
The intention is that such an announcement would be used to announce end nodes that have explicitly enrolled, and so such information would be more authoritative than simply learning from data packets being decapsulated onto the link. Also, it can be more secure because not only might the enrollment be cryptographically authenticated, but IS-IS also supports cryptographic authentication. But even if a per-VLAN instance is used to announce attached end nodes, RBridges MUST still learn from decapsulating data packets unless configured not to do so. Conflicts are resolved using a confidence level reported with the address in the per-VLAN IS-IS data. (See Section 4.6.)

Advertising end nodes using a per-VLAN instance of IS-IS is optional, as is learning from these announcements.

2.1. RBridge Architecture

   +----------------------------------------------------------+
   |                  Higher Layer Entities                   |
   +--+--------------+----------------------+--------------+--+
   |   \ TRILL Layer | RBridge Relay Entity | TRILL Layer /   |
   +----+------------+----------------------+------------+----+
   | Data Link Layer |                      | Data Link Layer |
   +-----------------+                      +-----------------+
   | Physical Layer  |                      | Physical Layer  |
   +-------+---------+                      +-------+---------+
           |                                        |
           P1                                       P2

                Figure 1. Architecture of an RBridge

Figure 1 shows an RBridge that contains:

   o An Rbridge Relay Entity that interconnects two Rbridge ports;

   o At least one port (two in the example);

   o Higher Layer Entities, including at least the IS-IS protocol;

   o The TRILL Layer.

An RBridge encapsulates incoming IEEE 802.3 frames (in this document also referred to as Ethernet frames) with a TRILL header to forward them to other Rbridges. The layer 2 technology used to connect Rbridges may be either IEEE 802.3 or some other technology such as PPP. This is possible since the functionality of an RBridge relay entity is layered on top of the layer 2 technologies.
However, in accordance with the TRILL WG charter, the first edition of this document specifies only an IEEE 802.3 encapsulation [802.3].

Figure 2 shows two RBridges RB1 and RB2 interconnected through an Ethernet cloud. There are no restrictions on what may compose the Ethernet cloud: point-to-point or shared media, hubs and 802.1 bridges. The Ethernet cloud may support VLAN tagging or not.

                 ------------
                /            \
   +-----+     /   Ethernet   \    +-----+
   | RB1 |----<                >---| RB2 |
   +-----+     \    Cloud     /    +-----+
                \            /
                 ------------

           Figure 2. Interconnected RBridges

Figure 3 shows the format of a TRILL frame traveling through the Ethernet cloud from RB1 to RB2.

   +--------------------------------+
   |     Outer Ethernet Header      |
   +--------------------------------+
   |          TRILL Header          |
   +--------------------------------+
   |     Inner Ethernet Header      |
   +--------------------------------+
   |        Ethernet Payload        |
   +--------------------------------+
   |          Ethernet FCS          |
   +--------------------------------+

   Figure 3. An Ethernet Encapsulated TRILL Frame

For media other than Ethernet, the outer Ethernet header is replaced by the header specific to that medium. For example, Figure 4 shows a TRILL encapsulation over PPP.

   +--------------------------------+
   |           PPP Header           |
   +--------------------------------+
   |          TRILL Header          |
   +--------------------------------+
   |     Inner Ethernet Header      |
   +--------------------------------+
   |        Ethernet Payload        |
   +--------------------------------+
   |          Ethernet FCS          |
   +--------------------------------+

   Figure 4. A PPP Encapsulated TRILL Frame

The outer header is link-specific and, although this document specifies only Ethernet links, other links are allowed.

In both cases the Inner Ethernet Header and the Ethernet Payload are derived from the original frame. Frames are encapsulated with a TRILL header as they travel between RBridges for several reasons:

   1. to mitigate loop issues a hop count field is included;

   2. to prevent original source MAC learning in the core from frames in transit;

   3. to direct frames towards the egress RBridge. This enables forwarding tables of RBridges to be sized with the number of RBridges rather than the total number of end nodes.

When forwarding unicast frames between RBridges across a shared-media link, the outer header contains the address of the next hop Rbridge, to avoid frame duplication. Having the outer header specify the transmitting RBridge as source address ensures that bridges inside the shared-media link will not get confused, as they might given multipathing, if they were to see the original source or ingress RBridge in the outer header.

2.2 RBridges and VLANs

A VLAN is a way to partition end nodes into different communities [802.1Q]. The usual method of determining which community a frame belongs to is based on the port from which it is received, although end stations can insert this information in a frame.

Use of VLANs requires configuration. Rbridges can be configured to provide essentially the same VLAN support as IEEE 802.1Q compliant bridges.

IEEE 802.1Q bridges have the capability of supporting multiple VLANs over a single link by inserting a VLAN tag into, or removing it from, the frame. Some end nodes have the same capability. The VLAN tag is structured according to IEEE 802.1Q.

As shown in Figure 3, there are two places where such tags may be present in a TRILL-encapsulated frame which is sent over an IEEE 802.3 link: one in the outer header (outer VLAN) and one in the inner header (inner VLAN). Inner and Outer VLANs are further discussed in Section 4.1.

RBridges enforce delivery of a frame originating in a particular inner VLAN only to other links in the same inner VLAN.

2.3 Forwarding of Different Frame Types

There are several types of frames which RBridges forward slightly differently.
They are here classified into two main categories: known-unicast and multi-destination.

2.3.1 Known-Unicast

These frames have an inner MAC Destination Address (Inner.MacDA) that is unicast and the egress RBridge for that destination MAC address location is known to the ingress RBridge.

2.3.2 Multi-destination

These are frames that must be delivered to multiple destinations. They are as follows:

   1. frames for unknown unicast destinations: the Inner.MacDA is unicast, but the ingress RBridge does not know its location;

   2. frames for layer 2 multicast addresses derived from IP multicast addresses: the Inner.MacDA is multicast, from the set of layer 2 multicast addresses derived from IPv4 [RFC1112] or IPv6 [RFC2464] multicast addresses; these frames are handled somewhat differently in different subcases:

      2.1 IGMP [RFC3376] and MLD [RFC2710] multicast group membership reports;

      2.2 IGMP [RFC3376] and MLD [RFC2710] queries and MRD [4286] announcement messages;

      2.3 other IP derived layer 2 multicast frames;

   3. frames for layer 2 multicast addresses not derived from IP multicast addresses: the Inner.MacDA is multicast, and not from the set of layer 2 multicast addresses derived from IPv4 or IPv6 multicast addresses;

   4. frames for the layer 2 broadcast address: the Inner.MacDA is broadcast.

RBridges build distribution trees (see Section 4.3) and use these trees for forwarding multi-destination frames.

3. Details of the TRILL Header

This section provides a textual and diagrammatic description of the TRILL header. Section 4 below provides other RBridge design details, and Section 5 gives pseudo-code.

3.1 TRILL Header Format

The TRILL header is shown in Figure 5 and is independent of the data link layer used. When that layer is IEEE 802.3, it is prefixed with the 16-bit TRILL Ethertype and is 64-bit aligned.

                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                   |   V   |M|Op-Length| Hop Count |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Egress RBridge Nickname    |   Ingress RBridge Nickname    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                       Figure 5. TRILL Header

   o V (Version / pruning): 4-bit. See Section 3.2.

   o M (Multi-destination): 1-bit. See Section 3.3.

   o Op-Length (Options Length): 5-bit. See Section 3.6.

   o Hop Count: 6-bit unsigned integer. See Section 3.4.

   o Egress RBridge Nickname: 16-bit address. See Section 3.5.1.

   o Ingress RBridge Nickname: 16-bit address. See Section 3.5.2.

3.2 Version / Pruning (V)

According to IEEE's Ethertype format guidelines, a single Ethertype is granted to a protocol and it is the protocol's responsibility to structure the format of the protocol header so as to support future revisions to the protocol. In adhering to this guideline, there is a two-bit Version field in the TRILL header. Version is zero for TRILL as specified in this document. An RBridge that sees a message with a version value it does not understand MUST silently discard the message because it may not be able to parse it.

It is also useful to distinguish TRILL frames that have already been analyzed for optimized distribution tree pruning, particularly with regard to IP derived multicast. If the Version is zero, the bottom two bits of V indicate this status as further discussed in Section 4.4.3.

The Version is the top or most significant two bits of the V field in Figure 5 while the pruning status is the bottom two bits for Version zero. To avoid having to state two tests when the values of these fields are being checked and to avoid having to state two assignments when the values of these fields are being set, they are concatenated and treated as one field called Variation and referred to by "V" in the rest of this document.
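
As an illustration only, the first 16 bits of the header can be packed and unpacked as sketched below. The sketch assumes the usual IETF convention that the leftmost field in Figure 5 occupies the most significant bits and that the word is sent in network byte order; the names FirstWord, pack_first_word, and unpack_first_word are hypothetical and do not appear in this specification.

```python
import struct
from typing import NamedTuple


class FirstWord(NamedTuple):
    """Fields of the first 16 bits of the TRILL header (hypothetical container)."""
    version: int             # top two bits of V
    pruning: int             # bottom two bits of V (defined for Version zero)
    multi_destination: bool  # the M bit
    op_length: int           # options length, in 4-byte units
    hop_count: int           # 6-bit hop count


def pack_first_word(f: FirstWord) -> bytes:
    """Pack the fields into two bytes, leftmost field in the high-order bits."""
    if not (0 <= f.version <= 3 and 0 <= f.pruning <= 3):
        raise ValueError("Version and pruning status are two bits each")
    if not (0 <= f.op_length <= 31 and 0 <= f.hop_count <= 63):
        raise ValueError("Op-Length is 5 bits and Hop Count is 6 bits")
    v = 4 * f.version + f.pruning  # the combined Variation field
    word = (v << 12) | (int(f.multi_destination) << 11) | (f.op_length << 6) | f.hop_count
    return struct.pack("!H", word)


def unpack_first_word(data: bytes) -> FirstWord:
    """Split the first two bytes of a TRILL header back into its fields."""
    (word,) = struct.unpack("!H", data[:2])
    v = word >> 12
    return FirstWord(
        version=v >> 2,
        pruning=v & 0x3,
        multi_destination=bool((word >> 11) & 1),
        op_length=(word >> 6) & 0x1F,
        hop_count=word & 0x3F,
    )
```

Treating V as a single 4-bit value, as the text does, means a receiver can compare one number instead of testing Version and pruning status separately.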
In other words, V = ( 4 * Version ) + pruning status.

3.3 Multi-destination (M)

The Multi-destination bit (see Section 2.3.2) indicates whether the frame is to be delivered to a single destination station or a class of destination end stations. It specifies the meaning of the egress RBridge nickname field as follows:

   o M = 0 (FALSE) - the frame is unicast data or core TRILL IS-IS; the egress RBridge nickname contains the nickname of the egress Rbridge for a TRILL unicast data frame and is zero for a core instance TRILL IS-IS frame;

   o M = 1 (TRUE) - the egress RBridge nickname field contains the nickname of the RBridge that is the root of the distribution tree. This tree is selected by the ingress RBridge for a TRILL data frame or the source RBridge for a per-VLAN TRILL IS-IS frame.

3.4 Hop Count

The Hop Count field is a 6-bit unsigned integer. Each RBridge that is about to forward a frame to another RBridge MUST check this field and discard the frame if this field is zero. If this field is non-zero, it MUST be decremented in the forwarded frame.

For known unicast frames, the ingress RBridge (or source RBridge for a control frame) MUST set the Hop Count to at least the number of RBridge hops it expects to the egress RBridge and SHOULD set it in excess of that number to allow for alternate routing later in the path.

For multi-destination frames, to minimize potential problems with temporary loops when forwarding, the Hop Count SHOULD be set by the ingress RBridge (or source RBridge for a control frame) to the expected number of hops to the most distant RBridge. To accomplish this, RBridge RBn calculates, for each branch of the distribution tree rooted at RBi, the maximum number of hops in that branch.
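
The per-branch calculation just described can be sketched as follows. This is a minimal illustration, not part of the protocol: it assumes the distribution tree is already available as an undirected adjacency map, and the names max_hops_per_branch and _farthest are hypothetical.

```python
from typing import Dict, Hashable, List, Optional

# Hypothetical representation: each RBridge maps to its neighbors in the tree.
Tree = Dict[Hashable, List[Hashable]]


def _farthest(tree: Tree, node: Hashable, came_from: Optional[Hashable]) -> int:
    """Hops from node to the most distant RBridge, never going back via came_from."""
    return max(
        (1 + _farthest(tree, nb, node) for nb in tree.get(node, []) if nb != came_from),
        default=0,
    )


def max_hops_per_branch(tree: Tree, rbn: Hashable) -> Dict[Hashable, int]:
    """For each tree neighbor of rbn, the hop count needed to reach every
    RBridge in the branch that starts with that neighbor."""
    return {nb: 1 + _farthest(tree, nb, rbn) for nb in tree.get(rbn, [])}


# Example tree rooted at RBi:  RBi - RB1 - RB3 - RB4   and   RBi - RB2
example = {
    "RBi": ["RB1", "RB2"],
    "RB1": ["RBi", "RB3"],
    "RB2": ["RBi"],
    "RB3": ["RB1", "RB4"],
    "RB4": ["RB3"],
}
```

In this example, max_hops_per_branch(example, "RBi") gives 3 for the branch through RB1 and 1 for the branch through RB2, so a frame sent toward RB2 needs far fewer hops than one sent toward RB1.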
When forwarding a multi-destination frame onto a branch, transit RBridge RBm MAY decrease the hop count by more than 1 to set the hop count to be no more than necessary to reach all destinations in that branch of the RBi tree.

Although the RBridge MAY decrease the hop count by more than 1, the RBridge MUST decrease the hop count by at least 1, and discard the packet if the hop count becomes 0.

3.5 RBridge Nicknames

Nicknames are 16-bit dynamically assigned abbreviations for each RBridge's 48-bit IS-IS System ID (see Section 4.2.1) to achieve a more compact encoding. This assignment allows specifying up to 64K RBridges. The value zero is reserved to indicate that a nickname is not specified and the value 0xFFFF is reserved for future specification.

RBridges piggyback a nickname acquisition protocol on the link state protocol (see Section 3.5.3) to acquire a nickname unique within the campus.

3.5.1 Egress RBridge Nickname

There are three cases for the contents of this field, depending on the M-bit (see Section 3.3) and the Inner.MacDA (see Section 4.1). It is filled in by the ingress RBridge for data frames and by the source RBridge for control frames.

   o For known-unicast data frames, M = 0, the Inner.MacDA is not All-Rbridges, and the egress RBridge nickname field specifies the egress RBridge, i.e., it specifies the RBridge that needs to remove the TRILL header from the data frame.

   o For multi-destination data frames, M = 1, and the egress RBridge nickname field contains the nickname of the root RBridge of the distribution tree selected to be used to forward the frame. The root MUST NOT be changed by transit RBridges.

   o For core instance TRILL IS-IS frames, M = 0, Inner.MacDA == All-Rbridges, and the egress RBridge nickname field is not used. Such frames may be sent before nicknames have been established and are only sent one hop. The Egress RBridge Nickname MUST be set to zero by the source RBridge for such frames and is ignored by other RBridges.

3.5.2 Ingress RBridge Nickname

The ingress RBridge nickname contains the nickname of the ingress RBridge for data frames (Inner.MacDA != All-Rbridges) and for per-VLAN TRILL IS-IS frames (Inner.MacDA == All-Rbridges and an inner VLAN tag is present). For core TRILL IS-IS frames (Inner.MacDA == All-Rbridges and no inner VLAN tag is present) this field is not used and MUST be set to zero by the source RBridge for the control frame and ignored by other RBridges.

3.5.3 RBridge Nickname Allocation

The nickname allocation protocol is piggybacked on the core TRILL IS-IS instance as follows:

   o The nickname being used by an RBridge is carried in an IS-IS TLV (type-length-value data element) along with a priority of use value. Each RBridge chooses its own nickname.

   o The nickname value MAY be configured. An RBridge that has been configured with a nickname value will have priority for that nickname value over all Rbridges with non-configured nicknames.

   o The nickname values zero and 0xFFFF are reserved and may not be selected or configured.

   o The priority of use field reported with a nickname is an unsigned 8-bit value, where the most significant bit (0x80) indicates that the nickname value was configured. The bottom 7 bits have the default value 0x40, but MAY be configured to be some other value. Additionally, an RBridge MAY increase the priority (once) after holding the nickname for some amount of time, to prevent a newly arriving RBridge that has not yet seen all the LSPs from usurping its nickname, unless the new RBridge has been configured with the nickname value and the RBridge using that nickname value was not manually configured with that nickname value. The most significant bit of the priority MUST NOT be set unless the nickname value was configured.

   o Each RBridge is also responsible for ensuring that its nickname is unique.
     If RB1 chooses nickname x, and RB1 discovers, through receipt of RB2's LSP, that RB2 has also chosen x, then the RBridge with the numerically higher priority keeps the nickname, or if there is a tie with priority, the RBridge with the numerically higher System ID keeps the nickname, and the other RBridge MUST choose a new nickname.

   o If two RBridge campuses merge then transient nickname collisions are possible. As soon as each RBridge receives the link state frames from the other RBridges, the RBridges that need to change nicknames choose new nicknames that do not, to the best of their knowledge, collide with any existing nicknames.

To minimize the probability of nickname collisions, each RBridge chooses its nickname by randomly hashing some of its parameters. There is no reason for all Rbridges to use the same algorithm for choosing nicknames. Once an RBridge has successfully acquired a nickname it SHOULD store it in non-volatile memory and attempt to reuse it in the case of a reboot.

To minimize the probability of a new RBridge usurping a nickname already in use, an RBridge SHOULD wait to acquire the link state database from a neighbor before it announces its own nickname.

In IS-IS [ISO10589] a shared link is modeled as a pseudonode. Pseudonodes never act as ingress or egress RBridges and are never treated as distribution tree roots. Thus they do not need and do not have nicknames.

3.6 TRILL Header Options

The TRILL Protocol includes an option capability in the TRILL Header. The Op-Length header field gives the length of the options in units of 4 bytes which allows up to 124 bytes of options area. If Op-Length is zero there are no options present; else, the options follow immediately after the Ingress Rbridge Nickname field.
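
The Op-Length arithmetic can be sketched as follows: a 5-bit field counts at most 31 units of 4 bytes, hence the 124-byte limit, and the inner frame starts right after the fixed fields plus the options area. This is an illustration under stated assumptions, not a normative parser: it assumes the buffer starts at the first byte of the TRILL header (after the TRILL Ethertype), that the leftmost field of Figure 5 occupies the most significant bits, and the name split_trill_header is hypothetical.

```python
import struct
from typing import Tuple

FIXED_HEADER_LEN = 6          # 2 bytes of V/M/Op-Length/Hop Count + two 2-byte nicknames
MAX_OPTIONS_LEN = 4 * 0x1F    # largest 5-bit Op-Length (31) times 4 bytes


def split_trill_header(buf: bytes) -> Tuple[int, int, bytes, bytes]:
    """Return (egress nickname, ingress nickname, options area, inner frame).

    The options area is returned untouched so that an RBridge which does
    not implement any options can copy it unchanged.
    """
    if len(buf) < FIXED_HEADER_LEN:
        raise ValueError("buffer shorter than the fixed TRILL header")
    word, egress, ingress = struct.unpack("!HHH", buf[:FIXED_HEADER_LEN])
    op_length = (word >> 6) & 0x1F             # assumed bit position of Op-Length
    inner_offset = FIXED_HEADER_LEN + 4 * op_length
    if len(buf) < inner_offset:
        raise ValueError("buffer shorter than the indicated options area")
    return egress, ingress, buf[FIXED_HEADER_LEN:inner_offset], buf[inner_offset:]
```

For example, a header whose Op-Length is 2 carries 8 bytes of options, so the inner frame begins 14 bytes after the start of the TRILL header.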
All RBridges MUST be able to skip the number of 4-byte chunks indicated by the Op-Length field in order to find the inner frame, since RBridges must be able to find the destination MAC address and VLAN tag in the inner frame. (Transit RBridges need such information to filter IP multicast and VLANs, etc. Egress RBridges need to find the inner frame to correctly decapsulate and dispose of the inner frame.) All transit RBridges that do not implement any options MUST transparently copy the options area in frames they forward.

Options will be further specified in later documents and are expected to include provisions for hop-by-hop and ingress-to-egress options as well as critical and non-critical options. A critical option is one which must be understood to safely process a frame. A non-critical option can be safely ignored.

Warning: Most RBridges are expected to be implemented to optimize the simplest and most common cases of frame forwarding and processing. The inclusion of any options may, and the inclusion of complex or lengthy options almost certainly will, cause frame processing using a "slow path" with markedly inferior performance to "fast path" processing. Limited slow path throughput may cause some such frames to be lost.

4. Other RBridge Design Details

Section 3 above describes the TRILL Headers while this Section provides a textual and diagrammatic description of other RBridge design details. Section 5 below provides pseudo-code.

4.1 Ethernet Data Encapsulation

TRILL data frames in transit on Ethernet links are encapsulated with an outer Ethernet header (see Figure 3). This outer header looks, to a bridge on the path between two RBridges, like the header of a regular Ethernet frame and therefore bridges forward the frame without requiring any modification.
To enable RBridges to distinguish TRILL frames, a new Ethertype = TRILL (to be assigned) is used in the outer header. Figure 7 details a data frame with an outer VLAN tag traveling on the Ethernet cloud of Figure 2 from RB1 to RB2. This encapsulation has the advantage, in the absence of TRILL options, of aligning the original Ethernet frame at a 64-bit boundary.

When a TRILL data frame is carried over an Ethernet cloud it has three pairs of addresses:

o Outer Ethernet Header: Outer Destination MAC Address and Outer Source MAC Address: These addresses are used to specify the next hop RBridge, and the transmitting RBridge, respectively, over a shared Ethernet cloud.

o TRILL Header: Egress (RB2) Nickname and Ingress (RB1) Nickname. These specify the nickname values of the egress and ingress RBridges, respectively, for data frames.

o Inner Ethernet Header: Inner Destination MAC Address and Inner Source MAC Address: These addresses are as transmitted by the original end node, specifying, respectively, the destination and source of the inner frame.

It also potentially has two VLAN tags that can carry two different VLAN Identifiers and also include priority.

Outer Ethernet Header:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              Outer Destination MAC Address (RB2)              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Outer Destination MAC Address | Outer Source MAC Address      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                Outer Source MAC Address (RB1)                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Ethertype = IEEE 802.1Q       | Outer.VLAN Tag Information    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
TRILL Header:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Ethertype = TRILL             |   V   |M|Op-Length| Hop Count |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Egress (RB2) Nickname         | Ingress (RB1) Nickname        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Inner Ethernet Header:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 Inner Destination MAC Address                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Inner Destination MAC Address | Inner Source MAC Address      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Inner Source MAC Address                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Ethertype = IEEE 802.1Q       | Inner.VLAN Tag Information    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Payload:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                   Original Ethernet Payload                   |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Frame CheckSum:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   New FCS (Frame CheckSum)                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 7. TRILL Data Encapsulation over Ethernet

4.1.1 VLAN Tag Information

The information in a "VLAN Tag", also known as a "Q-tag", is more than just a VLAN ID.
It always includes a priority field as shown in Figure 8. In fact, the "VLAN ID" may be zero, indicating that no VLAN is specified, just priority, although such a tag is properly called a "priority tag" rather than a "VLAN Tag" [802.1Q].

+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| Priority  | C |                 VID (VLAN ID)                 |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Figure 8. VLAN Q-Tag Information

As recommended in [802.1Q], RBridges SHOULD be implemented so as to allow use of the full range of VIDs from 1 through 0xFFE. VID zero is the null VLAN identifier and indicates that no VLAN is specified while VID 0xFFF is reserved. RBridges MAY support a smaller number of simultaneously active VLAN IDs than the total number of different VLAN IDs they allow.

The "C" bit shown in Figure 8 is the CFI or Canonical Format Indicator bit. It refers to the format of the associated source and destination addresses. The CFI is not used with IEEE 802.3. In TRILL, it MUST be set to zero and is ignored by receivers.

As specified in [802.1Q], the priority field contains an unsigned value from 0 through 7 where 1 indicates the lowest priority, 7 the highest priority, and the default priority zero is considered to be higher than priority 1 but lower than priority 2. Devices, including RBridges, are not required to implement 8 priority levels so frames with different priority levels may be treated as if they had the same priority. Differing priorities can cause frame re-ordering.

The Q-Tag Ethertype is 0x8100.

4.1.2 Outer VLAN Info

The "Outer VLAN Info" field carries the outer VLAN tag and may or may not be present. If present, it specifies a priority and may be required to specify a VLAN to enable connectivity between two RBridges through an Ethernet cloud that supports VLANs.
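
As an informal aid to reading Section 4.1.1, the following Python sketch splits the 16 bits of VLAN Tag Information shown in Figure 8 into their fields and expresses the priority ordering given there. It is a sketch only: the function names and the returned labels are assumptions made for this example and are not defined by TRILL or [802.1Q].

```python
# Illustrative sketch only; function names and labels are hypothetical.

def parse_vlan_tag_info(tci: int) -> dict:
    """Split the 16 bits of VLAN Tag Information (Figure 8) into fields."""
    if not 0 <= tci <= 0xFFFF:
        raise ValueError("VLAN Tag Information is a 16-bit value")
    return {
        "priority": (tci >> 13) & 0x7,  # top 3 bits
        "cfi": (tci >> 12) & 0x1,       # "C" bit; set to zero in TRILL
        "vid": tci & 0xFFF,             # low 12 bits
    }


def priority_rank(priority: int) -> int:
    """Map a priority value 0..7 to its relative rank (0 = lowest).

    Priority 1 is the lowest, the default priority 0 ranks between
    priorities 1 and 2, and priority 7 is the highest.
    """
    order = [1, 0, 2, 3, 4, 5, 6, 7]
    return order.index(priority)


def vid_kind(vid: int) -> str:
    """Classify a 12-bit VID according to Section 4.1.1."""
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VID is a 12-bit value")
    if vid == 0x000:
        return "priority-tag"   # null VLAN ID: priority only
    if vid == 0xFFF:
        return "reserved"
    return "vlan"               # 1 through 0xFFE
```

For example, the 16-bit value 0xE004 carries priority 7, a zero C bit, and VID 4.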
Once two RBridges have established connectivity on an outer VLAN, they become adjacent and they start to operate as if connected by a direct link.

For example, a network manager may configure VLAN 4 for RBridges RB1 and RB2 to communicate (the outer VID contains the value 4). VLAN 3 may be assigned for RB2 and RB3 to communicate (the outer VID contains the value 3). In this case RB2 becomes adjacent to both RB1 (on VLAN 4) and RB3 (on VLAN 3), but RB1 and RB3 are not adjacent (since they have no common VLAN).

The Designated RBridge election (see Section 4.2.4) can be run by RB1 on a given port multiple times, up to once for each VLAN that RB1 is configured to support on that port. RB1 MAY be separately configured with a Designated RBridge priority for each VLAN/port pair that it supports and may see different adjacencies for different VLANs. Therefore, RB1 may get elected DRB for some VLANs on a port, and not for others. The IS-IS Hello messages on a port MUST be transmitted with the VLAN ID in the "Outer.VLAN Info" field set to the VLAN for which the election is being run.

The priority field in the Outer VLAN Info is set on an outgoing TRILL frame to a copy of the priority field in the Inner VLAN Info for data frames or to 7, the highest priority, for TRILL IS-IS frames.

4.1.3 Inner VLAN Info

The "Inner VLAN Info" field contains the VLAN information associated with the original native frame when it was ingressed or the VLAN information associated with a per VLAN IS-IS message when that message was created.

When a TRILL frame with Inner VLAN Info arrives, that Inner VLAN Info is not changed. When a native (non-TRILL) frame arrives, the priority and VLAN in the Inner VLAN Info are determined as specified in [802.1Q] (see [802.1Q] Section 6.7).
A high level informative summary of how this VLAN Info is determined, omitting some details, is given in the bulleted items below:

o When an untagged native frame arrives, a zero configuration RBridge associates the default priority, zero, and the VLAN ID 1 with it. It actually sets the VLAN for the untagged frame to be the "port VLAN ID" associated with that port. The port VLAN ID (PVID) defaults to VLAN ID 1 but may be configured to be any other VLAN ID. An RBridge may also be configured on a per port basis to discard such frames or to associate a different priority with them. Determination of the configured port VLAN IDs may also be made dependent on the Ethertype or NSAP (referred to in 802.1 as the Protocol) of the arriving frame.

o When a priority tagged native frame arrives, a zero configuration RBridge associates the port VLAN ID, which defaults to 1, and the priority provided in the frame with it. An RBridge may be configured on a per port basis to discard such frames or to associate them with a different VLAN ID as described in the point above. It may also be configured to map the priority provided in the frame by specifying, for each of the eight possible priorities that might be in the frame, what actual priority will be associated with the frame by the RBridge.

o When a Q-tagged native frame arrives, a zero configuration RBridge associates with it the VID and priority in the Q-tag. An RBridge may be configured on a per port per VLAN basis to discard such frames. It may also be configured on a per port basis to map the priority as specified above for priority tagged frames.

In 802.1, the process of assigning a priority to a frame, including mapping a priority provided in the frame to another priority, is referred to as priority "regeneration".

Thus, in TRILL, the Inner VLAN Tag always specifies a VLAN ID.
This Inner VLAN ID is required at every ingress RBridge as one element in determining the appropriate egress RBridge for a known unicast frame and is required at the ingress and every transit RBridge for multi-destination frames to correctly prune the distribution tree.

Note that the VLAN ID 0xFFF is reserved and MUST NOT be used. RBridges MUST discard any frame they receive which is tagged as being in VLAN 0xFFF.

4.1.4 Frame CheckSum (FCS)

Each frame has a single Frame CheckSum (FCS) that is computed to cover the entire frame. It is calculated before transmission and checked on receipt. Any frame for which the FCS fails is discarded. The FCS is generally recalculated on every hop due to changes such as the decrementing of the hop count.

4.2 Link State Protocol (IS-IS)

TRILL uses IS-IS as the routing protocol, since it has the following advantages:

o it runs directly over layer 2, and therefore may be run with zero configuration (no IP addresses need to be assigned);

o it is easy to extend by defining new TLV (type-length-value) encoded data elements for carrying TRILL information.

IS-IS has three types of packets: LSPs (Link State PDUs), Hellos (for finding neighbors and running the Designated RBridge election protocol), and SNPs (sequence numbers packets, for acknowledging one or more LSPs).

4.2.1 IS-IS RBridge Identity

Each RBridge has a unique 6-byte IS-IS System ID, which may be derived from any of the RBridge's unique MAC addresses.

4.2.2 Distinguishing IS-IS Instances

TRILL implements separate IS-IS instances from the one used by layer 3, that is, different from the one used by IP routers. TRILL IS-IS messages are distinguished from layer 3 IS-IS messages because TRILL IS-IS frames have a TRILL header and use a distinct, constant Area Address that would never appear as a real layer 3 IS-IS area address. This Area Address is the value zero.
All TRILL IS-IS frames have the Inner.MacDA == All-Rbridges.

Within TRILL, there is a mandatory core IS-IS instance across all RBridges in the campus and optional per VLAN instances between the RBridges on each supported VLAN. They are distinguished by the presence of an inner VLAN tag in the per VLAN instance frames and the absence of such a tag in the core instance frames.

All RBridges must participate in the core IS-IS instance. Core IS-IS instance frames are never forwarded by an RBridge but are decapsulated and locally processed. (Such processing may cause the RBridge to emit additional core IS-IS instance frames.)

RBridges that are the Designated RBridge for a link having an end station in a particular VLAN MAY participate in the per VLAN IS-IS instance for that VLAN. But all transit RBridges MUST properly forward per VLAN IS-IS instance frames. Because of this forwarding, it appears to a per VLAN IS-IS instance at an RBridge that it is directly connected by a shared link to all other RBridges in the campus running that per VLAN IS-IS instance. Egress RBridges that do not implement the per VLAN IS-IS instance for that VLAN do not decapsulate or locally process any per VLAN IS-IS frames they receive.

4.2.3 TRILL IS-IS Information

The information in the IS-IS link state for the mandatory core and optional per-VLAN TRILL IS-IS instances is listed below. The actual encoding of this information and the IS-IS Type values for any new IS-IS TLV data elements will be specified in a separate document.

4.2.3.1 Core IS-IS Information

The information contained in the LSP of RBridge RBn for the mandatory core IS-IS instance is as follows:

1. The IS-IS System IDs of RBridges which are neighbors of RBridge RBn, and the cost of the link to each of those neighbors.

2. The nickname of RBridge RBn (2 bytes) and the unsigned 8-bit priority for RBn to have that nickname (see Section 3.5.3).

3. The TRILL Header variations supported by RBridge RBn (16 bits).

4. A flag RequestTree indicating whether RBridges MUST calculate a tree rooted at RBn (default RequestTree = TRUE).

5. The list of RBridge nicknames that RBn might select for a distribution tree when RBn injects a multi-destination frame into the campus. The purpose of this field is so that RBridges can efficiently build receipt filters to avoid multicast loops (see Section 4.3.1).

6. The list of VLAN IDs of VLANs directly connected to RBn for links on which RBn is DRB. (Note: an RBridge may advertise that it is connected to additional VLANs in order to receive additional information to support certain VLAN based features beyond the scope of this specification as discussed in Section 4.7.) In addition, the LSP contains the following information on a per VLAN basis.

6.1 Multicast Router attached: This is two bits of information per VLAN that indicate whether there is an IPv4 and/or IPv6 multicast router attached to the RBridge on that VLAN. An RBridge which does not do IP multicast control snooping MUST set both of these bits (see Section 4.4.3). This information is used because IGMP [RFC3376] and MLD [RFC2710] Membership Reports MUST be transmitted to all links with IP multicast routers, and SHOULD NOT be transmitted to links without such routers. Also, all frames for IP-derived multicast addresses MUST be transmitted to all links with IP multicast routers (within a VLAN), in addition to links from which an IP node has explicitly asked to join the group the frame is for.

6.2 Optionally, Layer 2 addresses derived from IPv4 IGMP or IPv6 MLD notification messages received from attached end nodes on each VLAN, indicating the location of listeners for these multicast addresses (see Section 4.4.3).

6.3 Optionally, RBn MAY announce the set of IDs of Root bridges for links for which RBn is DRB for that VLAN. This is to quickly detect cases where two layer 2 clouds accidentally get merged, and where there might otherwise temporarily be two DRBs for the same VLAN on the same link. (See Section 4.2.4.)

7. Optionally, the list of VLAN groups, where each VLAN group is a list of VLAN IDs in which the first VLAN ID listed is the "primary" and the others are "secondary". This is solely to detect misconfiguration of features outside the scope of this document. RBridges that do not support features such as "shared VLAN learning" ignore this field (see Section 4.7).

Using this information each RBridge can compute the optimal pair-wise forwarding for known-unicast traffic (the Forwarding Database) and the distribution trees for multi-destination traffic.

The distribution of multi-destination frames (see Sections 4.3 and 4.4.3) SHOULD also be pruned according to the list of VLAN IDs connected to each RBridge and for IP based multicast optimization (see Section 4.3.2). If RBn is forwarding a multi-destination frame tagged with VLAN A, RBn SHOULD NOT forward it onto branches of the distribution tree that have no downstream VLAN A links.

4.2.3.2 Optional Per-VLAN IS-IS Instance Information

The information in the LSP for the optional per VLAN TRILL IS-IS instances is the list of local end station MAC addresses known to the originating RBridge and, for each such address, a one byte unsigned "confidence" rating in the range 0-254 (see Section 4.6).

4.2.4 Designated RBridge

IS-IS elects one RBridge for each link / VLAN pair to be the Designated RBridge (DRB), i.e., to have special duties.
The Designated RBridge:

o learns and may advertise the identities of attached end nodes;

o encapsulates and forwards frames that originate on that link to the rest of the campus;

o decapsulates and forwards frames received from other RBridges onto that link;

o learns and caches (ingress RBridge, source MAC address) from frames it is decapsulating onto the link.

It is incorrect to have multiple RBridges being Designated RBridge on the same link at the same time, unless they are Designated for different VLANs. Multiple DRBs could temporarily happen if a partitioned bridged LAN became connected with a bridge or repeater. The situation resolves once the better priority RBridge's IS-IS Hello is received by the other RBridges on the link. However, it is desirable to have this situation resolve as quickly as possible, because if there are multiple DRBs, RB1 and RB2, RB1 might encapsulate and forward into the campus frames that RB2 forwarded onto that link from the campus.

BPDUs (Bridge Protocol Data Units) are messages that are transmitted and received even in preforwarding state (listening and learning states). If RBridges listen to BPDUs, and if the LANs for which RB1 was Designated RBridge and for which RB2 was Designated RBridge get joined, then either RB1 or RB2 can detect that the bridge Root has changed identity.

A conservative solution would be to invoke something like a preforwarding state, in which the RBridge that detects that the identity of the root bridge has changed stops forwarding native frames to or from the link until it is sure the IS-IS link election would have completed. But the IS-IS election could get slowed down due to bridges in preforwarding state, and it would be undesirable to disrupt traffic to and from the link just because the root ID has changed.
An alternative solution is to have RBridges participate in the spanning tree election, with higher priority for becoming root (actually, lowest numerical priority value) than any of the 802.1 bridges, and with the same priority as for becoming Designated RBridge on the link. Then an RBridge is Designated RBridge if and only if it is the spanning tree Root.

Note that RBridges MUST NOT merge spanning trees from different ports. If two ports of RB1, p1 and p2, are connected to the same bridged LAN, RB1 will receive the IS-IS Hello message it transmitted on p1 on p2, and likewise, when it transmits an IS-IS Hello message on p2, it will receive it on p1. The IS-IS Hello must contain a port identifier, unique for each of RB1's ports, and if RB1 receives its own Hellos on a different port, then RB1 becomes DRB for at most one of those (connected) ports.

So for example, RB1 sends BPDUs on each of its ports, with itself as Root (with highest, i.e., numerically lowest priority), 0 cost from Root, and the port ID. There are several possible cases:

o RB1 is the highest priority RBridge on the bridged LAN, in which case it becomes spanning tree Root and Designated RBridge.

o RB1 receives a BPDU from itself (because two of its ports are on the same shared medium without any bridges between). In this case, the numerically lowest port remains in spanning tree forwarding state, and the other port(s) go into spanning tree blocking state.

o RB1 receives a BPDU from someone else with higher priority (numerically lower priority|ID), in which case RB1 is not Root, and not Designated RBridge. It is possible this is due to a bridge being configured with the lowest priority, and then if RB1 declines being Designated RBridge, the LAN becomes orphaned from the campus.
We must treat this case as a misconfiguration of bridges, and the LAN becomes orphaned until the misconfiguration is corrected, but an RBridge could in theory eventually discover it is not receiving any IS-IS Hellos, and become Designated RBridge even though it is not spanning tree Root.

RBridges MAY participate in the bridge spanning tree protocol as described above, and become Designated RBridge if and only if they are spanning tree root. If an RBridge RB1 does not participate in the bridge spanning tree then it SHOULD listen to bridge spanning tree messages, and if the root bridge ID changes from B1 to B2 on VLAN X, then RB1 SHOULD look at link state packets from other RBridges to see if any other RBridges report connectivity to VLAN X, bridge B2. If this is the case, then RB1 SHOULD delay before forwarding traffic to or from the link with new root bridge B2 until the IS-IS Designated RBridge election protocol has a chance to complete.

If an RBridge participates in spanning tree, a port MUST NOT block TRILL Ethertype frames from being received or transmitted when it is in spanning tree blocked state, although this will stop the receipt and transmission of native (non-TRILL) data frames.

4.3 Distribution Trees

RBridges use distribution trees to forward multi-destination frames (see Section 2.3.2). Distribution Trees are bidirectional.

A single distribution tree is logically enough for the entire campus. The TRILL WG decided that the computation of additional distribution trees was warranted because:

1. using a tree rooted at the ingress RBridge optimizes the distribution path and (almost always) the cost of delivery when the number of destination links is a subset of the total number of links, as is the case with VLANs and IP multicasts;

2. for unknown unicast destinations, using a tree rooted at the ingress RBridge minimizes out-of-order delivery because, in the case where a flow starts before the location of the destination is known by the RBridges, the path to the destination is the same as the shortest path to the destination.

A distribution tree rooted in the ingress RBridge is not always the best choice:

1. In some cases, a different tradeoff might be wanted in terms of the expense of computing many trees vs. optimality of traffic distribution (so fewer trees would be desired).

2. It might be desirable to allow choosing a different distribution tree than the one rooted at the ingress RBridge for some frames in order to allow multipathing of multicast traffic injected by a particular RBridge.

RBridges MUST calculate at least one distribution tree, and by default SHOULD compute one distribution tree for every RBridge. However, to scale in the presence of a large number of RBridges in a campus, some RBridges MAY be configured to not be the root of a distribution tree.

Each RBridge RBi announces whether RBridges MUST compute a tree rooted at RBi via the RequestTree flag in its IS-IS instance LSP. The default is RequestTree == TRUE, but management configuration MAY reduce the number of trees. If all RBridges have their RequestTree == FALSE, then each RBridge MUST calculate a tree rooted at the RBridge with lowest ID.

If RBi is a tree root, then any RBridge RBn that needs to send multi-destination traffic MAY select the RBi-tree by specifying RBi as the egress Nickname in the TRILL header. However, RBn MUST announce, in its LSP, an intention to use RBi as a tree root if RBn ever chooses the RBi-tree. All the other RBridges MUST comply with the decision of the RBridge RBn.

In IS-IS a shared link is modeled as a pseudonode. The RBridge acting as designated RBridge for a shared link MUST set RequestTree = FALSE in the pseudonode LSP.

4.3.1 Distribution Tree Calculation and Checks

RBridges do not use the spanning tree protocol to calculate distribution trees.
Instead, distribution trees are calculated based on the link state information, selecting a particular RBridge as the root.

Calculation of a tree rooted at RBi is done independently by each RBridge RBn by performing the SPF (Shortest Path First) calculation with RBi as the root without requiring any additional exchange of information.

When a node RBn has two or more minimal equal cost paths toward the Root RBi, a deterministic tie-breaker is needed to guarantee that all RBridges calculate the same distribution tree. This is obtained by selecting the path that goes to the parent that has the lower IS-IS System ID.

Each RBridge RBn keeps a set of adjacencies (port, neighbor pairs) for each distribution tree. One of these adjacencies is toward the root RBi and the others are toward the leaves. Once the adjacencies are chosen, it is irrelevant which ones are towards the root RBi, and which are away from RBi.

Let's suppose that RBn has calculated that adjacencies a, c, and f are in the RBi tree. A multi-destination frame for the distribution tree RBi is received only from one of the adjacencies a, c, or f (otherwise it is discarded) and forwarded to the other two adjacencies.

To further avoid temporary multicast loops during topology changes, RBridges MUST do a sanity check that a multi-destination frame arrives on the expected link. This is called the Reverse Path Forwarding Check and is done as follows. When RBn calculates the RBi tree, for each adjacency in the RBi tree, RBn lists the possible ingress RBridge nicknames on that adjacency. The only ingress RBridges that appear on any of the adjacencies are RBridges that have explicitly stated, in their LSP, that they may select RBi as a distribution tree.
If a multi-destination frame is received on a particular adjacency, marked as the RBi-tree, then RBn MUST NOT forward it if the ingress RBridge is not listed in the allowed list of ingress RBridges for that adjacency for that tree.

4.3.2 Pruning the Distribution Tree

Each distribution tree SHOULD be pruned per VLAN, eliminating branches that have no potential receivers downstream. Multi-destination frames SHOULD only be forwarded on branches that are not pruned.

Further pruning SHOULD be done in the case of IGMP [RFC3376], MLD [RFC2710], and MRD [RFC4286] messages, where these are to be delivered only to ports with IP Multicast routers. In the case of a multicast derived from an IP multicast, these multicast data frames are delivered only to links that have registered listeners, plus links which have IP Multicast routers.

Let's assume that RBridge RBn knows that adjacencies (a, c, and f) are in the RBi-distribution tree. RBn marks pruning information for each of the adjacencies in the RBi-tree. For each adjacency and for each tree, RBn marks:

o the set of VLANs reachable downstream, and for each one of those, a flag indicating whether there are IPv4 or IPv6 multicast routers downstream, and

o the set of layer 2 multicast addresses derived from IP multicast groups for which there are receivers downstream.
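
The per-adjacency state described in Sections 4.3.1 and 4.3.2 can be pictured with the following informal Python sketch. It is not part of this specification: the class, field, and function names are assumptions made for the example, adjacencies are identified by simple labels such as "a", "c", and "f", and only the Reverse Path Forwarding Check and the per-VLAN pruning are shown (the IP multicast refinements are omitted).

```python
# Illustrative sketch only; class, field, and function names are hypothetical.
from dataclasses import dataclass, field


@dataclass
class AdjacencyInfo:
    """State RBn keeps for one adjacency in one distribution tree."""
    # Ingress nicknames that may legitimately arrive here (Section 4.3.1).
    allowed_ingress: set = field(default_factory=set)
    # VLAN ID -> (IPv4 multicast router downstream, IPv6 router downstream).
    vlans: dict = field(default_factory=dict)
    # Layer 2 multicast addresses with receivers downstream.
    groups: set = field(default_factory=set)


def rpf_check(tree: dict, arrival_adj: str, ingress_nickname: int) -> bool:
    """Accept a frame only if it arrived on an adjacency of the tree and
    its ingress nickname is in the allowed list for that adjacency."""
    info = tree.get(arrival_adj)
    return info is not None and ingress_nickname in info.allowed_ingress


def vlan_pruned_outputs(tree: dict, arrival_adj: str,
                        ingress_nickname: int, vlan: int) -> list:
    """Return the adjacencies a frame for 'vlan' is forwarded onto."""
    if not rpf_check(tree, arrival_adj, ingress_nickname):
        return []  # the frame is not forwarded
    return sorted(adj for adj, info in tree.items()
                  if adj != arrival_adj and vlan in info.vlans)
```

Here "tree" is assumed to be a dictionary, built after the SPF calculation, that maps each adjacency in the RBi-tree to its AdjacencyInfo record.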
4.3.3 Forwarding Using a Distribution Tree

Forwarding a multi-destination data frame is done as follows:

o The RBridge RBn receives a multi-destination frame with inner VLAN A and the TRILL header indicates the selected tree is the RBi-tree;

o if the adjacency from which the frame was received is not one of the adjacencies in the RBi-tree for the specified ingress RBridge, the frame is dropped (see Section 4.3.1);

o else if the frame is an IGMP or MLD announcement message or an MRD query message then the frame is forwarded onto adjacencies in the RBi-tree that indicate there are downstream VLAN A IPv4 or IPv6 multicast routers respectively (for more information see Section 4.4);

o else if the frame is for a layer 2 multicast address derived from an IP multicast group then the frame is forwarded onto adjacencies in the RBi-tree that indicate there are downstream VLAN A IP multicast routers, as well as adjacencies that indicate there are downstream VLAN A receivers for that group address (see Section 4.4);

o else (the inner frame is for an unknown destination or layer 2 multicast not derived from IP multicast or broadcast) the frame is forwarded onto an adjacency if and only if that adjacency is in the RBi-tree, and marked as reaching VLAN A links.

For each link for which RBn is Designated RBridge, RBn additionally checks to see if it should decapsulate the frame and send it to the link, or process the frame.

The per-VLAN instance of IS-IS frames will be delivered only to RBridges which are Designated RBridges for that VLAN. Per-VLAN TRILL IS-IS messages look, to transit RBridges, like any multicast data packet tagged with an inner VLAN tag. Such packets will be multicast throughout the campus, like any other multicast data packets, on the distribution tree chosen by the RBridge which injected the per-VLAN IS-IS message, and pruned according to the inner VLAN tag so that it is received by all the RBridges who are DRB for a link in that VLAN.

4.4 Forwarding Behavior

This section describes RBridge behavior for a variety of received frames, including how they are forwarded when appropriate.

4.4.1 Receipt of a Native Frame

An RBridge can tell that it has received a native frame because it does not have a TRILL Ethertype. The ingress RBridge RB1 determines the VLAN ID according to the same rules as 802.1 bridges do (see Section 4.1.3). Once the VLAN is established, if RB1 is not the Designated RBridge (DRB) for the link from which the frame was received for that VLAN, it is discarded. If it is DRB, then it is forwarded according to Section 4.4.1.1 if the frame is unicast, and Section 4.4.1.2 if it is multicast or broadcast.

4.4.1.1 Native Unicast Case

If the destination MAC address of the native frame is a unicast address, the following steps are performed.

The layer 2 destination address D is looked up in the Encapsulation Database for that VLAN to find the egress RBridge RBm, or discover that D is unknown.

If D is known, with egress RBm, then RB1 converts the native frame to a TRILL data frame with outer MAC addresses from RB1 unicast to the next hop RBridge towards RBm and a TRILL header with V = 0 and M = 0, the ingress nickname for itself, and the egress nickname for RBm.

If D is unknown, RB1 converts the native frame to a TRILL data frame with outer MAC addresses of RB1 as source and the All-Rbridges multicast address as destination and a TRILL header with the variation field V = 1 (indicating that VLAN pruning is known to be the only pruning appropriate during tree distribution), the multi-destination bit M = 1, the ingress nickname for itself, and the egress nickname for the root of the distribution tree it wants to use. The default is for RB1 to write its own nickname into the egress nickname field.
However, RB1 MAY choose a different distribution tree if either RB1 has not elected to be a tree root, or if RB1 has been configured to path-split multicast. In that case RB1 MUST select a tree by specifying an RBridge that has elected to be a tree root. Also, RB1 MUST select a tree that RB1 has announced (in RB1's own LSP) to be one of the ones that RB1 MAY choose as a distribution tree (see Section 4.3.1).

4.4.1.2 Native Multicast and Broadcast Frames

If the destination address of a native frame is the broadcast address or a multicast address other than All-RBridges, the frame is processed as described below. A native (non-TRILL) frame sent to the All-RBridges address is erroneous and is discarded.

If the frame is an IGMP [RFC3376], MLD [RFC2710], or MRD [RFC4286] message, then RB1 SHOULD analyze the frame, learn any group membership or IP multicast router presence indicated, and announce that information for the appropriate VLAN in its IS-IS link state (see Section 4.5).

For all such frames, RB1 also chooses a distribution tree, encapsulates, and forwards the frame on the pruned distribution tree. In the encapsulation, M = 1. V is set to 1 if the Inner.MacDA is not an IP derived multicast address and to the appropriate value (see Section 4.4.2.2.2) if it is an IP derived multicast address. The Outer.MacSA is set to that of the port on which the frame is being transmitted and the Outer.MacDA is normally the All-RBridges multicast address; however, if for any particular port there is only one next hop RBridge, the frame MAY be sent with the unicast Outer.MacDA of the target RBridge. Using a unicast Outer.MacDA is of no benefit on a point-to-point link but may result in substantial savings if the link is actually a complex bridged LAN.

4.4.2 Receipt of a Non-Native (TRILL) Frame

Non-native frames are indicated by a TRILL outer Ethertype.
Such frames will be received with an Outer.MacDA that is unicast or that is the All-RBridges multicast address. TRILL frames with any other Outer.MacDA are erroneous and are discarded, except that a TRILL frame with the broadcast Outer.MacDA MAY be treated as if the Outer.MacDA were the All-RBridges multicast address. TRILL frames received by an RBridge on a port are processed regardless of that RBridge's DRB status for that port.

If the Outer.MacDA is a unicast address, the frame is discarded unless that address is the address of the receiving RBridge. (Such discarded frames are most likely addressed to another RBridge on a multi-access link and that other RBridge will handle them.) After this check, further processing of TRILL frames is independent of the Outer.MacDA.

If the V field in the TRILL Header is greater than 3, the frame is discarded.

The Inner.MacDA is then tested. If it is the All-RBridges multicast address, processing proceeds as in Section 4.4.2.1 below. If it is any other address, processing proceeds as in Section 4.4.2.2.

4.4.2.1 TRILL IS-IS Frames

If there is no Inner VLAN tag, it is a core instance TRILL IS-IS frame and is processed by the core IS-IS instance on RBn and is not forwarded. Note that in this instance, nicknames may not yet have been established and the ingress and egress nickname fields are ignored.

If there is an Inner VLAN tag, it is a per VLAN instance TRILL IS-IS frame. If M == 0 or V != 1, the frame is discarded. The egress nickname will designate an appropriate distribution tree. In this case, the frame is forwarded as described in Section 4.4.2.2.2. In addition, if the forwarding RBridge is a DRB for a link in the specified VLAN, the inner frame is decapsulated and provided to the local per VLAN IS-IS instance for that VLAN.
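
The initial checks of Section 4.4.2 can be sketched in Python as follows. This is an illustrative sketch only, not part of the protocol: the names TrillFrame, Action, and classify_trill_frame are hypothetical, and the All-RBridges address is passed in as a parameter because this document does not give its value.

```python
from dataclasses import dataclass
from enum import Enum

BROADCAST = bytes.fromhex("ffffffffffff")


class Action(Enum):
    DISCARD = "discard"
    ISIS = "continue in Section 4.4.2.1"
    DATA = "continue in Section 4.4.2.2"


@dataclass(frozen=True)
class TrillFrame:
    """Hypothetical container for the fields the initial checks look at."""
    outer_mac_da: bytes
    inner_mac_da: bytes
    v: int  # V field of the TRILL header


def is_unicast(mac: bytes) -> bool:
    # The group bit is the least significant bit of the first octet.
    return (mac[0] & 0x01) == 0


def classify_trill_frame(frame: TrillFrame, self_mac: bytes,
                         all_rbridges: bytes,
                         accept_broadcast: bool = False) -> Action:
    da = frame.outer_mac_da
    if is_unicast(da):
        # Unicast frames for some other RBridge on the link are discarded.
        if da != self_mac:
            return Action.DISCARD
    elif da == BROADCAST:
        # A broadcast Outer.MacDA is erroneous but MAY be accepted.
        if not accept_broadcast:
            return Action.DISCARD
    elif da != all_rbridges:
        return Action.DISCARD
    # From here on, processing is independent of the Outer.MacDA.
    if frame.v > 3:
        return Action.DISCARD
    if frame.inner_mac_da == all_rbridges:
        return Action.ISIS
    return Action.DATA
```

The accept_broadcast flag models the MAY in the text above; an implementation that never accepts broadcast TRILL frames simply leaves it at its default.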

4.4.2.2 TRILL Data Frames

The port on which the frame was received is first checked and the frame discarded unless there is an IS-IS adjacency on that port.

The Inner.MacDA is then checked. If it is unicast, processing continues as described in Section 4.4.2.2.1, otherwise processing continues as described in Section 4.4.2.2.2.

4.4.2.2.1 Unicast TRILL Data Frames

If M == 1 the frame is discarded.

Generally, the hop count is decremented by one and the frame forwarded to the next hop RBridge towards the egress RBridge, using the Forwarding Database, unless the hop count was reduced to zero, in which case the frame is discarded.

On the other hand, if the egress RBridge indicated is the RBridge performing the processing (RBn), the frame being forwarded is reconverted to native form. This frame is then either sent onto the link containing the destination or locally processed if the RBridge itself is the destination.

4.4.2.2.2 Multi-Destination TRILL Data Frames

If M == 0, the frame is discarded.

The Outer.MacSA is then checked and the frame discarded if it is not a tree adjacency for the tree indicated by the egress RBridge nickname or if the RPF check fails (see Section 4.3.1).

The frame is then forwarded down the tree specified by the egress RBridge nickname, pruned as follows:

   V = 3, the tree SHOULD be pruned on VLAN and to branches with downstream IPv4 multicast routers if the Inner.MacDA is IPv4 derived multicast or downstream IPv6 multicast routers if the Inner.MacDA is IPv6 derived multicast.

   V = 2, the tree SHOULD be pruned on VLAN and to branches with downstream IPv4 multicast routers or with IPv4 multicast listeners for a group from which the Inner.MacDA would be derived, and similarly for IPv6.

   V = 1, the tree SHOULD be pruned on VLAN only.
   The frame is either broadcast, a non-IP derived multicast, or an IP derived multicast derived from an IP address for which multicast group membership reports are not issued (see Section 4.4.3).

   V = 0, the tree SHOULD be pruned on VLAN but the native frame has not been fully analyzed from the point of view of multicast optimization. The processing RBridge SHOULD complete this analysis, set V to some value from 1 through 3, and use that pruning. However, if it chooses not to do this analysis, it can either do no multicast optimization or do a more limited optimization, for example based only on the Inner.MacDA.

In the forwarded frame, the Outer.MacSA is set to that of the port on which the frame is being transmitted and the Outer.MacDA is normally the All-RBridges multicast address; however, if for any particular port there is only one next hop RBridge, the frame MAY be sent with a unicast Outer.MacDA. Using a unicast Outer.MacDA is of no benefit on a point-to-point link but may result in substantial savings if the link is actually a complex bridged LAN.

4.4.3 Tree Distribution Optimization

RBridges MUST determine the VLAN associated with all native frames and properly enforce VLAN rules on the emission of native frames at egress RBridges according to how they are configured. They SHOULD also prune the distribution tree of multi-destination frames according to VLAN. But, since they are not required to do such pruning, they may receive TRILL data frames that should have been pruned earlier in the tree distribution. They silently discard such frames. A campus may contain some RBridges that prune on VLAN and some which do not.

The situation is more complex for multicast. RBridges SHOULD analyze IP derived multicast frames, and learn and announce listeners and IP multicast routers for such frames as discussed in Section 4.5 below.
And they SHOULD prune the distribution of IP derived multicast frames based on such learning and announcements. But, as with VLANs, they are not required to prune and, unlike VLANs, they are not required to learn. A campus may contain a mixture of RBridges with different levels of IP derived multicast optimization. An RBridge may receive IP derived multicast frames that should have been pruned earlier in the tree distribution. It silently discards such frames.

An RBridge that does not examine IP derived native multicast frames that it ingresses MUST advertise that it has IPv4 and IPv6 IP multicast routers attached for all the VLANs for which it is a DRB. It need not advertise any IP derived multicast listeners. This will cause all IP derived multicast traffic to be sent to this RBridge for those VLANs. It then egresses that traffic onto the links for which it is DRB where the VLAN of the traffic matches the VLAN for which it is DRB on that link. This may cause the suppression of certain IGMP membership report messages from end stations, but that is not significant as any multicast traffic such reports would be requesting will be sent to such end stations under these circumstances.

When an IP derived multicast frame is fully examined at ingress, the V field of the TRILL header is set to indicate the pruning which should apply (see Section 4.4.2.2.2). If this analysis is not performed at ingress, V will be zero in the TRILL data frame. Transit RBridges may distribute such a multi-destination frame without pruning, or perform full or partial analysis of the frame, possibly set V, and forward based on such analysis.

See also "Considerations for Internet Group Management Protocol (IGMP) and Multicast Listener Discovery (MLD) Snooping Switches" [RFC4541].
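
The pruning choices keyed on the V field (Section 4.4.2.2.2) can be illustrated with the following Python sketch. It is a sketch under stated assumptions rather than a definitive implementation: the Adjacency record and the function names are hypothetical, the MAC prefixes are the standard IPv4 (01-00-5E) and IPv6 (33-33) multicast mappings, and for V = 0 the sketch takes the permitted option of doing no multicast optimization.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class Adjacency:
    """Hypothetical per-adjacency state learned from link state."""
    vlans: Set[int] = field(default_factory=set)
    ipv4_router_vlans: Set[int] = field(default_factory=set)
    ipv6_router_vlans: Set[int] = field(default_factory=set)
    # (VLAN, group MAC) pairs with downstream listeners.
    listeners: Set[Tuple[int, bytes]] = field(default_factory=set)


def ip_version(mac: bytes) -> Optional[int]:
    """Return 4 or 6 for an IP derived multicast MAC, else None."""
    if mac[:3] == bytes.fromhex("01005e") and mac[3] <= 0x7F:
        return 4
    if mac[:2] == bytes.fromhex("3333"):
        return 6
    return None


def forward_on(adj: Adjacency, vlan: int, v: int, inner_mac_da: bytes) -> bool:
    """Decide whether a multi-destination frame goes out on this tree adjacency."""
    if not 0 <= v <= 3:
        raise ValueError("frames with V > 3 are discarded before forwarding")
    if vlan not in adj.vlans:
        return False  # VLAN pruning applies for every value of V
    version = ip_version(inner_mac_da)
    if v in (0, 1) or version is None:
        return True  # VLAN pruning only
    routers = adj.ipv4_router_vlans if version == 4 else adj.ipv6_router_vlans
    if vlan in routers:
        return True
    if v == 2:
        return (vlan, inner_mac_da) in adj.listeners
    return False  # V = 3: only branches with multicast routers
```

Because pruning is a SHOULD, an RBridge that returns True more often than this sketch does is still conformant; downstream RBridges silently discard frames that should have been pruned.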

4.5 IGMP, MLD, and MRD Learning

RBridges SHOULD learn, based on seeing IGMP [RFC3376], MLD [RFC2710], and MRD [RFC4286] frames, which multicast messages should be forwarded onto which links.

An IGMP or MLD membership report received in native form from a link indicates a multicast group listener for that group on that link. An IGMP or MLD query or an MRD advertisement received in native form from a link indicates the presence of an IP multicast router on that link.

IP multicast group membership reports have to be sent throughout the campus to all IP multicast routers, distinguishing IPv4 and IPv6. All multicast traffic must also be sent to all IP multicast routers for the same version of IP.

IP multicast data SHOULD only be sent on links where there is either an IP multicast router for that IP type (IPv4 or IPv6) or an IP multicast group listener for that IP type and IP multicast derived MAC address.

RBridges do not need to announce themselves as listeners to the All-Snoopers multicast group, used for MRD reports, because the IP multicast address for that group is in the range where frames sent to such addresses must be broadcast.

See also "Considerations for Internet Group Management Protocol (IGMP) and Multicast Listener Discovery (MLD) Snooping Switches" [RFC4541].

4.6 Learning End Station Addresses

RBridges have to learn the MAC addresses and VLANs of their locally attached end stations for link/VLAN pairs for which they are the Designated RBridge so they can

   o  forward the native form of incoming TRILL data frames onto the correct link and

   o  decide for an incoming native unicast frame from a link, where the RBridge is the DRB, whether the frame is

      -  known to have been destined for another end station on the same link, so the RBridge need do nothing, or

      -  known to be destined for another end station on another local link where the RBridge is DRB so it can be directly forwarded
         in native form, or

      -  neither of the above, so the frame has to be converted to a TRILL data frame and forwarded.

RBridges have to learn the MAC addresses of remote end stations, and the remote RBridge that is the DRB for each such remote end station, on the same VLAN or VLANs as they are. That way, when they need to forward a locally received native unicast frame, after converting it to a TRILL data frame, they can frequently unicast it appropriately rather than always having to flood it.

There are three ways an RBridge can learn end station addresses, as follows:

   1. From the observation of data, learning the { source MAC, VLAN, port } triplet of received native frames and the { source MAC, VLAN, remote RBridge nickname } triplet of data frames that it decapsulates.

   2. By running a per VLAN IS-IS instance which receives remote information and transmits local information.

   3. By management configuration.

RBridges MUST implement capability 1 above and MUST use it unless configured, for one or more particular VLANs and/or ports, to not learn from either received local native frames or from decapsulated TRILL data frames or both.

RBridges MAY implement capability 2 above. If implemented, such a per VLAN IS-IS instance is run only when the RBridge is configured to do so on a per VLAN basis.

Entries in the table of learned MAC addresses and ancillary information also have a one byte unsigned confidence level associated with each entry. Such information learned from the observation of data has a confidence of 1 unless configured to have a different confidence. Such information received via IS-IS is accompanied by a confidence level in the range 0 to 254. Such information configured by management defaults to a confidence level of 255 but may be configured to have another value.

When a new learned address and related information are to be entered into the local database there are several possibilities:

   o  If this is a new address, the information is entered accompanied by the confidence level.

   o  If there is already an entry for this address with the same accompanying information, the confidence level in the local database is set to the maximum of its existing confidence level and the confidence level with which it is being learned.

   o  If there is already an entry for this address with different information, the learned information is ignored unless it is being learned with a confidence higher than or equal to that of the database entry.

4.7 Shared VLAN Learning

Although outside the scope of this specification, there are some features in which a set of VLANs is considered to be a group, where one of the VLANs is the "primary" and the other VLANs in the group are "secondaries". An example of this is where traffic from different communities is separated using VLAN tags, and yet some resource (such as an IP router or DHCP server) is to be shared by all the communities.

A method of implementing this feature is to give a VLAN tag, say Z, to a link containing the shared resource, and have the other VLANs, say A, C, and D, be part of the group {primary=Z, secondaries = A, C, D}.

An RBridge, aware of this grouping, attached to one of the secondary VLANs in the group also claims to be attached to the primary VLAN. So an RBridge attached to A would claim to also be attached to Z. An RBridge attached to the primary would claim to be attached to all the VLANs in the group.

This specification does not specify how VLAN groups might be used. Only RBridges that participate in a VLAN group will be configured to know about the VLAN group. However, to detect misconfiguration, an RBridge configured to know about a VLAN group SHOULD report the VLAN group in its LSP.
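
Returning to the confidence rules of Section 4.6, the three cases for entering a learned address can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the Entry record and the learn function are hypothetical names, keying the database on (MAC, VLAN) is an assumption, and the third rule is read as comparing confidence levels.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Entry:
    info: str        # local port, or nickname of the remote RBridge
    confidence: int  # one byte unsigned value, 0..255


# Assumption: the database is keyed on (MAC address, VLAN ID).
Database = Dict[Tuple[bytes, int], Entry]


def learn(db: Database, mac: bytes, vlan: int, info: str, confidence: int) -> None:
    if not 0 <= confidence <= 255:
        raise ValueError("confidence is a one byte unsigned value")
    key = (mac, vlan)
    current = db.get(key)
    if current is None:
        # New address: enter it with its confidence level.
        db[key] = Entry(info, confidence)
    elif current.info == info:
        # Same information: keep the larger of the two confidence levels.
        current.confidence = max(current.confidence, confidence)
    elif confidence >= current.confidence:
        # Different information replaces the entry only if it is learned
        # with confidence at least as high as the existing entry.
        db[key] = Entry(info, confidence)
    # Otherwise the newly learned information is ignored.
```

With the defaults given in Section 4.6, an address observed in data (confidence 1) never displaces a conflicting entry configured by management (confidence 255), while a station that moves between ports is relearned because the new observation has equal confidence.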

5. Pseudo Code

WARNING: The Pseudo Code below has NOT been updated to correspond to changes made in other Sections of this document but rather corresponds to draft version -04.

This section provides partial high level pseudo code for the processing of all possible types of received and generated frames. In case of conflict between this section and any of the earlier sections in this document, the pseudo code is authoritative.

Frame destination address abbreviations used in this section are as follows:

   Abbreviation  Destination Address(es)
   ------------------------------------------------------------
   802MUL        Multicast address in the range 01-80-C2-00-00-00
                 to 01-80-C2-00-00-0F.
   ALLRB         The All-RBridges multicast address, .
   BROAD         The broadcast address: FF-FF-FF-FF-FF-FF
   IP4MUL        IPv4 based multicast addresses (the range
                 01-00-5E-00-00-00 to 01-00-5E-7F-FF-FF)
   IP6MUL        IPv6 based multicast addresses (the range
                 33-33-00-00-00-00 to 33-33-FF-FF-FF-FF)
   OTHERM        Multicast addresses other than ALLRB, IP4MUL,
                 IP6MUL, or 802MUL.
   OTHERU        Unicast addresses other than SELF.
   SELF          The unicast address of the RBridge at which an
                 operation is occurring.

Section 5.1 below discusses 802MUL addressed frames, most of which are handled by the Ethernet port and are partially or fully out of scope for TRILL. Section 5.2 then discusses other received frames and frames emitted in direct response to such other received frames. Section 5.3 discusses spontaneously emitted frames.

5.1 802MUL Destination Frames

Frames addressed to an 802MUL multicast address are usually handled by a port under IEEE 802 protocols which are out of scope for RBridges proper, as shown in Figure 9. Such frames, by definition, are not forwarded by 802.1 bridges and thus are not forwarded by RBridges. An RBridge MAY learn source MAC addresses from such frames as described in Section 5.2.2.1.
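
The destination address abbreviations defined at the start of Section 5 can be computed mechanically. The Python sketch below is illustrative only: classify_destination is a hypothetical helper, the All-RBridges address is a parameter because its value is not given in this document, and the IPv4 range assumes the standard multicast MAC prefix 01-00-5E.

```python
def classify_destination(da: bytes, self_mac: bytes, all_rbridges: bytes) -> str:
    """Map a 6-octet destination MAC address to its Section 5 abbreviation."""
    if len(da) != 6:
        raise ValueError("a MAC address is six octets")
    if da == bytes.fromhex("ffffffffffff"):
        return "BROAD"
    if (da[0] & 0x01) == 0:
        # Group bit off: a unicast address.
        return "SELF" if da == self_mac else "OTHERU"
    if da == all_rbridges:
        return "ALLRB"
    if da[:5] == bytes.fromhex("0180c20000") and da[5] <= 0x0F:
        return "802MUL"
    if da[:3] == bytes.fromhex("01005e") and da[3] <= 0x7F:
        return "IP4MUL"
    if da[:2] == bytes.fromhex("3333"):
        return "IP6MUL"
    return "OTHERM"
```

Note that the GARP application addresses 01-80-C2-00-00-20 through 01-80-C2-00-00-2F fall outside the 802MUL range and are therefore classified as OTHERM.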

[Figure 9 is not reproduced here. It depicts an RBridge as ports attached to the RBridge proper: 802MUL frames are handled at the port, while other frames pass between the port and the RBridge proper.]

                Figure 9. 802MUL and RBridge Frames

The following table gives the sections where the various protocols which use 802MUL multicast addresses are discussed:

   Address               Section and Description
   ---------------------------------------------------------
   01-80-C2-00-00-00     5.1.1 All Bridges: Used for BPDUs.
   01-80-C2-00-00-01     5.1.2 [802.3] Clause 31
   01-80-C2-00-00-02     5.1.2 [802.3] Clause 43 (Link Aggregation)
                         and Clause 57 (OAM)
   01-80-C2-00-00-03     5.1.3 [802.1X] Port Authenticator Entity (PAE)
   01-80-C2-00-00-04/5   5.1.2 Reserved.
   01-80-C2-00-00-06/7   5.1.6 Reserved.
   01-80-C2-00-00-08     5.1.6 All Provider Bridges
   01-80-C2-00-00-09/C   5.1.6 Reserved.
   01-80-C2-00-00-0D     5.1.5 Provider Bridge GVRP Address
   01-80-C2-00-00-0E     5.1.4 [802.1AB] Link Layer Discovery Protocol
   01-80-C2-00-00-0F     5.1.2 Reserved.

5.1.1 Spanning Tree Protocol

Frames sent with the All Bridges multicast address use the bridge spanning tree protocol NSAP 0x42 so the frame begins LL-LL-42-42-03-00-00 where 0xLLLL is the length and the trailing 0x0000 indicates the frame is a BPDU (Bridge Protocol Data Unit), used to implement the spanning tree protocol (see also Section 5.1.5).
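
As an illustration of the frame layout just described, and of the test suggested under strategy 2 of this section, the following Python sketch extracts the spanning tree root bridge identifier from a received BPDU. It is a sketch under stated assumptions: bpdu_root is a hypothetical helper, and its input is assumed to start at the length field, after the MAC addresses and any VLAN tag.

```python
from typing import Optional, Tuple


def bpdu_root(llc_frame: bytes) -> Optional[Tuple[bytes, bytes]]:
    """Return (root bridge identifier, six octets reported in the LSP), or None.

    llc_frame begins LL-LL-42-42-03 and is followed by the BPDU payload.
    """
    if len(llc_frame) < 5 + 13:
        return None  # too short to hold octets 1 through 13 of the payload
    if llc_frame[2:4] != bytes.fromhex("4242"):
        return None  # DSAP/SSAP are not the spanning tree NSAP
    payload = llc_frame[5:]
    if payload[:4] != bytes(4):
        return None  # first four octets of the BPDU payload must be zero
    # Octet 5 is flags; octets 6 through 13 hold the root identifier.
    root_id = payload[5:13]
    # The first two octets of the identifier include a priority; the
    # last six are the part reported in the LSP.
    return root_id, root_id[2:]
```

A port following strategy 1 would never call such a function, since it discards all received BPDUs.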
RBridge ports MUST adopt one of four strategies as listed below in connection with these frames and SHOULD adopt strategy 2.

Note: It is never the case that a bridging spanning tree extends through an RBridge between two of its ports. Those ports always terminate the spanning tree.

   1. An RBridge port MAY silently discard all received BPDUs and not issue any BPDUs.

   2. An RBridge port SHOULD examine received BPDUs to determine the current root bridge and advertise what it sees as the current root bridge on that port via the core IS-IS instance (see Section 4.2.3). It would be sufficient for the RBridge to test that the DSAP/SSAP are 0x4242 and the first four octets of the BPDU payload are zero. If so, the spanning tree root bridge identifier is the eight octets from the sixth octet through the 13th octet. (The fifth octet is an octet of flags that need not be examined by the RBridge.) The last six of these eight octets are the part of the root identifier reported in the LSP. (Octets six and seven include a priority.)

   3. An RBridge port MAY participate in spanning tree in such a way as to become spanning tree root if it should be the Designated RBridge. See Section 4.2.4.

   4. As an alternative to item 3, an RBridge port may optionally participate in spanning tree in such a way as to force an attached bridged LAN to partition as discussed in Section 6.2.

5.1.2 Media Multicast Frames

These frames are for media specific port features or are reserved for the future standardization of such features. Such features are outside of the scope of TRILL, which is generally media independent.

5.1.3 802.1X Frames

This port protocol provides for the authentication of end stations as specified in [802.1X]. That an end station has been so authenticated MAY be used to increase the confidence in end station MAC addresses reported via the optional per VLAN IS-IS instance (see Section 4.6).

An amendment to 802.1X [802.1af] is under development such that 802.1X authentication would produce keying material usable in [802.1AE] tags which can in turn be used to authenticate and encrypt frames between ports.

5.1.4 802.1AB Frames

Frames with this multicast address are used in the Station and Media Access Control Connectivity Discovery standard 802.1AB [802.1AB] which specifies the Link Layer Discovery Protocol (LLDP). These frames are also identified by the Ethertype 0x88CC.

This protocol is generally outside of the scope of TRILL. However, if LLDP frames containing the System Capabilities 802.1AB TLV are issued by an RBridge port, it is RECOMMENDED that the "bridge" bit be asserted in the "system capabilities" subfield and, if that port is participating in spanning tree (see Section 5.1.1), then it is RECOMMENDED that the "bridge" bit be asserted in the "enabled capabilities" subfield.

5.1.5 GARP, GMRP, and GVRP

IEEE [802.1D] bridging defines a Generic Attribute Registration Protocol, GARP, on which a GARP Multicast Registration Protocol, GMRP, and a GARP VLAN Registration Protocol, GVRP, are based. GARP uses the bridge spanning tree protocol NSAP 0x42 so the frame begins LL-LL-42-42-03-00-01 where 0xLLLL is the length and the trailing 0x0001 indicates the frame is a GARP PDU (see also Section 5.1.1).

The multicast addresses in the range 01-80-C2-00-00-20 to 01-80-C2-00-00-2F have been reserved for GARP applications. [802.1D] requires that bridges transparently propagate frames to any multicast address in this range if they do not implement the corresponding GARP application. Since RBridges do not implement any of these applications, they treat such frames as any other layer 2 multicast.

The GMRP application of GARP uses multicast address 01-80-C2-00-00-20. It would provide a basis for the optimization of the distribution of frames with all layer 2 multicast addresses. However, RBridges provide for IP based multicast optimization instead.

The GVRP application of GARP uses multicast address 01-80-C2-00-00-21. It provides for the registration of VLANs and is not supported by RBridges.

5.1.6 Other Bridge Frames

These frames relate to other bridge features outside of the scope of TRILL or are reserved for future standardization.

5.2 Processing a Frame Received by an RBridge

"Ethertype" abbreviations used in this section are as follows:

   Ethernet Protocol Type Abbreviations
   ------------------------------------------------------------
   ****     Any Ethertype.
   IP**     IPv4 or IPv6 message Ethertype.
   IPv4     0x0800, IP version 4 message Ethertype.
   IPv6     0x86DD, IP version 6 message Ethertype.
   ISIS     0xFE, IS-IS Message NSAP value.
   TRILL    , TRILL frame Ethertype.

When an RBridge RB1 receives a frame, it determines the VLAN ID and priority for that frame as described in Section 4.1.3. The VLAN ID and priority are then available as metadata accompanying the frame.

The destination address of the received frame, its payload protocol type, and the Designated RBridge status of the receiving RBridge RBn for the link and VLAN in question are then used to sequentially search the table below from the top. As soon as a match is found, the processing indicated (either discard the frame or process as given in the reference) is performed. Of course, any other arrangement of processing incoming frames is fine as long as the results are the same as the pseudo-code in this section.

The initial sequential match and dispatch table is as shown below. The "DRB" column is "Y" if the RBridge must be the Designated RBridge to match, "N" if it must not be the Designated RBridge, and "*" if it does not matter.

   Dest.    DRB  Ethertype  Section/Description
   ------------------------------------------------------------
   SELF     *    TRILL      5.2.3 TRILL encapsulated frame.
   SELF     N    ****       Discard.
   SELF     *    ****       5.2.4 Local destination frame.
   OTHERU   *    TRILL      TRILL encapsulated frame addressed to
                            another RBridge; discard.
   OTHERU   N    ****       Discard.
   OTHERU   Y    IP**       5.2.1 unicast IP frame.
   OTHERU   Y    ****       5.2.4 Other unicast frame.
   ALLRB    *    TRILL      5.2.3 TRILL encapsulated frame.
   ALLRB    *    ****       Erroneous frame; discard.
   802MUL   *    ****       5.1 Should not get here.
   IPMUL    N    IP**       Discard.
   IPMUL    Y    IP**       5.2.1 multicast IP frame.
   IPMUL    *    ****       Erroneous frame; discard.
   OTHERM   N    ****       Discard.
   OTHERM   Y    ****       5.2.4 non-IP based multicast frame.
   BROAD    *    TRILL      Erroneous frame but MAY be treated as if
                            Destination was ALLRB (see above).
   BROAD    N    ****       Discard.
   BROAD    Y    IP**       5.2.1 Broadcast IP frame.
   BROAD    Y    ****       5.2.4 Other broadcast frame.

5.2.1 Further Dispatch for IP Frames

Frames containing IP (Internet Protocol) payload, both IPv4 and IPv6, are treated in different ways depending on the particular protocol within IP which they are carrying. The following table is searched sequentially from the top and the first match used. The "Ver." column is the version of IP used in the frame and "Proto" is the Payload IP protocol for the frame.

   Ver.   Proto  Section/Description
   ------------------------------------------------------------
   IPv4   IGMP   5.2.5 Internet Group Membership Protocol
   IPv6   MLD    5.2.5 Multicast Listener Discovery
   IP**   MRD    5.2.6 Multicast Router Discovery
   IP**   PIM    5.2.6 Protocol Independent Multicast
   IP**   ****   5.2.4 Other

5.2.2 Common Subroutines

The following subroutines are called from several places in Section 5.

5.2.2.1 Learn Source MAC Address

This is a pseudo-code subroutine called from several places above. Note that if this is called more than once for the same frame, all calls after the first have no effect and do not actually have to be performed.

   if (Outer.MacSA has the "group" bit off)
   {
      Learn Outer.MacSA for the port on which the frame was
      received for the determined VLAN unless configured not to
      do such learning.
   }

5.2.2.2 TRILL Data Frame Multi-destination Forwarding

   if (RPF check fails on Outer.MacSA (Section 4.3.1))
   {
      Exit; /* do not forward the frame */
   }
   else
   {
      Execute Section 5.2.2.3;
   }
   Outer.MacSA = RBn;
   Forward along tree indicated by Trill.EgressNickname,
      pruned as specified in Section 4.3.2.

5.2.2.3 TRILL Data Frame Outer VLAN Tag

   if ( (Inner VLAN priority != 0 ) or
        (Inner VID != 1 ) or
        (configured to always use Outer VLAN Tag) )
   {
      Create Outer VLAN Tag if none present.
      Outer VLAN Tag priority = Inner VLAN Tag priority;
      Outer VLAN Tag VID = Inner VLAN Tag VID;
   }
   else
   {
      Remove Outer VLAN Tag if present;
   }

5.2.3 TRILL Ethertype Frames

   /* Dispatch on the TRILL message variation */
   /* The RBridge performing the processing is RBn */
   if (Variation > 1)
   {
      Discard the frame, unknown format.
   }
   elseif (Variation == 1) /* IS-IS */
   {
      Execute Section 5.2.3.1.
   }
   else /* Variation == 0, Data */
   {
      Execute Section 5.2.3.2.
   }

5.2.3.1 TRILL IS-IS Frames

   if (Outer.MacDA == All-RBridges)
      /* Note: if Outer.MacDA is OTHERM, discarded by dispatch
         table above */
   {
      if ( (Multi-Destination == 0) or
           (Inner.Protocol Type != ISIS ) or
           (Outer.MacSA != a tree adjacency for tree indicated) )
      {
         Discard the frame.
      }
      elseif (inner VLAN tag not present)
      {
         Process payload as a core TRILL IS-IS message for RBn.
         Note: nicknames may be invalid, ignore them.
      }
      else /* inner VLAN tag present */
      {
         If RBn has end stations on links for which it is the DRB
         on the indicated VLAN, give the IS-IS message to the per
         VLAN IS-IS instance if implemented.
         Trill.HopCount -= 1 /* at least, see Section 3.4 */
         if (Trill.HopCount <= 0)
         {
            Discard the frame.
         }
         else
         {
            if (RPF check fails on Outer.MacSA (Section 4.3.1))
            {
               Exit; /* do not forward frame */
            }
            else
            {
               Outer.MacSA = RBn;
               Create Outer VLAN Tag if none present
                  (could have been stripped by a bridge).
               Outer VLAN Tag priority = 7;
               Outer VLAN Tag VID = Inner VLAN Tag VID;
               Forward along tree indicated by
                  Trill.EgressNickname, VLAN pruned as
                  specified in Section 4.3.2.
            }
         }
      }
   }
   else /* Outer.MacDA == SELF. Note: if Outer.MacDA is OTHERU,
           discarded by dispatch table above */
   {
      if ( (Multi-Destination == 1) or
           (Inner.Protocol Type != ISIS ) )
      {
         Discard the frame.
      }
      elseif (inner VLAN tag not present)
      {
         if (Inner.MacDA != SELF)
         {
            Discard the frame.
         }
         Process the core instance unicast IS-IS message on RBn.
         Note: nicknames may be invalid, ignore them.
      }
      else /* inner VLAN tag present */
      {
         if (Inner.MacDA == SELF)
         {
            Process the per VLAN instance IS-IS message on RBn
            for the indicated VLAN.
         }
         else
         {
            Trill.HopCount -= 1;
            if (Trill.HopCount == 0)
            {
               Discard the frame.
            }
            else
            {
               Outer.MacDA = lookup (Trill.EgressNickname);
               Outer.MacSA = RBn;
               Create Outer VLAN Tag if none present
                  (could have been stripped by a bridge).
               Outer VLAN Tag priority = 7;
               Outer VLAN Tag VID = Inner VLAN Tag VID;
               and forward the frame to the next hop RBridge
            }
         }
      }
   }

5.2.3.2 TRILL Data Frames

   if (Outer.MacDA == All-RBridges)
   {
      if ( (Multi-Destination == 0) or
           (Outer.MacSA != a tree adjacency for tree indicated) )
      {
         Discard the frame.
      }
      else
      {
         If RBn is a DRB for the indicated VLAN, decapsulate the
         data frame and forward it onto appropriate links in the
         VLAN.
         Trill.HopCount -= 1 /* at least, see Section 3.4 */
         if (Trill.HopCount == 0)
         {
            Discard the frame.
         }
         else
         {
            Execute Section 5.2.2.2.
         }
      }
   }
   else /* Outer.MacDA == SELF */
   {
      if (Trill.EgressNickname == RBn)
      {
         Convert to native format and forward the extracted frame
         onto the link containing the destination or locally
         process the frame if the Inner.MacDA == RBn.
      }
      else
      {
         /* The frame needs to be forwarded to another RBridge */
         Trill.HopCount -= 1;
         if (Trill.HopCount == 0)
         {
            Discard the frame.
         }
         else
         {
            Execute Section 5.2.2.3;
            if (Trill.EgressNickname unknown)
            {
               Discard the frame.
            }
            Outer.MacDA = lookup (Trill.EgressNickname);
            Outer.MacSA = RBn;
            and forward the frame
         }
      }
   }

5.2.4 Native Frame Receipt

The following pseudo code is executed for frames that are not of the TRILL Ethertype and are received on a port and VLAN for which the RBridge is the Designated RBridge (see Section 4.2.4).

   Learn source MAC address as specified in 5.2.2.1.
   if (Outer.MacDA == SELF)
   {
      A native frame for the RBridge received from a local link,
      for example a management protocol frame from a directly
      connected management station. Process locally.
   }
   elseif (Outer.MacDA == a known unicast address)
   {
      if (Outer.MacDA is on the directly connected link on which
          the frame was received)
      {
         Discard the frame. Destination has already seen it.
      }
      elseif (Outer.MacDA for the determined VLAN is on another
              directly connected link)
      {
         Forward the native frame out the port for that link.
      }
      else
      {
         Assume that the egress RBridge is RBm.
         Outer.MacDA = next hop RBridge (in the path to RBm);
         Outer.MacSA = RB1;
         Outer.Ethertype = TRILL;
         Trill.V = 0;
         Trill.Reserved = 0;
         Trill.M = FALSE; /* this is not multi-destination */
         Trill.HopCount = determined value (see Section 3.4);
         Trill.EgressNickname = RBm;
         Trill.IngressNickname = RBn;
         Followed by the received frame;
         Create/update Inner VLAN Tag with VID and priority
            determined as specified in Section 4.1.2.
         Execute Section 5.2.2.3;
         Forward on the port for the Outer.MacDA.
        }
      }
      else { /* unknown unicast or general multicast or broadcast */
        Forward to other links where RBn is the DRB for the
          indicated VLAN.
        Outer.MacDA = All-RBridges;
        Outer.MacSA = RB1;
        Outer.Ethertype = TRILL;
        Trill.V = 0;
        Trill.Reserved = 0;
        Trill.M = TRUE; /* this is a multi-destination frame */
        Trill.HopCount = determined value (see Section 3.4);
        Trill.EgressNickname = RBi; /* Distribution Tree, see below */
        Trill.IngressNickname = RB1;
        Followed by the received frame;
        Create/update Inner VLAN Tag with VID and priority
          determined as specified in Section 4.1.2.
        Execute Section 5.2.2.3;
      }

   In the last case above, the egress nickname indicates the chosen
   distribution tree RBi.  The default is for RB1 to put its own
   address there.  However, if RB1 is configured to decline to be a
   tree root, then RB1 MUST select some other RBridge RBi which has
   elected to be a tree root, or the RBridge with the lowest ID if
   none have elected to be a tree root.

5.2.5 IGMP and MLD Frames

   An IGMP (IPv4 [RFC3376]) or MLD (IPv6 [RFC2710]) announcement
   received from a link by the designated RBridge teaches RBn a group
   membership on that link.  The RBridge adds a receiver for that
   layer 2 group address in the appropriate VLAN in its core link
   state instance.  Then execute Section 5.2.4.

5.2.6 PIM and MRD Frames

   A PIM or MRD [RFC4286] message received from a link by the
   designated RBridge teaches RBn that there is an IP multicast
   router (for the determined VLAN) on its link.  RBn adds that
   information to its core IS-IS link state information for that
   VLAN.  Then execute Section 5.2.4.

5.3 Frames Spontaneously Sourced

   The sections below discuss all frames that might be spontaneously
   sourced by an RBridge.

5.3.1 IS-IS Frames Sourced

   An RBridge RB1 MUST spontaneously emit core instance TRILL IS-IS
   frames as described in 5.3.1.1.
   In addition, if it is DRB for a link that has end stations in a
   particular VLAN, it MAY run an IS-IS instance for that VLAN and
   emit TRILL IS-IS frames as described in 5.3.1.2.  Do not confuse
   the per VLAN DRB determination, which is done by the core IS-IS
   instance, with the optional per VLAN IS-IS instances used to
   distribute end station addresses.

5.3.1.1 Core IS-IS Frames

   For core IS-IS frames, a V=1 TRILL header is added and no VLAN tag
   is included in the inner frame.  Note that, in a strict sense,
   IS-IS has no Ethertype but the 802.3 LLC NSAP format MUST be used,
   that is LL-LL-FE-FE-03 where 0xLLLL is the length and 0x03 is the
   CTL byte.  (The IS-IS standard also permits the less efficient
   SNAP SAP format LL-LL-AA-AA-03-00-00-00-80-FE which is not used in
   TRILL.)

   If the frame is multicast, it is formed as follows:

      Outer.MacDA = All-RBridges;
      Outer.MacSA = RB1;
      Outer.Ethertype = TRILL;
      Trill.V = 1;
      Trill.M = 1;
      Trill.HopCount = 1;
      Trill.IngressNickname = 0;
      Trill.EgressNickname = 0;
      Inner.MacDA = All-RBridges;
      Inner.MacSA = RB1;
      Inner.FrameLength = IS-IS frame length;
      Inner.DSAP = 0xFE;
      Inner.SSAP = 0xFE;
      Inner.CTL = 3;
      followed by the rest of the IS-IS Frame.

   The frame is then sent out ports of the RBridge so as to get to
   every adjacent RBridge.  For each port that is neither known to be
   a point-to-point connection to an RBridge nor configured not to
   use Outer VLAN Tags, an Outer VLAN Tag is added as follows:

      Outer VLAN Tag priority = 7;
      Outer VLAN VID = VID associated with the logical port on which
         the frame is being sent or zero if none.

   Note that this Outer VLAN Tag may be different on different ports.

   Currently all IS-IS messages are multicast.  However, if it were
   necessary to send a unicast core instance TRILL IS-IS message, it
   would be formatted as follows:

      Outer.MacDA = DestinationRBridge;
      Outer.MacSA = RB1;
      Outer.Ethertype = TRILL;
      Trill.V = 1;
      Trill.M = 0;
      Trill.HopCount = 1;
      Trill.IngressNickname = 0;
      Trill.EgressNickname = 0;
      Inner.MacDA = DestinationRBridge;
      Inner.MacSA = RB1;
      Inner.FrameLength = IS-IS frame length;
      Inner.DSAP = 0xFE;
      Inner.SSAP = 0xFE;
      Inner.CTL = 3;
      followed by the rest of the IS-IS Frame.

   The frame is then transmitted on the port for DestinationRBridge
   with an Outer VLAN Tag possibly added using the same logic as for
   a multicast TRILL IS-IS frame.

5.3.1.2 Per-VLAN IS-IS Frames

   For per VLAN TRILL IS-IS frames, a V=1 TRILL header is added and a
   VLAN tag is always included in the inner frame.  Note that, in a
   strict sense, IS-IS has no Ethertype but the 802.3 LLC NSAP format
   must be used as discussed at the start of Section 5.3.1.1.

   If the frame is per VLAN multicast, it is formed as follows:

      Outer.MacDA = All-RBridges;
      Outer.MacSA = RB1;
      Outer.Ethertype = TRILL;
      Trill.V = 1;
      Trill.M = 1;
      Trill.HopCount = count to reach farthest node in the
         distribution tree;
      Trill.IngressNickname = 0;
      Trill.EgressNickname = SelectedDistributionTree;
      Inner.MacDA = All-RBridges;
      Inner.MacSA = RB1;
      Ethertype = VLAN Tag;
      Inner VLAN Tag priority = 7;
      Inner VLAN Tag VID = Relevant VLAN;
      Inner.FrameLength = IS-IS frame length;
      Inner.DSAP = 0xFE;
      Inner.SSAP = 0xFE;
      Inner.CTL = 3;
      followed by the rest of the IS-IS Frame.

   The frame is then sent out the ports appropriate for the selected
   distribution tree pruned to the selected VLAN.  For each port that
   is neither known to be a point-to-point connection to an RBridge
   nor configured not to use Outer VLAN Tags, an Outer VLAN Tag is
   added as follows:

      Outer VLAN Tag priority = 7;
      Outer VLAN VID = VID associated with the logical port on which
         the frame is being sent or zero if none.

   Note that this Outer VLAN Tag may be different on different ports.

   Currently all IS-IS messages are multicast.
   However, if it were necessary to send a unicast per VLAN instance
   TRILL IS-IS message, it would be formatted as follows:

      Outer.MacDA = NextHopRBridge;
      Outer.MacSA = RB1;
      Outer.Ethertype = TRILL;
      Trill.V = 1;
      Trill.M = 0;
      Trill.HopCount = determined value (see Section 3.4);
      Trill.IngressNickname = 0;
      Trill.EgressNickname = DestinationNickname;
      Ethertype = VLAN Tag;
      Inner VLAN Tag priority = 7;
      Inner VLAN Tag VID = Relevant VLAN;
      Inner.MacDA = DestinationRBridge;
      Inner.MacSA = RB1;
      Inner.FrameLength = IS-IS frame length;
      Inner.DSAP = 0xFE;
      Inner.SSAP = 0xFE;
      Inner.CTL = 3;
      followed by the rest of the IS-IS Frame.

   The frame is then transmitted on the port for NextHopRBridge with
   an Outer VLAN Tag possibly added using the same logic as for a
   multicast TRILL IS-IS frame.

5.3.2 Other Frames Sourced

   Other frames may be sourced due to management protocols or general
   applications running on an RBridge.  These can be handled as if
   they were received by the RBridge on a port for which it was the
   Designated RBridge and on which there were no known directly
   connected stations, as described in Section 5.2.4.

   WARNING: The pseudo code above has NOT been updated to correspond
   to changes made in other Sections of this document but rather
   corresponds to draft version -04.

6. Incremental Deployment Considerations

   Because RBridges are compatible with current IEEE 802.1 bridges, a
   LAN can be upgraded by incrementally replacing bridges with
   RBridges.  Any remaining bridges are invisible to RBridges, and
   the physical links directly interconnected by such bridges, which
   together with the bridges constitute a bridged LAN, appear to
   RBridges to be a single multi-access link.

   If the bridges that were replaced by RBridges were un-managed,
   zero configuration bridges, then the RBridge replacements will not
   require configuration.
   Section 6.1 further explores general incremental deployment
   considerations while Section 6.2 shows a particular example.

6.1 Incremental Deployment

   The campus will work best if all IEEE 802.1 bridges are replaced
   with RBridges, assuming the RBridges have the same basic speed and
   capacity as the bridges.  However, there may be intermediate
   states, where only some bridges have been replaced by RBridges.

   In particular, assume the RBridges partition a bridged LAN into a
   relatively small number of relatively large remnant bridged LANs.
   Then two potential problems may occur as follows:

   1. The requirement that end station frames enter and leave a link
      via the Designated RBridge for the link can cause congestion or
      suboptimal routing.  The extent to which such a problem will
      occur is highly dependent on the network topology.  For
      example, if a bridged LAN had a star-like structure with core
      bridges that connected only to other bridges and peripheral
      bridges that connected to end stations and singly connected to
      a core bridge, the replacement of all of the core bridges by
      RBridges without replacing the peripheral bridges would
      generally improve performance without inducing any Designated
      RBridge congestion.

   2. TRILL traffic sent to the All-RBridges multicast address will
      typically be flooded throughout a bridged LAN link, which may
      create a greater burden than necessary.  In cases where there
      is actually only one intended RBridge next hop recipient, this
      problem can be eliminated by using the option of sending the
      TRILL traffic that would otherwise be multicast as a unicast
      frame to that recipient.

   Inserting RBridges so that all the bridged portions of the LAN
   stay connected to each other is generally the least efficient
   arrangement.

   There are four techniques which may help if problem 1 above occurs
   and which can, to some extent, be used in combination:

   1. Replace more IEEE 802.1 bridges with RBridges so as to minimize
      the size of the remnant bridged LANs between RBridges.  This
      requires no configuration of the RBridges unless the bridges
      they replace required configuration.

   2. Re-arrange network topology to minimize the problem.  If the
      bridges and RBridges involved are configured, this may require
      changes in their configuration.

   3. Configure the RBridges and bridges so that end stations on a
      remnant bridged LAN are separated into different VLANs that
      have different Designated RBridges.  If the end stations were
      already assigned to different VLANs, this is straightforward
      (see Section 4.2.4).  If the end stations were on the same VLAN
      and have to be split into different VLANs, this technique may
      lead to connectivity problems between end stations, but it may
      be possible to overcome these problems using shared VLANs (see
      Section 4.7).

   4. Configure the RBridges such that their ports which are
      connected to the bridged LAN participate in the bridged LAN's
      spanning tree in such a way as to force the partition of the
      bridged LAN.  (Note: a spanning tree is never formed through an
      RBridge but always terminates at RBridge ports.)  To use this
      technique, the RBridges must support this optional feature,
      which is discussed further in Section 6.2, and would need to be
      configured to make use of it, but the bridges involved would
      rarely have to be configured.  Warning: This technique makes
      the bridged LAN unavailable for RBridge through traffic because
      the bridged LAN partitions.

   Conversely to item 3 above, there may be bridged LANs which use
   VLANs, or use more VLANs than would otherwise be necessary, to
   evade the congestion that can be caused by the spanning tree
   algorithm.  Replacing the IEEE 802.1 bridges in such LANs with
   RBridges may enable a reduction in or elimination of VLANs and
   configuration.
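
   As a rough illustration of technique 3 above, the sketch below
   models a per VLAN Designated RBridge election driven by configured
   priorities.  It is only a sketch under stated assumptions: the
   names elect_drbs and PRIORITY, the VLAN numbers, and the priority
   values are hypothetical, a higher priority is assumed to win, and
   ties are broken arbitrarily by name.  The actual election rules
   are those of Section 4.2.4.

```python
def elect_drbs(priority, alive):
    """Pick one Designated RBridge per VLAN.

    priority: {vlan: {rbridge_name: configured_priority}}
    alive:    set of RBridge names currently reachable on the link
    Assumption: highest priority wins; ties fall to the larger name.
    """
    drbs = {}
    for vlan, per_rbridge in priority.items():
        candidates = [(prio, name)
                      for name, prio in per_rbridge.items()
                      if name in alive]
        drbs[vlan] = max(candidates)[1] if candidates else None
    return drbs


# Hypothetical configuration: RB1 is preferred on VLAN 10 and RB2 is
# preferred on VLAN 20, so the two VLANs get different DRBs.
PRIORITY = {
    10: {"RB1": 200, "RB2": 100},
    20: {"RB1": 100, "RB2": 200},
}

both_up = elect_drbs(PRIORITY, {"RB1", "RB2"})
rb1_down = elect_drbs(PRIORITY, {"RB2"})
print(both_up)
print(rb1_down)
```

   With both RBridges reachable, each VLAN is served by a different
   DRB; with only RB2 reachable, RB2 is elected for every VLAN.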

6.2 Wiring Closet Topology

   If 802.1 bridges are present and RBridges are not configured, the
   bridge spanning tree or the Designated RBridge election may make
   inappropriate decisions.  Below is a detailed example of the more
   general problem that can occur when a bridged LAN is connected to
   multiple RBridges (see Section 6.1).

   For example, in cases where there are two (or more) groups of end
   nodes, each attached to a bridge (say B1 and B2 respectively), and
   each bridge is attached to an RBridge (say RB1 and RB2
   respectively), with an additional link connecting B1 and B2 (see
   Figure 10), it may be desirable to have the B1-B2 link only as a
   backup in case one of RB1 and RB2, or one of the links B1-RB1 or
   B2-RB2, fails.

      +-------------------------------+
      |            |          |       |
      |  Data   +-----+    +-----+    |
      | Center -| RB1 |----| RB2 |-   |
      |         +-----+    +-----+    |
      |            |          |       |
      +-------------------------------+
                   |          |
                   |          |
      +-------------------------------+
      |            |          |       |
      |          +----+     +----+    |
      | Wiring   | B1 |-----| B2 |    |
      | Closet   +----+     +----+    |
      |                               |
      +-------------------------------+

               Figure 10. Wiring Closet Topology

   For example, B1 and B2 may be in a wiring closet and it may be
   easy to provide a very short, high bandwidth, low cost link
   between them, while RB1 and RB2 are at a distant data center such
   that the RB1-B1 and RB2-B2 links are slower and more expensive.

   Default behavior would be that one of RB1 or RB2 (say RB1) would
   become Designated RBridge and forward traffic to/from the link, so
   end nodes attached to B2 would be connected to the campus via the
   path B2-B1-RB1, rather than the desired B2-RB2.  This wastes the
   bandwidth of the B2-RB2 path and cuts available bandwidth between
   the end stations and the data center in half.  The desired
   behavior would probably be to make maximum use of both the RB1-B1
   and RB2-B2 links.
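
   The effect described above can be reproduced with a small sketch
   that treats the bridged LAN of Figure 10 as a graph and traces how
   frames from each bridge reach their Designated RBridge.  The names
   LINKS, path, and uplinks_used are hypothetical and exist only for
   this illustration.  The sketch assumes no spanning tree port is
   blocked, which holds here because the three links form no loop
   inside the bridged LAN.

```python
from collections import deque

# Links of the bridged LAN only; the RB1-RB2 data center link is
# part of the campus, not of the bridged LAN, so it is left out.
LINKS = [("RB1", "B1"), ("RB2", "B2"), ("B1", "B2")]


def path(links, src, dst):
    """Breadth-first search for the hop sequence from src to dst."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adj.get(node, []):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    if dst not in prev:
        return None
    hops = []
    node = dst
    while node is not None:
        hops.append(node)
        node = prev[node]
    return hops[::-1]


def uplinks_used(drb_for):
    """Return the bridge-to-RBridge links carrying end station traffic."""
    used = set()
    for bridge, drb in drb_for.items():
        hops = path(LINKS, bridge, drb)
        used.add((hops[-2], hops[-1]))
    return used


# Default behavior: RB1 is DRB for the whole bridged LAN.
default_path = path(LINKS, "B2", "RB1")
default_uplinks = uplinks_used({"B1": "RB1", "B2": "RB1"})
# Desired behavior: each bridge reaches the campus via its own RBridge.
desired_uplinks = uplinks_used({"B1": "RB1", "B2": "RB2"})
print(default_path, default_uplinks, desired_uplinks)
```

   In the default case only the B1-RB1 uplink carries traffic, while
   the desired case uses both uplinks.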

6.2.1 The RBridge Solution

   Of course, if B1 and B2 are replaced with RBridges, the right
   thing will normally happen with zero configuration, but this may
   not be immediately practical if bridges are being incrementally
   replaced by RBridges.

6.2.2 The Spanning Tree Solution

   Another solution is to configure RB1 and RB2 to be part of a
   "wiring closet group", with a configured System ID RBx (which may
   be RB1 or RB2's System ID).  Both RB1 and RB2 participate in the
   bridge spanning tree on the configured ports as root RBx, which
   causes the spanning tree to partition the bridged LAN and break
   the B1-B2 link as desired, and both RB1 and RB2 act as Designated
   RBridge on each of their respective partitions.  Of course, with
   the partition, no RBridge through traffic can flow over the
   RB1-B1-B2-RB2 path.

   In the BPDU, the Root is "RBx", cost to Root is 0, Designated
   Bridge ID is "RB1" when RB1 transmits and "RB2" when RB2
   transmits, and port ID is a value chosen independently by each of
   RB1 and RB2 to distinguish each of its own ports.  If RB1 and RB2
   were actually on the same shared medium with no bridges between
   them, the result is that the one with the larger ID sees "better"
   BPDUs (because of the tie-breaker on the third field, the ID of
   the transmitting RBridge), and turns off the port.

   Should RB1, the RB1-B1 link, RB2, or the RB2-B2 link fail, the
   spanning tree algorithm will stop seeing one of the RBx roots and
   will re-enable the B1-B2 link, maintaining connectivity of all the
   end stations with the data center.

   If the link RB1-B1-B2-RB2 is on the cut set of the campus and RB2
   and/or RB1 have been configured to believe they are part of a
   wiring closet group, the campus becomes partitioned as the link
   partitions.
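
   The BPDU tie-break described above can be illustrated with a short
   sketch.  It assumes the usual spanning tree rule that BPDUs are
   compared field by field, in the order (Root, cost to Root,
   Designated Bridge ID, port ID), with the numerically lower value
   being better.  The Bpdu type, the better helper, and the numeric
   IDs are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Bpdu(NamedTuple):
    """BPDU fields in comparison order (illustrative names)."""
    root_id: int      # Root: the shared wiring closet group ID "RBx"
    root_cost: int    # cost to Root: 0 for both RB1 and RB2
    bridge_id: int    # Designated Bridge ID of the transmitter
    port_id: int      # chosen independently for each port


def better(a: Bpdu, b: Bpdu) -> bool:
    """Return True if a is better than b (lower field values win)."""
    return tuple(a) < tuple(b)


# Hypothetical System IDs; RBx is configured to be RB1's System ID.
RB1_ID, RB2_ID = 0x000000000001, 0x000000000002
RBX_ID = RB1_ID

from_rb1 = Bpdu(RBX_ID, 0, RB1_ID, port_id=1)
from_rb2 = Bpdu(RBX_ID, 0, RB2_ID, port_id=1)

# Root and cost are equal, so the third field decides.  The RBridge
# that hears a better BPDU than its own turns off its port.
rb2_turns_off_port = better(from_rb1, from_rb2)
rb1_turns_off_port = better(from_rb2, from_rb1)
print(rb2_turns_off_port, rb1_turns_off_port)
```

   Because both RBridges advertise the same Root with cost 0, only
   the Designated Bridge ID distinguishes their BPDUs, so RB2, which
   has the larger ID in this sketch, is the one that turns off its
   port on a shared medium.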

6.2.3 The VLAN Solution

   If the end stations attached to B1 and B2 are already divided
   among a number of VLANs, RB1 could be configured to have higher
   priority to become DRB on some of these VLANs and RB2 configured
   to have higher priority on the others.  Should either of the
   RBridges fail or become disconnected, the other will become DRB
   for all the VLANs.

   If the end stations are all on a single VLAN, perhaps the default
   VLAN 1, then it would be necessary to arbitrarily assign them
   between at least two VLANs to use this solution.  This may lead to
   connectivity problems which might require further measures,
   outside the scope of this specification, to rectify.

6.2.4 Comparison of Solutions

   Replacing all 802.1 bridges with RBridges is usually the best
   solution with the least amount of configuration required, possibly
   none.

   The spanning tree solution does quite well in this particular
   case.  But it depends on both RB1 and RB2 having implemented the
   optional feature of being able to configure a port to participate
   in spanning tree as described in Section 6.2.2 above.  It also
   makes the bridged LAN whose partition is being forced unavailable
   for through traffic.  Finally, while in this specific example it
   neatly breaks the link between the two bridges B1 and B2, if there
   were a more complex bridged LAN, instead of exactly two bridges,
   there is no guarantee that it would partition into roughly equal
   pieces.  In such a case, you might end up with a highly unbalanced
   load on the RB1 link and the RB2 link.

   The VLAN solution works well with a relatively small amount of
   configuration if the end stations are already divided among a
   number of VLANs.  If they are not, it becomes more complex and
   problematic.

7. RBridge Addresses, Parameters, and Constants

   IS-IS requires each RBridge to have a unique 6-byte System ID.
   This is easily obtainable, e.g., as any one of the 6-byte MAC
   addresses owned by that RBridge.

   A new Ethertype must be assigned to indicate a TRILL encapsulated
   frame.

   A layer 2 multicast address for All-RBridges must be assigned for
   use as the destination address in multi-destination frames.

   To support VLANs, RBridges (like bridges today) must be configured
   appropriately.  This includes per VLAN priority for becoming DRB
   and cases where DRB status for a VLAN is determined without a DRB
   election on that VLAN but rather by copying its DRB status for a
   different VLAN on which an election was done.

   RBridges may be configured with a nickname and nickname selection
   priority.

   RBridges may be configured to have per VLAN IS-IS instances and to
   send and/or learn end station address information via such
   instances.  Static end station address information, and the
   priority of end station information that is statically configured
   or learned in various ways, can also be configured.

   The per RBridge parameter RequestTree indicates whether an RBridge
   wants to be the root of a distribution tree.

   Configuration for wiring closet topology (see Section 6.2)
   consists of the System ID of the RBridge with the lowest System
   ID.  If RB1 and RB2 are part of a wiring closet topology, only RB2
   needs to be configured to know about this, and that RB1 is the ID
   it should use in the spanning tree protocol on the specified port.

8. Security Considerations

   Layer 2 bridging is not inherently secure.  It is, for example,
   subject to forgery of source addresses and bridging control
   messages.  A goal for TRILL is that RBridges do not add new issues
   beyond those existing in current bridging technology.

   Countermeasures are available, such as configuring the RBridge
   IS-IS instances to use IS-IS security and ignore unauthenticated
   control messages received on a port.
   Since such authentication requires configuration, RBridges where
   it is used are no longer zero configuration.

   IEEE 802.1 port admission and link security mechanisms, such as
   [802.1X] and [802.1AE], can also be used.  These are best thought
   of as being implemented within a port and are outside the scope of
   TRILL proper (just as they are generally out of scope for bridging
   standards 802.1D and 802.1Q), although TRILL can make use of
   secure registration through the confidence level communicated in
   the optional per VLAN IS-IS instance (see Section 4.6).

   RBridges do not prevent nodes from impersonating other nodes, for
   instance, by issuing bogus ARP/ND replies.  However, RBridges do
   not interfere with any schemes that would secure neighbor
   discovery.

9. Assignment Considerations

   This section discusses IANA and IEEE 802 assignment
   considerations.

9.1 IANA Considerations

   A new IANA registry is created for TRILL.  New TRILL Header
   Variation numbers are assigned by an IETF Standards Action
   [RFC2434] as modified by [RFC4020].

9.2 IEEE 802 Assignment Considerations

   The Ethertype is assigned by IEEE 802 to indicate a TRILL
   encapsulated frame.

   The layer 2 multicast address is assigned by IEEE 802 for
   "All-RBridges".

10. Normative References

   [802.1D] "IEEE Standard for Local and metropolitan area networks /
      Media Access Control (MAC) Bridges", 802.1D-2004, 9 June 2004.

   [802.1Q] "IEEE Standard for Local and metropolitan area networks /
      Virtual Bridged Local Area Networks", 802.1Q-2005, 19 May 2006.

   [802.3]

   [ISO10589] ISO/IEC 10589:2002, "Intermediate system to
      Intermediate system routeing information exchange protocol for
      use in conjunction with the Protocol for providing the
      Connectionless-mode Network Service (ISO 8473)," ISO/IEC
      10589:2002.

   [RFC1112] Deering, S., "Host Extensions for IP Multicasting", STD
      5, RFC 1112, Stanford University, August 1989.

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
      Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an
      IANA Considerations Section in RFCs", BCP 26, RFC 2434, October
      1998.

   [RFC2464] Crawford, M., "Transmission of IPv6 Packets over
      Ethernet Networks", RFC 2464, December 1998.

   [RFC2710] Deering, S., Fenner, W., and B. Haberman, "Multicast
      Listener Discovery (MLD) for IPv6", RFC 2710, October 1999.

   [RFC3376] Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A.
      Thyagarajan, "Internet Group Management Protocol, Version 3",
      RFC 3376, October 2002.

   [RFC4020] Kompella, K. and A. Zinin, "Early IANA Allocation of
      Standards Track Code Points", BCP 100, RFC 4020, February 2005.

   [RFC4286] Haberman, B., Martin, J., "Multicast Router Discovery",
      RFC 4286, December 2005.

11. Informative References

   [802.1AB] "IEEE Standard for Local and metropolitan area networks
      / Station and Media Access Control Connectivity Discovery",
      802.1AB-2005, 6 May 2005.

   [802.1AE] "IEEE Standard for Local and metropolitan area networks
      / Media Access Control (MAC) Security", 802.1AE-2006, 18 August
      2006.

   [802.1X] "IEEE Standard for Local and metropolitan area networks /
      Port Based Network Access Control", 802.1X-2004, 13 December
      2004.

   [Arch] Gray, E., "The Architecture of an RBridge Solution to
      TRILL", draft-ietf-trill-rbridge-arch-02.txt, October 2006,
      work in progress.

   [PAS] Touch, J., & R. Perlman, "Transparent Interconnection of
      Lots of Links (TRILL) / Problem and Applicability Statement",
      draft-ietf-trill-prob-01.txt, October 2006, work in progress.

   [RBridges] Perlman, R., "RBridges: Transparent Routing", Proc.
      Infocom 2005, March 2004.

   [RFC4541] Christensen, M., Kimball, K., and F.
      Solensky, "Considerations for Internet Group Management
      Protocol (IGMP) and Multicast Listener Discovery (MLD) Snooping
      Switches", RFC 4541, May 2006.

   [RP1999] Perlman, R., "Interconnection: Bridges, Routers,
      Switches, and Internetworking Protocols", Addison Wesley
      Chapter 3, 1999.

Appendix A: Revision History

   RFC Editor: Please delete this appendix before publication.

Changes from -03 to -04

   1. Divide IANA Considerations section into IANA and IEEE parts.
      Add IANA considerations for TRILL Header variations and
      reserved bit and normative references to RFCs 2434 and 4020.

   2. Add note on the terms RBridge and TRILL to Section 1.2.

   3. Remove IS-IS marketing text.

   4. Split Section 3 into Sections 3 and 4.  Add a new top level
      section "5. Pseudo Code", renumbering following sections.  Move
      pseudo code that was in old Section 3 into Section 4 and make
      Section 3 more textual.  The idea is that Sections 3 and 4 have
      more readable text descriptions with some corner cases left out
      for simplicity while Section 5 has more structured and complete
      coverage.

   5. Revised and extended Security Considerations section.

   6. Move multicast router attachment bit and IGMP membership report
      information from the per VLAN IS-IS instance to the core IS-IS
      instance so the information can be used by core RBridges to
      prune distribution trees.

   7. Remove ARP/ND optimization.

   8. Change TRILL Header to add option feature.  Add option section.

   9. Change TRILL Header to expand Version field to the Variation
      field.  Add TRILL message variations (8 bits) supported to the
      per RBridge link state information.

   10. Distinguish TRILL data and IS-IS messages by using Variation =
      0 and 1.

   11. Consistently state that VLAN pruning and IP derived multicast
      pruning of distribution trees are SHOULD.

   12. Add text and pseudo code to discard TRILL Ethertype data
      frames received on a port that does not have an IS-IS adjacency
      on it.

   13. Specify end station address learning from decapsulated native
      frames.

   14. Add nickname allocation priority and optional nickname
      configuration.  Reserve nickname values zero and 0xFFFF.

   15. Explain about multiple Designated RBridges because of multiple
      VLANs.

   16. Add Incremental Deployment Considerations Section
      incorporating expanded Wiring Closet Topology Section.

   17. Add end station address learning section.

   18. Add more detail on VLAN tag information and material on frame
      priority.

   19. Miscellaneous minor editing and terminology updates.

Changes from -04 to -05

   NOTE: Section 5 was NOT updated as indicated below but the
   remainder of the draft was so updated.

   1. Mention optional VLAN and multicast optimization in Abstract.

   2. Change to distinguish TRILL IS-IS from TRILL data frames based
      on the Inner.MacDA instead of a TRILL Header bit.

   3. Split IP multicast router attached bit in two so you can
      separately indicate attachment of IPv4 and IPv6 routers.
      Provide that these bits must be set if an RBridge does not
      actually do multicast control snooping on ingressed traffic.

   4. Add the term "port VLAN ID" (PVID).

   5. Drop references to PIM.  Improve discussions of IGMP, MLD, and
      MRD messages.

   6. Move M bit over one and create two bit pruning field at the
      bottom of the "V" combined field.

   7. Add pruning control values of V and discussion of same.

   8. Permit optional unicast transmission of multi-destination
      frames when there is only one receiver out a port.

   9. Miscellaneous minor editing and terminology updates.

   NOTE: Section 5 was NOT updated as indicated above but the
   remainder of the draft was so updated.

Disclaimer

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Additional IPR Provisions

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology
   described in this document or the extent to which any license
   under such rights might or might not be available; nor does it
   represent that it has made any independent effort to identify any
   such rights.  Information on the procedures with respect to rights
   in RFC documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention
   any copyrights, patents or patent applications, or other
   proprietary rights that may cover technology that may be required
   to implement this standard.  Please address the information to the
   IETF at ietf-ipr@ietf.org.

   Copyright (C) The IETF Trust (2007).  This document is subject to
   the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Authors' Addresses

   Radia Perlman
   Sun Microsystems

   Email: Radia.Perlman@sun.com

   Silvano Gai
   Nuova Systems

   Email: sgai@nuovasystems.com

   Dinesh G. Dutt
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA 95134-1706

   Phone: +1-408-527-0955
   Email: ddutt@cisco.com

   Donald E. Eastlake, 3rd
   Motorola Laboratories
   111 Locke Drive
   Marlborough, MA 01752 USA

   Phone: +1-508-786-7554
   Email: Donald.Eastlake@motorola.com

Expiration and File Name

   This draft expires in January 2008.

   Its file name is draft-ietf-trill-rbridge-05.txt.