TRILL WG                                                        J. Touch
Internet Draft                                                   USC/ISI
Expires: May 2006                                             R. Perlman
                                                                     Sun
                                                       November 18, 2005


           The Architecture of an RBridge Solution to TRILL
                draft-touch-trill-rbridge-arch-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on May 18, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2005). All Rights Reserved.

Abstract

   RBridges are link layer (L2) devices that use routing protocols as a
   control plane. They combine the link layer ability to allow hosts to
   reattach without renumbering with network layer routing benefits.
   RBridges use existing link state routing to provide higher internal

Touch                     Expires May 18, 2006                  [Page 1]

Internet-Draft            RBridge Architecture             November 2005

   cross-section bandwidth, faster convergence under reconfiguration,
   and more robustness to link interruption than an equivalent set of
   conventional bridges using existing spanning tree forwarding. They
   are intended to apply to similar L2 network sizes as conventional
   bridges and are intended to be backward compatible with those
   bridges as both ingress/egress and transit.
   They also attempt to retain as much 'plug and play' as is already
   available in existing bridges. This document proposes the RBridge
   system as a solution to the TRILL problem. It also defines the
   RBridge architecture, presents its terminology, and describes its
   basic components and desired behavior. A separate document specifies
   the protocols and mechanisms that satisfy the architecture presented
   herein.

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [1].

Table of Contents

   1. Introduction
   2. Background
      2.1. Existing Terminology
      2.2. RBridge Terminology
   3. Components
      3.1. RBridge Device
      3.2. CFT
      3.3. CTT
   4. Functional Description
      4.1. RBridge Campus Autoconfiguration
      4.2. Node Discovery
      4.3. Tunneling
      4.4. [NOTE - the following sections will be completed before
           submitting as an ID]
      4.5. Ingress/Egress Operations
      4.6. Forwarding Operations
      4.7. Broadcast
      4.8. Internal Routing Protocol
         4.8.1. Determining CFT
         4.8.2. Determining CTT
      4.9. External Protocols
         4.9.1. Outgoing BPDU Interactions
         4.9.2. Incoming BPDU Interactions
            4.9.2.1. Transparent-STP
            4.9.2.2. Participate-STP
            4.9.2.3. Block-STP
   5. How RBridges Address TRILL
   6. [ANY OTHER SECTIONS MISSING?]
   7. Security Considerations
   8. IANA Considerations
   9. Conclusions
   10. Acknowledgments
      10.1. Normative References
      10.2. Informative References
   Author's Addresses
   Intellectual Property Statement
   Disclaimer of Validity
   Copyright Statement
   Acknowledgment

1. Introduction

   This document describes an architecture that addresses the TRILL
   problem and applicability statement [3]. This architecture is
   composed of a set of devices called RBridges that coordinate inside
   an Ethernet link subnet to create a single, virtual device composed
   of an overlay of tunnels using link state routing. RBridges thus
   support increased internal bandwidth and fault tolerance when
   compared to conventional Ethernet bridges, which forward frames via
   a single spanning tree, while still being compatible with bridges
   and hubs.

   The remainder of this document outlines the architecture of RBridges
   and describes their components and functions.
   Note that this document is not intended to represent the only
   solution to the problem statement, nor does it specify the protocols
   that instantiate this architecture or imply that only one such set
   of protocols is prescribed. Alternative solutions would be described
   in other architecture documents, and the protocols would be
   described in separate specification documents.

2. Background

   The RBridge architecture is based on the system described in the
   Infocom paper of the same name [2]. That paper describes the rbridge
   system as a specific instance; this document abstracts the
   architectural features only.

   The remainder of this section describes the terminology of this
   document, which may differ from that of the original paper.

   [NOTE: terminology needs to be checked as being consistent
   throughout]

2.1. Existing Terminology

   o  802.1: IEEE Specification for Ethernet, i.e., including hubs and
      switches.

   o  802.1D: IEEE Specification for bridged Ethernet, including the
      BPDUs that manage the spanning tree protocol.

   o  Bridge: an Ethernet (L2, 802.1D) device with multiple
      point-to-point ports that receives incoming frames on a port and
      repeats them on some or all of the other ports; bridges support
      both port learning and BSTP.

   o  Bridge Protocol Data Unit (BPDU): the frame that bridges exchange
      to coordinate their configuration; used primarily [QUESTION:
      exclusively?] for BSTP.

   o  Bridge Spanning Tree (BST): an Ethernet (L2, 802.1D) forwarding
      protocol based on the topology of a spanning tree.

   o  Bridge Spanning Tree Protocol (BSTP): an Ethernet (802.1D)
      protocol for establishing and maintaining a single spanning tree
      among all the bridges on a logical segment.

   o  Frame: here refers to an Ethernet (L2) unit of transmission,
      including the header, data, and trailer.
   o  Hub: an Ethernet (L2, 802.1) device with multiple point-to-point
      ports which transparently repeats frames that arrive on a port to
      all other ports.

   o  Node: a device with an L2 address that sources and/or sinks L2
      frames; nodes can occur outside the RBridge campus or be used as
      transits within the campus.

   o  Packet: here refers to an IP (L3, RFC791) unit of transmission,
      including header and data.

   o  Port learning: the process by which a switch or bridge determines
      on which single outgoing port to copy an incoming frame.

   o  Router: an IP (L3) forwarding device; RBridges cannot span
      routers.

   o  Segment: an Ethernet link, either a single physical link or
      emulation thereof (e.g., via hubs) or a logical link or emulation
      thereof (e.g., via bridges).

   o  Subnet, Ethernet: a single segment, or a set of segments
      interconnected by an rbridge campus; in the latter case, the
      subnet may or may not be equivalent to a single segment.

   o  Switch: an Ethernet (L2, 802.1) device with multiple
      point-to-point ports which repeats frames that arrive on a port
      to all or some of the other ports; switches include port
      learning.

2.2. RBridge Terminology

   o  Campus (RBridge Campus): a set of rbridges acting in unison.

   o  Campus Forwarding Table (CFT): the per-hop forwarding table
      populated by the rbridge IGP; lookups use the CTH found inside
      the outermost received L2 header, rather than that L2 header
      itself.

   o  Campus Transit Header (CTH): a 'shim' header that encapsulates
      the ingress L2 frame and persists throughout the transit of a
      campus, and which is further encapsulated in a hop-by-hop L2
      header/trailer.

   o  Campus Transit Table (CTT): a table that maps ingress L2
      destinations to egress RBridge addresses, used to encapsulate
      ingress frames for transit of the campus.

   o  Designated RBridge: the rbridge associated with ingress and
      egress traffic to a particular Ethernet link; that rbridge is
      such a link's 'designated rbridge'.
   o  Edge (of a campus): describes rbridges that transit traffic from
      outside the campus to inside, and vice-versa.

   o  Egress rbridge: relative to traffic transiting an rbridge campus,
      the egress rbridge is the rbridge where that external traffic is
      decapsulated and exits the campus.

   o  External (to a campus): describes communication on the
      non-rbridge components of an Ethernet link subnet, notably not
      between rbridges or between non-rbridges and rbridges.

   o  Ingress rbridge: relative to traffic transiting an rbridge
      campus, the ingress rbridge is the rbridge where that external
      traffic entered the campus and was encapsulated.

   o  Internal (to a campus): describes communication between rbridge
      devices; this communication may traverse external devices, but is
      still considered internal.

   o  Inside (to a campus): see Internal.

   o  Outside (to a campus): see External.

   o  Rbridge: a single rbridge device which can aggregate with other
      rbridge devices to create a campus.

   o  [AGAIN, AS WITH THE P&AS, IS THE CAMPUS THE SET OF RBRIDGES OR
      THE LINK SUBNET IN WHICH THE RBRIDGE RESIDES?]

3. Components

   An RBridge campus is composed of rbridge devices; all other Ethernet
   link subnet devices, such as bridges, hubs, and nodes, operate
   conventionally in the presence of an rbridge.

3.1. RBridge Device

   An rbridge is a bridge-like device that forwards frames on an
   Ethernet link subnet. It has one or more Ethernet ports which may be
   wired or wireless; the particular physical layer is not relevant. An
   rbridge is defined more by its behavior than its structure, although
   it contains two tables which distinguish it from conventional
   bridges.

   Conventional bridges contain a learned port table (LPT) and a
   spanning tree table (STT). The LPT allows a bridge to avoid
   broadcasting all received frames, as is required for a hub.
   The bridge learns which nodes are accessible from a particular port
   by assuming bidirectionality: the source addresses of incoming
   frames indicate that the incoming port is to be used as output for
   frames destined to that address. Incoming frames are checked against
   the LPT and forwarded to the particular port if a match occurs;
   otherwise they are broadcast out all ports except the incoming port.
   The STT indicates the ports used in the spanning tree.

   [TO BE COMPLETED] [what are the actual names of LPT and STT?]

   RBridges, by comparison, have a Campus Forwarding Table (CFT) and a
   Campus Transit Table (CTT), described in the following sections.

3.2. CFT

   The CFT is a forwarding table for internal traffic, allowing
   tunneled traffic to transit the campus from ingress to egress. The
   size of a fully-populated CFT is bounded by the number of egress
   rbridges. Rbridges may have separate CFTs for each VLAN, if separate
   VLANs are supported by manual configuration.

   The CFT is continually maintained by the internal routing protocol
   (see Section ??).

3.3. CTT

   The CTT determines how incoming external traffic will be
   encapsulated, and indicates the Ethernet link address of the egress
   of the rbridge campus. The CTT can be considered a version of the
   LPT that operates across the rbridge campus as a whole. It is
   configured in much the same way as the LPT: by snooping incoming
   traffic, and assuming bidirectionality (see Section 4.8.2). The
   information is learned at the egress rbridge and propagated to all
   possible ingresses using an internal routing protocol (also Section
   4.8.2).

   The CTT may be populated on-demand or a-priori, and may be as large
   as the number of nodes on the Ethernet subnet. Rbridges may have
   separate CTTs for each VLAN, if separate VLANs are supported by
   manual configuration.

4. Functional Description

   The architecture of an rbridge is largely defined by its behavior;
   the physical components are minimal, as outlined in Section 3. The
   following sections describe that behavior.

4.1. RBridge Campus Autoconfiguration

   RBridges self-organize to compose a single RBridge campus system.
   Consider first a set of bridges on a single Ethernet link subnet
   (Figure 1). Here bridges are shown as 'b', hubs as 'h', and nodes as
   'N'; bridges and hubs are numbered. Note that the figure does not
   distinguish between types of nodes, i.e., hosts and routers; both
   are just nodes at the link layer, and are otherwise
   indistinguishable. The bridges organize into a single spanning tree,
   as shown by double lines ('=', '||', and '//') in the figure.

       N        N---b3---N
       |            ||
       |            ||
   N---h1--b4===b5==h2==b6
           |   //   |   ||
           |  //    N   ||
           | //         ||
       N---b7====b8-----b9-----N
           |     |\
           |     | \
           N     N  N

          Figure 1 Conventionally bridged Ethernet link subnet

   It is useful to notice that hubs are transparent to bridges, both
   for traffic from nodes to bridges (h1) and for traffic between
   bridges (h2). Also note that the same hub can support traffic
   between bridges and from a host to a bridge (h2), but that the
   spanning tree is exclusively between bridges. Bridges are thus
   compatible with hubs, both as transits and ingress/egress.

   An RBridge campus has a similar spirit, and can be viewed as a
   variant of the way bridges self-organize. Figure 2 shows the same
   topology where some of the bridges are replaced by rbridges. In this
   figure, stars ('*') represent the paths the rbridges are capable of
   utilizing, due to the use of link state routing. Rbridges can tunnel
   directly to each other (r4-r5), or through hubs (h2) or bridges
   (b8).
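
   As a rough illustration of how link state routing lets each rbridge
   choose among the starred paths, the following Python sketch runs a
   shortest-path computation over a small tunnel graph and derives the
   first hop toward every other rbridge, which is the kind of
   information a CFT would hold. The adjacency, the unit link costs,
   and the name next_hops are assumptions made for this example; this
   document does not specify them, and the internal routing protocol
   is left to a separate specification.

```python
import heapq

# Hypothetical tunnel adjacency among rbridges, loosely following the
# labels used in Figure 2. Unit costs are an assumption for illustration.
TUNNELS = {
    "r4": {"r5": 1, "r7": 1},
    "r5": {"r4": 1, "r6": 1, "r7": 1},
    "r6": {"r5": 1, "r9": 1},
    "r7": {"r4": 1, "r5": 1, "r9": 1},
    "r9": {"r6": 1, "r7": 1},
}


def next_hops(graph, source):
    """Dijkstra from source; return {destination: first hop on a shortest path}."""
    dist = {source: 0}
    first_hop = {}
    heap = [(0, source, None)]
    while heap:
        cost, node, hop = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry; a cheaper path was already found
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                # When leaving the source, the neighbor itself is the first hop;
                # otherwise the first hop is inherited from the path so far.
                first_hop[neighbor] = neighbor if hop is None else hop
                heapq.heappush(heap, (new_cost, neighbor, first_hop[neighbor]))
    return first_hop


if __name__ == "__main__":
    for egress, hop in sorted(next_hops(TUNNELS, "r4").items()):
        print(f"from r4 to {egress}: forward via {hop}")
```

   Unlike a single spanning tree, each rbridge computes its own set of
   shortest paths in this sketch, so links that a spanning tree would
   block remain usable.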

       N        N---b3---N
       |            ||
       |            ||
   N---h1--r4***r5**h2**r6
           *   *    |   *
           *  *     N   *
           * *          *
       N---r7****b8*****r9-----N
           |     |\
           |     | \
           N     N  N

               Figure 2 RBridged Ethernet link subnet

   Every node is considered to have a primary point of attachment to
   the rbridge campus, as defined by its designated rbridge. Each
   Ethernet link segment attached to an rbridge campus has a single
   designated rbridge; that rbridge is where all of the segment's
   traffic that transits the campus enters and exits. In Figure 2, it
   is easy to see that the nodes off of h1 must attach at r4; the nodes
   off of b3, however, attach at either r5 or r6, depending on which is
   the designated rbridge.

   Without loss of generality, an rbridge topology can be reorganized
   (ignoring link length) such that all nodes, hubs, and bridges are
   arranged around the periphery, and all rbridges are considered
   directly connected by their tunnels (Figure 3). Note that this view
   ignores the ways in which hubs and bridges may serve both on the
   ingress/egress and for transit, so it is not used for traffic
   analysis. It is a more convenient functional view because it is
   easier to see the 'inside' and 'outside' of the campus clearly. The
   devices can easily distinguish between inside and outside traffic on
   shared devices, such as h2 and b8, because internal traffic content
   is hidden from these devices by the tunnel encapsulation header.

       N        N---b3---N
       |            ||
       |            ||
       |            h2
       |           /| \
       |          / N  \
       |         /      \
   N---h1--r4***r5******r6
           *   *        *
           *  *         *
           * *          *
       N---r7***********r9-----N
             \         /|\
              \       / | \
               \     /  N  N
                \   /
                 \ /
                  b8
                  |
                  N

            Figure 3 Reorganized RBridge Ethernet link subnet

4.2. Node Discovery

   [THIS SECTION TO BE COMPLETED BEFORE SUBMISSION AS AN ID]

4.3. Tunneling

   Rbridges exchange internal traffic with each other over tunnels.
   These tunnels use an Ethernet link layer header, together with a
   shim header; it is the combination of these headers that
   distinguishes interior traffic from exterior.

   The link header includes source and destination addresses, which
   typically identify the ingress and egress rbridges. For incoming
   multicast and broadcast traffic, one of these addresses may
   represent the multicast group or broadcast address. Additionally,
   these addresses may be VLAN-specific, i.e., each ingress and egress
   rbridge may have a separate address per VLAN.

   The additional shim header is required to support loop prevention
   for internal traffic; external traffic loops are prevented by the
   spanning trees that remain on the ingress/egress links. This shim
   header may record the rbridge transit route, a hopcount, or a
   timestamp to prevent indefinite looping of a frame.

   The encapsulation should clearly identify the traffic as rbridge
   traffic, i.e., the outer Ethernet header should use a protocol
   number unique to rbridges.

4.4. [NOTE - the following sections will be completed before submitting
   as an ID]

4.5. Ingress/Egress Operations

4.6. Forwarding Operations

4.7. Broadcast

4.8. Internal Routing Protocol

4.8.1. Determining CFT

4.8.2. Determining CTT

4.9. External Protocols

   The rbridge campus may participate in Ethernet link protocols,
   notably the spanning tree protocol (STP) on the ingress/egress
   links. There are three variants; it is anticipated that only one of
   these variants would be supported by an instance of the rbridge
   architecture. All rbridges within a single campus must use the same
   variant for interacting with external protocols.

4.9.1. Outgoing BPDU Interactions

   All three variants describe reactions of an rbridge campus to
   incoming BPDUs, but do not preclude preemptive emission of BPDUs by
   a campus to the external segments.
   Such BPDUs might indicate that each designated rbridge is the root
   of its corresponding external segment's spanning tree, which may be
   necessary for proper overall configuration.

4.9.2. Incoming BPDU Interactions

   There are three ways in which an rbridge campus may interact with
   incoming BPDUs. The first two of these, Transparent-STP and
   Participate-STP, cause the rbridge campus to emulate the behavior of
   known Ethernet devices. In the last variant, Block-STP, the campus
   does not emulate a known Ethernet device. Block-STP may have
   substantial benefits for the stability of the spanning trees on
   external links, but may also require that the rbridge campus be the
   only Ethernet device system permitted this privilege - as enforced
   at the protocol level.

4.9.2.1. Transparent-STP

   An rbridge campus may internally broadcast spanning tree messages
   (BPDUs) arriving at designated rbridges, emitting one copy on each
   egress link. Such an rbridge campus is said to support
   'Transparent-STP', and that campus emulates a single hub connected
   to each link at the designated rbridge. Because hubs are compatible
   with bridges running STP, a transparent-STP campus is similarly
   compatible.

   Transparent-STP reduces the complexity of the spanning tree in an
   Ethernet link subnet because rbridges do not directly participate in
   the spanning tree. It still requires that BPDUs be broadcast
   throughout the rbridge campus, which can cause the external spanning
   tree protocol to be delayed until the rbridge campus is configured.
   The cost of these broadcasts can be reduced by use of an efficient
   internal routing protocol (e.g., supporting broadcast), but the cost
   is higher than in unicast (e.g., Participate-STP) and blocking
   (e.g., Block-STP) schemes.

4.9.2.2. Participate-STP

   An rbridge campus may interpret BPDUs received at its ingress
   rbridges and emit new BPDUs at its egress rbridges to reflect its
   internal default spanning tree. A campus is expected to have one or
   more internal spanning trees, e.g., to use for broadcast traffic,
   but these trees are computed by the internal routing protocol rather
   than STP. Participate-STP allows this internal spanning tree to be
   spliced to the spanning trees on external segments.

   A participate-STP campus emulates a single bridge device, just as
   transparent-STP emulates a single hub device. Participate-STP is
   similarly compatible with existing bridges and hubs, although the
   resulting Ethernet subnet spanning tree may be different. Here too,
   as with transparent-STP, the benefit to spanning tree scalability
   lies in the (presumably) more efficient and stable computation of
   the internal spanning tree using the internal routing protocol.

   As with transparent-STP, the external spanning trees are affected by
   waiting for the internal spanning tree to be computed. In this case,
   there are concerns about a 'chicken and egg' problem: the external
   spanning trees need to configure before they can transit traffic,
   notably the traffic between rbridges that carries the internal
   routing protocol - which itself is used to determine the internal
   spanning tree. This case must be addressed in the protocol if this
   variant is selected.

4.9.2.3. Block-STP

   An rbridge campus may completely block BPDUs received at its ingress
   rbridges.

5. How RBridges Address TRILL

   [this section should go through the TRILL requirements; the TRILL
   P&AS should have a list in the conclusions that we can check-off.]

6. [ANY OTHER SECTIONS MISSING?]

7. Security Considerations

   Add any security considerations [didn't get around to this YET]

8. IANA Considerations

   This document has no direct IANA considerations.
   It does suggest, in Section ??, that protocols that instantiate the
   rbridge architecture use a shim header as a wrapper on the payload
   of internal traffic. This shim header should be identified by a new
   protocol type in the tunneled Ethernet link header. This protocol
   type, as an identifier in an 802 header, should probably be
   allocated by the IEEE and coordinated with IANA.

9. Conclusions

   [TO BE COMPLETED]

10. Acknowledgments

   [TO BE COMPLETED]

10.1. Normative References

   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

10.2. Informative References

   [2]  Perlman, R., "RBridges: Transparent Routing", Proc. Infocom
        2004, March 2004.

   [3]  Touch, J., (ed.) "Transparent Interconnection of Lots of Links
        (TRILL): Problem and Applicability Statement",
        draft-touch-trill-prob-00.txt, Nov. 17, 2005.

Author's Addresses

   Joe Touch
   USC/ISI
   4676 Admiralty Way
   Marina del Rey, CA 90292-6695
   U.S.A.

   Phone: +1 (310) 448-9151
   Email: touch@isi.edu
   URL:   http://www.isi.edu/touch

   Radia Perlman
   Sun Microsystems

   Email: Radia.Perlman@sun.com

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2005).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.