INTERNET-DRAFT Mingui Zhang Intended Status: Informational Huawei Expires: December 30, 2011 June 28, 2011 To Address the Space Limitation of Inner VLAN draft-zhang-trill-vlan-extension-01.txt Abstract TRILL is originally designed for customer networks. When it is used as a provider network to interconnect multiple separate users, the space of current Inner VLAN becomes insufficient. As an easy way, a new VLAN ID field is added to the Inner Ethernet frame. The new VLAN ID can be use in conjunction with the previous defined Inner VLAN. This mechanism is usually known as stacked VLANs(802.1ad, or QinQ). The stacked VLANs mechanism is analyzed in this document. The method to introduce a single longer VLAN field to current TRILL header is also analyzed in this document. In addition, a novel MAC address rewriting mechanism is also brought forward to break the space limitation of the Inner VLAN. The MAC addresses of the native frame are not used for forwarding process in transit RBridges. When the packet enters the TRILL network, the ingress RBridge (Appointed Forwarder) uses a Virtual MAC (VMAC) address to replace the real MAC and embedded the native VLAN ID into this virtual MAC address. When a packet egresses from the TRILL network, the egress RBridge will use the real MAC address to replace the virtual MAC address and deliver it to the destination on the local link. Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html Mingui Zhang Expires December 30, 2011 [Page 1] INTERNET-DRAFT VLAN Extension June 28, 2011 Copyright and License Notice Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . . 3 2. VLAN Stacking . . . . . . . . . . . . . . . . . . . . . . . . . 4 3. Single Longer VLAN Field . . . . . . . . . . . . . . . . . . . . 6 4. MAC Rewriting . . . . . . . . . . . . . . . . . . . . . . . . . 6 4.1. Protocol Extension . . . . . . . . . . . . . . . . . . . . 7 4.2. Packet Forwarding Scenarios . . . . . . . . . . . . . . . . 7 4.3. Host Mobility . . . . . . . . . . . . . . . . . . . . . . 10 4.4. Comparison . . . . . . . . . . . . . . . . . . . . . . . 10 5. Security Considerations . . . . . . . . . . . . . . . . . . . 12 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12 7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 12 7.1. Normative References . . . . . . . . . . . . . . . . . . 12 7.2. Informative References . . . . . . . . . . . . . . . . . 12 Author's Addresses . . . . . . . . . . . . . . . . . . . . . . . 12 Mingui Zhang Expires December 30, 2011 [Page 2] INTERNET-DRAFT VLAN Extension June 28, 2011 1. Introduction When multiple independent users run their own VLANs on the same TRILL network, TRILL has to configure one VLAN for each customer. However, the single Inner VLAN tag of TRILL can only identify 4094 VLANs which is not enough for multi-tenant scenarios. To address this scalability issue, the first way that comes to our mind is to add a new VLAN tag to TRILL header. The newly added VLAN tag and the original Inner VLAN tag forms the stacked VLANs (QinQ or IEEE 802.1ad) which are widely used in provider bridging networks. IEEE 802.1ad requires minimal cooperation between each customer and the service provider [802.1ad]. As a mature Ethernet networking standard, QinQ is to be easily accepted as the way of VLAN extension for TRILL. The second way to extend the VLAN space limitation is to include a new field with more bits than current Inner VLAN. The MAC rewriting technique analyzed in this draft is the third way for TRILL to break the above limitation. It does not add extra bits to the frame, therefore the possibility of fragmentation and reassemble is lower than the other two methods. Traditional switches use the destination MAC address to look up the outgoing interface to the next hop. RBridges work in a different way. The MAC addresses are not used when transit RBridges are forwarding the frames across the TRILL network. In the MAC address rewriting method, a ingress RBridge writes the original VLAN tag into the MAC address field of the original frame; the egress RBridge write the MAC address back and deliver the packet to its destination host. The formerly Inner VLAN defined by TRILL will be used as the second VLAN tag in the stacked VLAN tags. This method does not change the header of TRILL and the transit RBridges need not be upgraded to support a provider network for multiple tenants environment. 1.1. Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. Mingui Zhang Expires December 30, 2011 [Page 3] INTERNET-DRAFT VLAN Extension June 28, 2011 2. VLAN Stacking +--------+ +--------+ +--------+ | Etype | | Etype | | Etype | +--------+ +--------+ +--------+ |CVLAN 10| |SVLAN 50| |CVLAN 10| +--------+ +--------+ +--------+ . | Etype | . . +--------+ . @@@@@@@@@@@@@@@@@ |CVLAN 10| @@@@@@@@@@@@@@@@@ @[USER1,VLAN 10]@ +--------+ @[USER1,VLAN 10]@ @@@@@@@@@@@@@@@@@ . @@@@@@@@@@@@@@@@@ @ . @ (0)------+ +-------+ +------(2) | | | | | | | RB1 (2)==(0) RB2 (1)==(0) RB3 | | | | | | | (1)------+ +-------+ +------(1) # . # ################# . ################# #[USER2,VLAN 10]# +--------+ #[USER2,VLAN 10]# ################# | Etype | ################# . +--------+ . . |SVLAN 60| . +--------+ +--------+ +--------+ | Etype | | Etype | | Etype | +--------+ +--------+ +--------+ |CVLAN 10| |CVLAN 10| |CVLAN 10| +--------+ +--------+ +--------+ Figure 2.1: An example of stacked VLANs used for multiple tenants scenario Take Figure 2.1 as an example, traffic from two different users is to be carried on the same TRILL network. Both of them choose the same customer VLAN 10. In order to segregate their traffic, the edge RBridge can assign a single, unique SVLAN ID for each user. User1 gets SVLAN 50 and user2 gets SVLAN 51. +-----------+-----+-----+---------+ |Inner.MacDA|SVLAN|CVLAN| Egress | +-----------+-----+-----+---------+ | x | 50 | 10 | RB3 | +-----------+-----+-----+---------+ | y | 51 | 10 |local/if1| +-----------+-----+-----+---------+ ...... Figure 2.2: The end station learned by RB1 Mingui Zhang Expires December 30, 2011 [Page 4] INTERNET-DRAFT VLAN Extension June 28, 2011 However, the stacked VLANs mechanism (also known as QinQ) defined in [802.1ad] is currently not supported by TRILL. The native Inner Ethernet frame format is limited to 802.1q [TRILLbase]. When the limitation is relaxed to include the frame format of 802.1ad, the tagging function (to push or pop the S-tag) is required to be added to edge RBridges. On the data plane, the ingress RBridge is required to be able to use S-tag and C-tag together to determine the egress RBridge for a known unicast frame. A hypothetical example of a MAC table is given in Figure 2.2 where RB1 may look up the destination MAC and the stacked VLAN IDs to determine the egress RBridge or local egress port. The first entry is for the end node from user1 network attacked to RB3 while the second entry is for the end node from user2 attached to RB1. When the known unicast frame arrives at the egress RBridge, it will be delivered to its destination which belongs to the VLAN identified by S-tag and C-tag. The MAC addresses will not be processed by transit RBridges which overcomes the drawback of QinQ which bears a scalability issue within the core of the carrier network, where every core switch needs to learn and maintain forwarding entries for every customer MAC address [HVALN]. +-----+-----+-----+---------+ |DT-ID|SVLAN|CVLAN|OUT/IF(s)| +-----+-----+-----+---------+ | RB1 | 50 | 10 | 0, 1 | +-----+-----+-----+---------+ | RB1 | 51 | 10 | 0, 1 | +-----+-----+-----+---------+ Figure 2.3: The distribution tree table of RB2 Transit RBridges for known unicast frames are VLAN unaware unless VLAN mapping is intentionally performed. For multicast frames, even the transit RBridges need to examine the value of S-tag and C-tag since the pruning of the distribution tree is VLAN-based. Figure 2.3 shows the distribution tree table of the transit RBridge RB2. The SVLAN and CVLAN IDs are read from the Inner Ethernet frame. The root (Distribution Tree IDentifier, DT-ID) of the distribution tree is identified by the egress nickname of the multicast native frame. Here, we suppose the distribution tree is RB1-RB2-RB3 and RB1 is the root. On the control plane, all the operations that are single VLAN tag based before should be changed into stacked VLANs based now. These operations include the multicast pruning, the end station address learning, the forwarder appointment, the VLAN mapping and so on. If QinQ is adopted, the TRILL header is retained. The native Inner Mingui Zhang Expires December 30, 2011 [Page 5] INTERNET-DRAFT VLAN Extension June 28, 2011 Ethernet frame has more bits than before. This increases the possibility of generating oversize frames in TRILL which may cause fragmentation of the PDUs and loss of TRILL data frames. 3. Single Longer VLAN Field +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | TVLAN ID (24 bits) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 3.1: The TRILL VLAN identifier (TVLAN ID, TVID) The most immediate way to extend the VLAN space is to open up a longer field for VLAN identification. No matter what the user is, how many VLANs each user has, just assign a unique TVLAN ID (TVID) to each of these VLANs. As shown in Figure 3.1, a flat VLAN space is created instead of a hierarchical space in VLAN stacking. Different to the Q-tag, priority does not appear in this field. With TVLAN, TRILL needs to segregate all services (VLANs) from all the customers together. When two different tenants use the same VLAN ID, the edge RBridges ought to map them to two different TVIDs. For the scenario in Figure 2.1, VLAN 10 for User1 is mapped to TVID 50 while VLAN 10 for User2 is mapped to TVID 60. By this way, the overlapped VLAN IDs are separated. Core RBridges forward frames based on the TRILL header only while the VLAN ID from the customers will be concealed to the core network. The main benefit is that the core TVLAN tagging can be manipulated independently, therefore customer Ethernet native frames are carried transparently and securely by the core RBridge network. Similar to the VLAN stacking solution, in order to extend the numbering space of VLAN, the solution introduced here increases a bit overhead to the frame efficiency. With VLAN stacking, each customer controls its own VLAN numbering space which is independent of the space of other customers and that of the provider network. Although the hierarchical numbering space is the same as that of the flat numbering method, the practical numbering spaces are quite different. VLAN stacking can support 4094 users but most of the users have less than 4094 VLANs. Unused VLAN number for one user can not be used by another, which greatly reduces the numbering space compared to the flat numbering method. 4. MAC Rewriting The inner MAC address of the Ethernet native frame is not used by transit RBbridges, which creates the opportunity to revise the MAC field to deliver the VLAN tag. When a native frame enters the core Mingui Zhang Expires December 30, 2011 [Page 6] INTERNET-DRAFT VLAN Extension June 28, 2011 RBridge network, its source MAC addresses will be rewritten by the edge RBridges to include a VLAN ID. The rewritten MAC is named as VMAC (Virtual MAC). When the TRILL frame with source VMAC egresses from the core RBridge network and arrives at the destination. The destination end station will learn this source VMAC and use it for data forwarding destined to the sender. An IP address to MAC address resolution will be replaced with an IP address to VMAC address resolution in the MAC rewriting mechanism. When the frame destined to the VMAC egresses from the core RBridge network, the VMAC will be mapped back to its real MAC by the edge RBridge who will take into charge of forwarding the frame to its destination on the local link. This mechanism avoid the revision of TRILL header and introduction of new Inner Ethernet native frame format. The ASICs of RBridges does not need to change much. Edge RBridges need to be configured to realize the MAC<->VMAC mapping. There are two possible ways to realize the mapping between MAC and VMAC: A. sequentially numbering the hosts according to its bootstrapping time, set up a mapping database; B. a hash of the hosts' original MAC address, e.g., directly use the lower 3 bytes to fill in the lower 3 bytes of VMAC The bootstrapping becomes different: Previously: Bootstrapping means the storage of a host's MAC; Now: Bootstrapping means two steps: 1. Assign a VMAC to the real MAC of the host. 2. Store the attached host's VMAC 4.1. Protocol Extension A 24-bit Raw VLAN ID field is used to break the VLAN space limitation in TRILL networks. This field is named as Native VLAN ID field. The Inner VLAN ID originally defined by TRILL will be used us the second VLAN ID in the stacked VLANs mechanism. It is recommended that 3 bytes are used for native VLAN IDs and 3 bytes are used for host identification. This partition can be tuned if necessary. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Native VLAN ID (3 bytes) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Inner Destination Mapped MAC Address (3 bytes)| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 3.1: The Virtual Inner Destination MAC address (dubbed as VMAC) 4.2. Packet Forwarding Scenarios Mingui Zhang Expires December 30, 2011 [Page 7] INTERNET-DRAFT VLAN Extension June 28, 2011 (DA=VMAC_b) (DA=VMAC_b) (DA=VMAC_b) (DA=MAC_b)* | | | | | V V | V +-----+ +-----+ V [Host a]------>| RB1 |==============>| RB2 |------>[Host b] ^ +-----+ +-----+ ^ | ^ ^ | | | | | (SA=MAC_a) (SA=VMAC_a)* (SA=VMAC_a) (SA=VMAC_a) Figure 3.2: A unicast packet forwarding +-----------+-------+---------+ |Inner.MacDA|VLAN ID|Egress RB| +-----------+-------+---------+ | Hb.MAC | 10 | RB2 | +-----------+-------+---------+ Figure 4.4: Edge RBridge MAC table +-+-----+-----------+-------+---------+ |T|IN/IF| Egress RB |VLAN ID|OUT/IF(s)| +-+-----+-----------+-------+---------+ |U| if/1| RB2 | 20001 | if/2 | +-+-----+-----------+-------+---------+ Figure 4.4: Transit RBridge forwarding table Suppose the source MAC address of a is MAC_a. Host a looks up the ARP cache and find that MAC address of the destination is VMAC_b. The frame will be send to RB1 which is a's ingress RBridge. RB1 will replace the Inner.MacSA with VMAC_a according to its local mapping system. This Inner.MacSA will finally be learned by RB2 which is the egress RBridge. When RB2 receives the frame with VMACs, the VMAC_b will be replace with MAC_b which is the real MAC address of host b. Of course, the above egress RBridge has already announced the VMAC to the above ingress RBridge before hand. Therefore the ingress RBridge will directly use the announced VMAC to forwarding the frame. In fact, the ingress RBridge does not know the real MAC or the destination host, since ARP will translate IP to VMAC instead the real MAC. By default, for unicast packets, the VLAN ID in the source VAMC and the Destination VMAC should be the same. Mingui Zhang Expires December 30, 2011 [Page 8] INTERNET-DRAFT VLAN Extension June 28, 2011 (DA=BCast) (DA=VMCast)* (DA=VMCast) (DA=VMCast) | | | | | V V | V +-----+ +-----+ V [Host a]------>| RB1 |==============>| RB2 |------>[Host b] ^ +-----+ +-----+ ^ | ^ ^ | | | | | (SA=MAC_a) (SA=VMAC_a)* (SA=VMAC_a) (SA=VMAC_a) Figure 3.3: A broadcast packet forwarding from Host a to Host b Sometimes, a host need to send out broadcast frames, such as ARP query messages and ARP miss frames. The Inner.MacDA of broadcast frames are FF-FF-FF-FF-FF-FF. In Figure 2.2, the broadcast process is depicted. This Inner.MacDA will be replaced with 20001:FF-FF-FF and will be multicasted across the TRILL network according to the base protocol of TRILL [TRILLbase]. Suppose Inner.MacSA is EE-EE-EE-EE-EE- 01. It will be replaced with 20001:EE-EE-01 at the ingress RBridge. Host b will learn host a's VMAC (20001:EE-EE-01) when it receives this broadcast frame and insert it into its ARP cache. (DA=VMAC_b) (DA=VMAC_b) (DA=VMAC_b) (DA=MAC_b)* | | | | | V /\/\/\/\/\/\ V | V +-----+ / \ +-----+ V [Host a]------>| RB1 |< >| RBx |------>[Host b] ^ +-----+ \ / +-----+ ^ | ^ \/\/\/\/\/\/ ^ | | | | | (SA=MAC_a) (SA=VMAC_a)* (SA=VMAC_a) (SA=VMAC_a) Figure 3.4: A unknown unicast packet forwarding from Host a to Host b For unknown unicast frame, the forwarding process is the same as known unicast only that RB1 has to multicast the unknown frame on to the TRILL network. Suppose host b is bootstrapped to RBx. RBx will finally egress the unknown unicast frame. Mingui Zhang Expires December 30, 2011 [Page 9] INTERNET-DRAFT VLAN Extension June 28, 2011 4.3. Host Mobility [Ha] :::::> [Ha] | | +-----+ +-----+ | RB1 | | RB2 | +--+--+ +--+--+ | +-----+ | +--| RBx |--+ +-----+ | [Hx] Figure 3.3: Host a moves from RB1 to RB2. When a host is moved to another port of RBridge or moved to a port of another RBridge, the VMAC used for layer2 data forwarding is changed since the new home RBridge may assign a different VMAC for this host. The black-hole issue is now replaced with an ARP outdated issue. There are several ways to solve this problem: 1. The host immediately sends out gratuitous ARP to bootstrap to the new home RBridge and update the ARP cache of the remote host Hx; New home RBridge notify other RBridges the the new VMAC and other RBridges will send the frame to the new home RBridge; 2. The old home RB1 need query all other RBridges through multicast using the manufacturer-assigned MAC address of Host a to check whether any had seen this before. The one (RB2) that has seen Host a will reply to the old home RB1. Before handover completes, old home RB1 take in charge of forwarding to RB2 and RB2 forward the packet to the host. After the handover finishes, the RB1 will send withdraw to null the address in RBx. 3.If the host has moved to another position, the home RBridge will send out the received packets to this host as unknown unicast packets with the VMAC be replaced with broadcast VMAC. This action continues until this home switch received the new location of the moved host. This mapping will be send out when the moved host finished the bootstrapping after movement. This can be achieved through gratuitous ARP. Gratuitous ARP is restricted in the local link. The new home switch will send out this mapping to outdate the old home switch. It can spread this bootstrapping information using TRILL ESADI mechanism. 4.4. Comparison The edge RBridges need to do mapping between original MAC and VMAC. The transit RBridges do not need to process the VMAC field. Only the Mingui Zhang Expires December 30, 2011 [Page 10] INTERNET-DRAFT VLAN Extension June 28, 2011 RBridges need to read the MAC field to get the VLAN ID. Can work together with the VLAN stacking solution or the Single Longer VLAN Field solution. MAC rewriting does not increase the possibility of fragmentation and reassembling since no additional field is introduced into TRILL header nor the Inner Native frame, which increases the frame encapsulation efficiency compared to the above two methods. Mingui Zhang Expires December 30, 2011 [Page 11] INTERNET-DRAFT VLAN Extension June 28, 2011 5. Security Considerations This document does not introduce additional security issues to TRILL networks. 6. IANA Considerations This document requires no IANA actions. 7. References 7.1. Normative References [TRILLBase] R. Perlman, D. Eastlake, D.G. Dutt, S. Gai and A. Ghanwani, "RBridges: Base Protocol Specification", draft-ietf-trill-rbridge-protocol-16.txt, working in progress. [802-1ad] "IEEE Standard for Local and Metropolitan Area Networks - Virtual Bridged Local Area Networks - Amendment 4: Provider Bridges", P802.1ad/D6.0, August 17, 2005. 7.2. Informative References none Author's Addresses Mingui Zhang Huawei Technologies Co.,Ltd HuaWei Building, No.3 Xinxi Rd., Shang-Di Information Industry Base, Hai-Dian District, Beijing, 100085 P.R. China Email: zhangmingui@huawei.com Mingui Zhang Expires December 30, 2011 [Page 12]