Internet DRAFT - draft-xu-l2vpn-vpls-isis
draft-xu-l2vpn-vpls-isis
Network working group X. Xu
Internet Draft Huawei Technologies
Category: Standard Track H. Shah
Ciena Corp
Expires: September 2012 March 9, 2012
Virtual Private LAN Service (VPLS) Using IS-IS
draft-xu-l2vpn-vpls-isis-03
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with
the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on September 9, 2012.
Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document.
Xu & Shah Expires September 9, 2012 [Page 1]
Internet-Draft VPLS Using IS-IS March 2012
Abstract
This document describes a light-weight Virtual Private LAN Service
(VPLS), referred to as IS-IS VPLS, which uses IS-IS for auto-
discovery and signaling. IS-IS VPLS is intended to be used as a
scalable cloud data center network solution.
Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC-2119 [RFC2119].
Table of Contents
1. Introduction ................................................ 3
2. Terminology ................................................. 5
3. Control Plane ............................................... 5
3.1. VPLS Info TLV .......................................... 6
3.2. Auto-discovery ......................................... 7
3.3. Signaling .............................................. 7
4. Data Plane .................................................. 7
4.1. Data Encapsulation and Forwarding ...................... 7
4.1.1. Unicast ........................................... 8
4.1.2. Multicast/Broadcast ............................... 8
4.2. MAC Address Learning ................................... 9
4.2.1. Data-plane based MAC Learning ..................... 9
4.2.2. Control-plane based MAC Learning .................. 9
5. ARP Broadcast Reduction .................................... 10
6. Security Considerations .................................... 10
7. IANA Considerations ........................................ 10
8. Acknowledgements ........................................... 10
9. References ................................................. 10
9.1. Normative References .................................. 10
9.2. Informative References ................................ 10
Authors' Addresses ............................................ 11
Xu & Shah Expires September 9, 2012 [Page 2]
Internet-Draft VPLS Using IS-IS March 2012
1. Introduction
For leveraging the economics of scale, today's cloud data centers
tend to contain tens to hundreds of thousands of servers. Moreover,
distributed computing and server virtualization are increasingly
adopted in today's cloud data centers. These factors lead to
significant scaling, performance, and operational challenges for
cloud data center networks. Therefore, to design a scalable and
sustainable cloud data center network solution, the following
requirements must be taken into account:
1) LAN Extension
To achieve service agility and business continuance, Virtual
Machine (VM) migration and High-Available (HA) cluster are
commonly used in today's cloud data centers. These two
applications introduce special requirements on data center
networks. For instance, to allow a VM to be freely migrated from
one server to another within data centers while retaining its IP
address and MAC address, the LAN where the VM is located needs to
be extend across multiple racks or pods within data centers. In
addition, as some HA cluster applications rely on link-local
multicast for cluster member discovery and heartbeat, cluster
member servers are usually required to reside within the same
Layer2 domain. As a result, LAN extension becomes a fundamental
requirement for cloud data center networks.
2) VPN Space Scalability
In modern cloud data centers, tens of thousands of tenants could
be hosted over a shared network infrastructure. For security and
performance isolation considerations, these tenants SHOULD be
isolated from one another. Hence, cloud data center networks
SHOULD provide a large enough VPN space for tenant isolation.
3) Forwarding Table Scalability
In a highly virtualized cloud data center environment, it's not
uncommon that millions of VMs are contained over a common network
infrastructure. Therefore, the forwarding table of each network
device within cloud data center networks SHOULD be scalable
enough so as to keep up with that scale of VMs.
4) Bandwidth Utilization Maximization
Xu & Shah Expires September 9, 2012 [Page 3]
Internet-Draft VPLS Using IS-IS March 2012
In modern cloud data centers, distributed computing is driving the
server-to-sever traffic to become the dominating traffic compared
to the client-to-server traffic. To meet the growing capacity
demands for server-to-server connectivity, shortest path
forwarding and Equal Cost Multi-Path (ECMP) have already been the
basic capabilities of cloud data center networks.
5) ARP/Unknown Unicast Flood Suppression
It's well-known that the flooding of ARP broadcast and unknown
unicast traffic within large Layer2 networks will lead to
performance impact on both networks and hosts. As the Layer2
domain is extended across multiple racks or pods within data
centers, the above problem will become even worse. As such, how
to suppress the flooding of ARP broadcast and unknown unicast
traffic within cloud data centers becomes increasingly desirable.
6) Flexibility for Tradeoffs between Bandwidth and State
It's possible that there would be a great difference between
tenants within a cloud data center, no matter in the number of VMs
or in the volume of multicast/broadcast traffic. As such, there is
no "one size fits all" VPN multicast/broadcast delivery procedure
for these tenants. In other words, cloud data center networks
SHOULD support both the ingress replication procedure and the
multicast tree procedure for delivering VPN multicast/broadcast
traffic, so as to allow for an effective tradeoff between
bandwidth usage and state maintenance on a per tenant basis
according to the particular conditions of each tenant.
7) Simplified Provisioning and Operation
It's not surprising that a single cloud data center has thousands
of physical switches (e.g., ToR switches). However, a network of
such scale usually implies a big challenge for data center
operators. Therefore, how to simplify and even automate network
provisioning and operation becomes significantly important for
cloud data center networks.
8) Reuse Existing Operation Experiences
IP, as a proven routing technology, has already been used in most
today's data centers. Furthermore, those service providers who
are planning to build cloud data centers have years of
experiences in operating MPLS/IP VPN networks. To allow data
center operators to reuse their existing network operation
experiences, cloud data center network solutions SHOULD reuse
Xu & Shah Expires September 9, 2012 [Page 4]
Internet-Draft VPLS Using IS-IS March 2012
existing technologies and protocols where appropriate, rather
than reinventing the wheel. That's why there are increasing
interests from the industrial community on how to adapt the
existing L2VPN technologies for cloud data center networks.
Although the existing VPLS solutions [RFC4761][RFC4762] can meet
most of the above requirements, there are still spaces for
improvement. For instance, the existing VPLS solutions require
establishing full-mesh of pseudo-wires (PWs) between PE routers,
which implies a significant scaling challenge in the cloud data
center environment, especially when imagining configuring thousands
of ToR switches as PE routers and provisioning tens of thousands of
VPLS instances on them; secondly, the ingress replication procedure
used for delivering multicast and broadcast traffic in existing VPLS
solutions is not optimal from the bandwidth utilization perspective;
thirdly, existing VPLS solutions require running one or more
separate protocols besides IGP within data centers for VPLS protocol
(e.g., LDP and/or BGP), which results in a certain complexity in
network management and operation, especially when considering
configuring BGP and VPLS parameters on thousands of ToR switches and
configuring thousands of BGP peers on aggregation or core switches.
Hence, this document describes a light-weight VPLS solution,
referred to as IS-IS VPLS, which uses IS-IS [IS-IS][RFC1195] for
VPLS protocol. IS-IS VPLS retains almost all benefits of existing
VPLS solutions while overcoming their shortages as mentioned above.
For example, there is no need for full-mesh PWs between PE routers
in IS-IS VPLS. Furthermore, it allows data center operators to
flexibly make tradeoffs between bandwidth and state on a per tenant
basis. Finally, an already deployed IGP within cloud data centers
(i.e., IS-IS), rather than a dedicated protocol(s), is used for VPLS
protocol.
2. Terminology
This memo makes use of the terms defined in [RFC4664], [VPLS-MCAST],
[RFC4761] and [RFC4762].
3. Control Plane
There are two primary functions of the VPLS control plane: auto-
discovery and signaling. In IS-IS VPLS, these two functions are
accomplished by using a single extended IS-IS TLV, referred to as
VPLS Info TLV (see section 3.1). By propagating such VPLS Info TLVs
that contain VPLS information within data center networks, PE
routers automatically discover which other PE routers are part of a
given VPLS instance and their assigned VPLS labels for that VPLS
Xu & Shah Expires September 9, 2012 [Page 5]
Internet-Draft VPLS Using IS-IS March 2012
instance. Furthermore, P routers don't need processing the VPLS
information contained in that IS-IS TLV, but instead synchronizing
the Link State PDUs (LSP) with their adjacent IS-IS neighbors as
normal.
3.1. VPLS Info TLV
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+
|Type=VPLS Info |
+-+-+-+-+-+-+-+-+
| Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
| PE's IPv4 or IPv6 Address |
| (128 bits) |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Resv (12 bits) | VPLS ID (20 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Resv (12 bits) | VPLS Label (20 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Resv (12 bits) | VPLS ID (20 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Resv (12 bits) | VPLS Label (20 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type
Type code for VPLS Info TLV: TBD.
Length
Total number of bytes contained in the value field.
PE's IPv4 or IPv6 Address
This 128-bit field is filled with one of the originating PE
router's IPv4 or IPv6 addresses which are reachable across
the IP backbone. The address filled in this field SHOULD be
used as a tunnel destination address by remote PE routers
when sending a customer packet to such PE router. If the IP
Xu & Shah Expires September 9, 2012 [Page 6]
Internet-Draft VPLS Using IS-IS March 2012
address is IPv4, the last four octets of this field are
filled with the IPv4 address while the remaining part is set
to zero. In other words, it is filled with an IPv4-mapped
IPv6 address.
VPLS ID
This field is filled with a 20-bit globally unique VPLS ID
for a particular attached VPLS instance. In case that a
larger VPLS ID space is required, the leftmost 12-bit
reserved field could be used together with the VPLS ID field
as an extended VPLS ID field. That is to say, the whole 32
bits are filled with a 32-bit long extended VPLS ID value.
VPLS Label
This field is filled with a 20-bit MPLS label corresponding
to the VPLS instance which is identified by the VPLS ID.
3.2. Auto-discovery
In IS-IS VPLS, each PE router could automatically discover which
other PE routers are part of a given VPLS instance that is
identified by the globally unique VPLS ID. This allows each PE
router to be configured with only the identities of the attached
VPLS instances, but not identities of all the other PE routers
belonging to these VPLS instances.
3.3. Signaling
In IS-IS VPLS, a PE router would assign the same VPLS label for a
given VPLS instance to any other PE router. As such, this VPLS label
is only used for identifying a particular VPLS instance, rather than
identifying both a particular VPLS instance and the corresponding
ingress PE router as a PW label does.
4. Data Plane
4.1. Data Encapsulation and Forwarding
Since the VPLS label in IS-IS VPLS is only used for identifying a
particular VPLS instance, in the data-plane based MAC learning case
(see section 4.2.1), IP-based tunnel (e.g., GRE (Generic Routing
Encapsulation)) is RECOMMENDED to be used as the PE-to-PE tunnel
technology. As such, during the MAC learning process, egress PE
router could easily determine the ingress PE router of the received
VPLS packet from the tunnel source address of that packet. Note that,
Xu & Shah Expires September 9, 2012 [Page 7]
Internet-Draft VPLS Using IS-IS March 2012
in the control-plane based MAC learning case (see section 4.2.2),
there is no special requirement for PE-to-PE tunnel technology in
comparison with existing VPLS solutions. The following sub-sections
are based on an assumption that IP-based tunnels are used between PE
routers.
4.1.1. Unicast
For known unicast, MAC-in-MPLS-in-IP encapsulation [RFC4448][RFC4023]
is used. For unknown unicast, the encapsulation and forwarding
procedures are the same as that for multicast/broadcast described in
the following section.
4.1.2. Multicast/Broadcast
There are two major modes for delivering multicast and broadcast in
IS-IS VPLS: ingress replication mode and P-Multicast tree mode. P-
Multicast tree mode further includes two sub-options: non-
aggregative P-Multicast tree mode where one P-Multicast distribution
tree in the IP backbone is exclusively used by a single VPLS
instance, and aggregative P-Multicast tree mode in which one P-
Multicast tree is shared by more than one VPLS instance. The
corresponding encapsulation for each mode is described in the
following sub-sections.
4.1.2.1. Ingress Replication Mode
In the ingress replication mode, an ingress PE router forward the
received customer multicast/broadcast frames towards remote PE
routers in separate tunnels. Hence, the encapsulation in this mode
has no difference from that for unicast.
4.1.2.2. Non-aggregative P-Multicast Tree Mode
In the non-aggregative P-Multicast tree mode, MAC-in-IP
encapsulation is used directly since the destination IP address
(i.e., multicast address) contained in the IP-based tunnel header is
enough for egress PE routers to determine which VPLS instance the
received VPLS packet belongs to.
4.1.2.3. Aggregative P-Multicast Tree Mode
For the aggregative P-Multicast tree mode, MAC-in-MPLS-in-IP
encapsulation SHOULD be used. Furthermore, the MPLS label here
SHOULD be treated as an upstream-assigned label. For example, assume
a PE router has assigned a local label L for a given VPLS instance
and advertised that VPLS information using the VPLS Info TLV before,
Xu & Shah Expires September 9, 2012 [Page 8]
Internet-Draft VPLS Using IS-IS March 2012
when this PE router wants to send a multicast VPLS packet of that
VPLS instance through the corresponding aggregative P-Multicast tree,
label L as an upstream-assigned label will be contained in that VPLS
packet.
4.2. MAC Address Learning
MAC addresses of local CE hosts would still be learnt by PE routers
as normal bridges.
As for learning MAC addresses of remote CE hosts, IS-IS VPLS
provides two options: data-plane based MAC learning and control-
plane based MAC learning. If unknown unicast flood suppression is
required even at the cost of consuming more forwarding table
resources, the control-plane based MAC learning option could be
considered. Otherwise, the data-plane based MAC learning option
SHOULD be preferred.
4.2.1. Data-plane based MAC Learning
Upon receiving an VPLS packet from a remote PE router, the MPLS
label contained in the packet (or the tunnel destination IP address
in the non-aggregative P-Multicast tree case) is used to determine
the particular VPLS instance that the packet belongs to, while the
tunnel source IP address is used to tell from which ingress PE
router the packet was sent.
4.2.2. Control-plane based MAC Learning
In IS-IS VPLS, MAC addresses of remote CE hosts can also be learnt
on the control plane by using the MAC-Reachability TLV defined in
[RFC6165].
Upon learning the MAC addresses of their local CE hosts, PE routers
would immediately advertise these MAC addresses to other PE routers
of the same VPN instance by using the MAC-Reachability TLV defined
in [RFC6165]. One or more MAC-Reachability TLVs are carried in a LSP
which in turn is encapsulated with an Ethernet header. The source
MAC address is the originating PE router's MAC address whereas the
destination MAC address is a to-be-defined multicast MAC address
specifically identifying IS-IS VPLS PE routers. Such LSPs are
forwarded towards remote PE routers as customer packets by ingress
PE routers. Egress PE routers receiving the above packets SHOULD
intercept them and accordingly process them. IP address of the PE
router originating these MAC routes could be derived either from the
"IP Interface Address" field contained in the corresponding LSPs
(Note that the IP address here SHOULD be identical with that
Xu & Shah Expires September 9, 2012 [Page 9]
Internet-Draft VPLS Using IS-IS March 2012
contained in the VPLS Info TLV) or from the tunnel source IP address
of the VPLS packet containing such MAC routes.
Since these LSPs are fully transparent to P routers, there is no
impact on the control plane of P routers. More details about the
control-plane based MAC learning procedure are for further study.
5. ARP Broadcast Reduction
To suppress ARP broadcast flood within a given VPLS instance, ARP
cache mechanism can be enabled on PE routers. For more details about
ARP cache mechanism, please refer to [ARP-Reduction].
6. Security Considerations
This document doesn't introduce additional security risk to IS-IS
and VPLS, nor does it provide any additional security feature for
IS-IS and VPLS.
7. IANA Considerations
The IS-IS TLV type code for VPLS Info TLV is required to be defined
by IANA.
8. Acknowledgements
Thanks to Tony Li, Peter Ashwood-Smith, Phil Bedard, Kris Price,
Shahram Davari, Adrian Farrel, Giles Heron and other people for
their valuable comments on this proposal.
9. References
9.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
9.2. Informative References
[IS-IS] ISO/IEC 10589, "Intermediate System to Intermediate System
Intra-Domain Routing Exchange Protocol for use in
Conjunction with the Protocol for Providing the
Connectionless-mode Network Service (ISO 8473)", 2005.
[RFC1195] Callon, R., "Use of OSI IS-IS for Routing in TCP/IP and
Dual Environments", 1990.
Xu & Shah Expires September 9, 2012 [Page 10]
Internet-Draft VPLS Using IS-IS March 2012
[RFC4448] Martini, L., Rosen, E., El-Aawar, N., and G. Heron,
"Encapsulation Methods for Transport of Ethernet over MPLS
Networks", RFC 4448, April 2006.
[RFC4023] Worster, T., Rekhter, Y., and E. Rosen, "Encapsulating
MPLS in IP or Generic Routing Encapsulation (GRE)", RFC
4023, March 2005.
[RFC6165] A. Banerjee., D. Ward, "Extensions to IS-IS for Layer-2
Systems", RFC6165, February 2011.
[VPLS-MCAST] R. Aggarwal., Y. Kamite., L. Fang, "Multicast in VPLS",
draft-ietf-l2vpn-vpls-mcast-08.txt (work in progress),
October 2010.
[ARP-Reduction] H. Shah., A. Ghanwani., and N. Bitar, "ARP Broadcast
Reduction for Large Data Centers",
draft-shah-armd-arp-reduction-02.txt (work in progress),
October 2011.
[RFC5331] R. Aggarwal, Y. Rekhter, E. Rosen, "MPLS Upstream Label
Assignment and Context-Specific Label Space", RFC 5331,
August 2008.
[RFC4664] Andersson, L. and Rosen, E. (Editors),"Framework for Layer
2 Virtual Private Networks (L2VPNs)", RFC 4664, Sept 2006.
[RFC4761] Kompella, K. and Y. Rekhter, "Virtual Private LAN Service
(VPLS) Using BGP for Auto-Discovery and Signaling", RFC
4761, January 2007.
Authors' Addresses
Xiaohu Xu
Huawei Technologies,
Beijing, China
Email: xuxiaohu@huawei.com
Himanshu Shah
Ciena Corp
Email: hshah@ciena.com
Xu & Shah Expires September 9, 2012 [Page 11]