Mbone Deployment Working Group Hugh LaMaster
INTERNET-DRAFT Steve Shultz
Category: Informational NASA ARC/NREN
draft-lamaster-mix-00.txt John Meylor
Operations and Management Area David Meyer
Internet Engineering Task Force Cisco Systems
12 November 1998
Expires May 1999
Multicast-Friendly Internet Exchange (MIX)
<draft-lamaster-mix-00.txt>
Status of this Memo
This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as ``work in progress.''
To learn the current status of any Internet-Draft, please check the
``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).
Abstract
This document describes an architecture for a Multicast-friendly
Internet eXchange (MIX), and the actual implementation at the NASA
Ames Research Center Federal Internet eXchange (FIX-West, or FIX).
The MIX has three objectives: native IP multicast routing, scalable
interdomain policy-based route exchange, and support for a variety of
LaMaster, et al. [Page 1]
<draft-lamaster-mix-01.txt> November 1998
IGP protocols and topologies for intra-domain use. In support of
these objectives, the MIX architecture defines the following
components: a peer-peer routing protocol, a method for multicast
forwarding, a method for exchanging information about active sources,
and a medium which provides native multicast. This document
describes the protocols and configurations necessary to provide a
current, working multicast-friendly internet exchange, or MIX.
This memo is a product of the MBONE Deployment Working Group (MBONED)
in the Operations and Management Area of the Internet Engineering
Task Force. Submit comments to <mboned@ns.uoregon.edu> or the
authors.
Copyright Notice
Copyright (C) The Internet Society (1998). All Rights Reserved.
Acknowledgments
Thanks to the NASA HPCC program for supporting the NREN staff portion
of this project; thanks to William P. Jones of the NASA ARC Gateway
Facility for making the gateway facility available for housing this
project.
1. Introduction
The MIX objective was to use current technology to implement a
scalable, high-performance, efficient, native IP multicast
architecture.
Past experience at ARC, NASA WANs, and at FIX-West, had shown that
mrouted/DVMRP "Mbone" tunnels were an inefficient means of routing
multicast through an exchange point. Specifically, at FIX-West, the
large number of tunnels often resulted in unicast traffic loads on
the FIX FDDI that were 10 times the underlying multicast load. In
addition, some WANs had multiple tunnels criss-crossing the same
physical links, resulting in wasted WAN bandwidth. And, the separate
workstation and router infrastructure for the "Mbone" tunnels created
numerous problems. Maintenance of Unix system and tunnel
configurations was often ad hoc, because some of the network
operators lacked the necessary expertise. And the hardware and
software configuration and performance of the tunnel infrastructure
was often out of step with the underlying router-based unicast
structure. In addition, use of a single, shared, distance-vector IGP
in the inter-domain space led to instability.
Therefore, it was desired to implement a new multicast internet
exchange from the ground up, using current technology, and
significantly improving performance, efficiency, and reliability.
Four elements were identified as being necessary for the MIX
architecture in order to meet the objectives. These were to define a
peer-peer routing protocol, a method for multicast forwarding, a
method for exchanging information about distant sources and groups,
and a non-switched broadcast medium.
NASA Ames Research Center hosts the Federal Internet eXchange
(FIX-West, or "the FIX") as well as the Ames Internet eXchange (AIX),
which is connected at high speed to MAE-West and shares the same
address space as MAE-West. These facilities
are co-located at the Ames Telecommunications Gateway Facility. It
was felt that this would be an excellent location to test the
viability of the native multicast technologies. The Multicast-
friendly Internet eXchange (MIX) is co-located adjacent to the FIX
for easy access from the existing FIX routers.
Choices were made for each element, and the MIX was implemented
adjacent to the existing NASA ARC FIX gateway facility. At the time
of writing, there are eight direct participants in the MIX, peering
and exchanging routes and multicast traffic natively, and the
performance and reliability have already far exceeded the tunneled
infrastructure the MIX replaced.
2. Requirements and Technology
In order to meet the objectives for this multicast exchange, all
peering partners had to agree mutually to standardize on the
following four elements. These are:
- the protocol to be used for multicast route exchange
- the method for performing multicast forwarding
- the method for identifying active sources
- the physical medium for the multicast exchange
The elements chosen to implement the MIX were BGP4+ (also known as
"MBGP") for routing and route exchange [BGP4+], PIM-DM and PIM-SM for
multicast forwarding on the exchange, dense-mode flooding and the
MSDP protocol for information on sources and groups, and FDDI for
the multicast medium.
2.1 Routing
Two of the objectives of the MIX were to provide an EGP for scalable
interdomain policy-based route exchange, and to allow a variety of
IGP protocols and topologies for intra-domain use. As with unicast
interdomain routing, BGP could be used as the EGP to exchange routes
for multicast. However, the unicast and multicast routing paths and
policies would have to be completely congruent. In practice, this is
sometimes not the case. It is possible, however, to take advantage
of the extensions in BGP4+ to deal with these policy and path
incongruencies.
BGP4+ [BGP4+] describes extensions to (unicast) BGP that allow use of
the existing BGP machinery to provide the necessary scalability,
policy control, and route stability features and mechanisms to be
applied to both unicast and multicast routes consistently.
BGP4+ allows routes to be marked "unicast forwarding", "multicast
forwarding", or "both unicast and multicast forwarding". In this
way, BGP4+ supports different multicast and unicast forwarding paths
and policies. This removes the dependency on unicast-only routing.
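The route marking described above can be pictured with a small sketch.
It is an illustration only, not router code: the Rib class, the prefixes,
and the peer names are hypothetical. The three marking values are the
Subsequent Address Family Identifier (SAFI) values defined in [BGP4+].

```python
from dataclasses import dataclass, field

# SAFI values defined for Multiprotocol BGP in RFC 2283.
SAFI_UNICAST = 1      # route usable for unicast forwarding
SAFI_MULTICAST = 2    # route usable for multicast forwarding
SAFI_BOTH = 3         # route usable for both


@dataclass
class Rib:
    """Toy routing table that keeps unicast and multicast next hops apart."""
    unicast: dict = field(default_factory=dict)
    multicast: dict = field(default_factory=dict)

    def install(self, prefix: str, next_hop: str, safi: int) -> None:
        # A route lands in one table, the other, or both, depending on its marking.
        if safi in (SAFI_UNICAST, SAFI_BOTH):
            self.unicast[prefix] = next_hop
        if safi in (SAFI_MULTICAST, SAFI_BOTH):
            self.multicast[prefix] = next_hop


rib = Rib()
# Hypothetical prefixes and peers, for illustration only.
rib.install("192.0.2.0/24", "fix-peer-a", SAFI_UNICAST)
rib.install("192.0.2.0/24", "mix-peer-a", SAFI_MULTICAST)
rib.install("198.51.100.0/24", "peer-b", SAFI_BOTH)
```

In this sketch the same prefix resolves to different neighbors for
unicast and for multicast, which is the incongruent case the text
describes.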
The ability of BGP4+ to support separate paths and policies for
multicast is important for meeting the objectives of the exchange in
various ways. It allows for a participant's multicast routing policy
to be independent of its established unicast routing policy. This is
important in order that the exchange can support providers migrating
to BGP4+ as an IDMR. This is because it allows for the exchange of
routes previously exchanged via DVMRP, even though those routes would
not meet the existing unicast routing policy. It allows for
different policy in the interim. For example, routes may be
exchanged for BGP4+ multicast forwarding even though they would not
be permitted under existing unicast routing policy. BGP4+ also
provides for the possibility that even after full migration is
complete, a separate multicast routing policy can be applied.
The exchange architecture imposes no requirements on the IGP or the
multicast forwarding protocol or topology used internal to an AS.
2.2 Multicast Forwarding
The first requirement for the multicast forwarding protocol is that
it be able to use routes exchanged via BGP4+. For this reason, PIM
was selected. For the MIX, PIM-Dense-Mode (PIM-DM) was selected
initially for the mutually agreed upon multicast forwarding process.
By flooding data using PIM-DM, it was possible to provide information
about active sources to PIM-SM RP's co-located on the MIX. Migration
to PIM-Sparse-Mode (PIM-SM) with MSDP is underway.
The use of PIM on a shared LAN has certain consequences. It is
necessary for all MIX participants to agree on certain configuration
conventions affecting PIM forwarding on multi-access LANs. In
particular, it is necessary to establish a standard protocol "metric
preference" (also known as "distance" or process "precedence") to be
used by all peers for the PIM Assert process, because the PIM Assert
process [PIM-SM] uses the "metric preference" [PIM-SM] as a mechanism
by which the multicast forwarder is chosen. If all parties are not
following the convention, there may be black holes, in which a route
appears to be valid, but traffic does not flow, or, there may be
multicast loops, which can have deleterious consequences.
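A minimal sketch of the forwarder election may help show why the
convention matters. It assumes the comparison order given in [PIM-SM]:
lowest metric preference, then lowest metric, then highest IP address.
The AssertCandidate class, router names, addresses, and numeric values
are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class AssertCandidate:
    router: str
    address: IPv4Address      # interface address on the shared LAN
    metric_preference: int    # also called "distance"; lower is preferred
    metric: int               # route metric toward the source; lower is preferred


def assert_winner(candidates):
    """Lowest metric preference wins, then lowest metric, then highest address."""
    return min(
        candidates,
        key=lambda c: (c.metric_preference, c.metric, -int(c.address)),
    )


# Hypothetical routers and values, for illustration only.
a = AssertCandidate("A", IPv4Address("192.0.2.1"), 20, 5)
b = AssertCandidate("B", IPv4Address("192.0.2.2"), 20, 5)
c = AssertCandidate("C", IPv4Address("192.0.2.3"), 0, 30)

print(assert_winner([a, b]).router)     # B: tie broken by the higher address
print(assert_winner([a, b, c]).router)  # C: wins on metric preference alone
```

Router C wins the second election only because of its low metric
preference. If C did not in fact have a working path to the source, the
result would be the kind of black hole described above.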
For the MIX, a standard set of metric preferences is applied to the
BGP4+ routes as the convention for the PIM forwarding mechanism.
2.3 Active Sources
There are two current methods for distributing information about
active sources to participating AS's. The AS's may be dense-mode
regions, or, they may contain PIM-SM RP's. One method is to use
dense-mode to flood data packets to dense-mode regions and to
sparse-mode RPs co-located on the exchange. The second method is to
use a protocol that allows each AS to share information about the
sources contained within it.
For the MIX, it was decided to use dense-mode, and, all participating
sparse-mode peers would co-locate their RP's on the router directly-
connected to the MIX.
Dense-mode, including PIM-DM, and (mrouted-based) DVMRP, uses data
flooding to propagate information about active source-group or <S,G>
pairs throughout the global multicast routing world. Unwanted sources
are pruned back, and are periodically re-flooded in order to fully
refresh forwarding state in mrouters. This is a simple and very
reliable method of propagating information on source-group pairs, but
the effectiveness of dense-mode depends upon reliable pruning, and
flooding traffic to propagate <S,G> information over WANs does not
scale well.
Recently, a new protocol, MSDP [MSDP] has been proposed that, when
combined with PIM-SM, will allow independent AS's to share
information about distant sources and groups without flooding.
Instead of flooding all data, only <S,G> information is flooded, and
then, only to systems, such as PIM-SM RP's, which require the
information. MSDP allows each AS to choose its own mode, sparse or
dense, and also to run its own sparse-mode region independent of all
other sparse-mode regions.
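The idea of flooding only <S,G> information between RPs can be sketched
as follows. This is not the MSDP message format or state machine: the
function name, RP names, and addresses are hypothetical, and a simple
"already cached" check stands in for whatever loop-prevention rules the
MSDP specification defines.

```python
from collections import deque


def flood_source_active(peerings, origin_rp, source, group):
    """Flood one (S,G) announcement across RP peerings; return each RP's cache."""
    caches = {rp: set() for rp in peerings}
    caches[origin_rp].add((source, group))
    queue = deque([origin_rp])
    while queue:
        rp = queue.popleft()
        for peer in peerings[rp]:
            # Forward only to peers that have not yet learned this (S,G).
            if (source, group) not in caches[peer]:
                caches[peer].add((source, group))
                queue.append(peer)
    return caches


# Hypothetical RPs in three AS's, peered in a chain.
peerings = {
    "rp-as1": ["rp-as2"],
    "rp-as2": ["rp-as1", "rp-as3"],
    "rp-as3": ["rp-as2"],
}
caches = flood_source_active(peerings, "rp-as1", "192.0.2.10", "224.2.0.1")
```

Every RP ends up knowing that the source is active for the group, yet
no data packet has crossed an AS boundary. Data flows only after an RP
with interested receivers joins toward the source.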
MSDP has now been deployed on many of the MIX routers, and some MIX-
connected AS's are now running sparse-mode internally. This
deployment is ongoing, and is not yet complete.
2.4 Medium
The objective for the MIX medium was to provide support for native
multicast among multiple peering partners.
There exist a number of unresolved issues regarding use of layer-2
switched media at interexchange points, and, until these issues are
resolved, running native multicast on such media is problematic.
Fortunately, BGP4+ permits unicast and multicast to be carried on
different media, permitting a multicast medium to be used
independently of the unicast medium.
A FDDI concentrator was selected to provide the native multicast
exchange medium. It was router-efficient, because it permitted the
medium to do the multicast packet replication, with a single copy
from a router being replicated to all neighbors. Using a simple
broadcast medium eliminates the complexity of using a switch for
multicast. And FDDI was considered operationally convenient by most
of the participants. Unicast traffic continues to be routed over the
existing unicast exchange media.
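The router-efficiency argument reduces to simple arithmetic, sketched
below. The function name is hypothetical, and the non-replicating case
is an assumed model in which a router must send one copy per neighbor,
as it would over a mesh of point-to-point tunnels.

```python
def router_copies(neighbors: int, medium_replicates: bool) -> int:
    """Copies of one multicast packet a router transmits to reach all neighbors."""
    if neighbors <= 0:
        return 0
    return 1 if medium_replicates else neighbors


peers = 8              # participants reported on the MIX
neighbors = peers - 1  # every peer other than the sender

print(router_copies(neighbors, medium_replicates=True))   # 1
print(router_copies(neighbors, medium_replicates=False))  # 7
```

On the broadcast medium the per-packet cost to the router stays at one
copy no matter how many peers join the exchange.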
3. The NASA Ames Research Center Multicast-Friendly Internet
Exchange
The Ames Multicast-friendly Internet eXchange, or MIX, began with the
first beta-test trials in March 1998, and became operational,
exchanging BGP4+ routes externally and using BGP4+ between multiple
AS's, in May 1998. NREN implemented BGP4+ and internal BGP4+ and
began trial external peerings in the same time frame, evolving from
the first trials, to full deployment by October. As of October 1998,
there were 8 AS's peering using BGP4+ and actively exchanging
multicast on the MIX FDDI. One of the AS's, AS10888, represents a
multi-router virtual BGP4+ backbone, and a router within AS10888 has
been located on the MIX by NREN, as a gateway router. The physical
and logical topologies are as follows:
                AS10888---R----"MBone"
                          |
               MIX        |     multicast_exchange
                   ----------------
                  /                \
                 /                  \
   bgp4+_peer---R                    R---bgp4+_peer
                 \                  /
                  \                /
                   ---------------
               FIX     unicast_exchanges
AS10888 acts as a transit AS to connect other multicast-friendly
exchanges to the NASA ARC MIX. It also acts as a gateway between
the DVMRP-based "Mbone" and the BGP4+ area.
4. Topology, Architecture, and Special Considerations
BGP4+
-PIM Asserts and Metric preference
The PIM Assert mechanism requires that all routing protocols
"compete" to see which router is allowed to forward onto the
shared medium. To first order, the protocol metric preference
is used to determine the forwarder. All MIX peers must coordinate
routing protocol parameters so that one router does not inadvertently
win PIM asserts over a neighbor which has a functional path.
This requires that BGP4+ routes have preference over other
routes, such as BGP, OSPF, and DVMRP. In particular,
it was necessary to standardize protocol metric preferences,
and give BGP4+ routes the lowest, preferred, dynamic routing
protocol metric preferences. For this reason, the standard
set of BGP4+ metric preferences was chosen to be less than any
other dynamic unicast routing protocol metric preferences.
Any MIX routers which are using DVMRP must use a DVMRP metric
preference higher than the BGP4+ metric preferences, rather than
what many people have used previously as the DVMRP metric preference,
of 0.
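A peer could check its own configuration against this convention with
something like the sketch below. The function name and all numeric
preference values are hypothetical; this document does not state the
actual values agreed for the MIX.

```python
def check_preferences(prefs: dict) -> list:
    """Return violations of the rule that BGP4+ has the lowest preference."""
    problems = []
    bgp4plus = prefs["bgp4+"]
    for protocol, value in prefs.items():
        # Every other dynamic protocol must sit strictly above BGP4+.
        if protocol != "bgp4+" and value <= bgp4plus:
            problems.append(
                f"{protocol} preference {value} is not higher than BGP4+ ({bgp4plus})"
            )
    return problems


# Hypothetical per-router settings, for illustration only.
good = {"bgp4+": 20, "bgp": 30, "ospf": 110, "dvmrp": 120}
bad = {"bgp4+": 20, "ospf": 110, "dvmrp": 0}

print(check_preferences(good))  # []
print(check_preferences(bad))
```

The second configuration is flagged because its DVMRP preference of 0
would let DVMRP routes win PIM asserts over BGP4+ routes held by a
neighbor.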
-Default
One transitional requirement is the necessity to have routes
to "Mbone" sources, that is, sources within the global DVMRP
routing region. Currently, the mechanism used is to have
a single router in AS10888 on the MIX originate MBGP default
to all external peers.
DVMRP routing
-DVMRP route redistribution
At present, all BGP4+ routes tagged with a particular community
are redistributed at the MIX into DVMRP within AS10888. This is
to provide DVMRP region users access to sources originating
within AS's that are being routed via BGP4+ exclusively.
Unless a particular community string is set, it is
assumed that redistribution is not desired. In the reverse
direction, instead of sending DVMRP routes into BGP4+,
BGP4+ default is originated from the intermediary router.
In addition, local, stub-region DVMRP routes are redistributed
into BGP4+ internally by several of the peers. As long as the
regions remain stub regions, there is no danger, but, the
possibility of a backdoor into the Mbone presents an ever-present
threat of loops unless care is taken to redistribute
only the routes which are known to be owned within the AS.
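The two redistribution rules above can be sketched as a pair of filters.
This is an illustration under stated assumptions, not router
configuration: the community value, the Route class, the function names,
and the prefixes are all hypothetical.

```python
from dataclasses import dataclass

REDISTRIBUTE_COMMUNITY = "10888:100"   # hypothetical community value


@dataclass(frozen=True)
class Route:
    prefix: str
    communities: frozenset = frozenset()


def bgp4plus_to_dvmrp(routes):
    """BGP4+ routes to redistribute into DVMRP: only those carrying the tag."""
    return [r for r in routes if REDISTRIBUTE_COMMUNITY in r.communities]


def dvmrp_to_bgp4plus(dvmrp_prefixes, owned_prefixes):
    """Stub-region DVMRP prefixes to redistribute: only those the AS owns."""
    return [p for p in dvmrp_prefixes if p in owned_prefixes]


# Hypothetical routes, for illustration only.
bgp_routes = [
    Route("192.0.2.0/24", frozenset({REDISTRIBUTE_COMMUNITY})),
    Route("198.51.100.0/24"),  # untagged: redistribution not desired
]
tagged = bgp4plus_to_dvmrp(bgp_routes)

owned = {"203.0.113.0/24"}
learned = ["203.0.113.0/24", "192.0.2.0/24"]  # second one leaked in via a backdoor
safe = dvmrp_to_bgp4plus(learned, owned)
```

Filtering on an explicit list of owned prefixes is one way to keep a
backdoor route from being re-announced and forming a loop.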
5. Conclusions and Recommendations
-Provide support for native multicast
-Use BGP4+ as a method of exchanging routes for
inter-domain multicast
-Use PIM-DM, or PIM-SM with MSDP
-Concurrent use of BGP4+ and DVMRP for inter-domain
routing is not recommended. It is strongly
recommended to use BGP4+ for inter-domain route exchange.
6. Security Considerations
There are no security considerations unique to the multicast exchange.
7. References
[DVMRP] T. Pusateri, "Distance Vector Multicast Routing
Protocol", <draft-ietf-idmr-dvmrp-v3-07.txt>,
August 1998.
[BGP4+] T. Bates, R. Chandra, D. Katz, Y. Rekhter,
"Multiprotocol Extensions for BGP-4", RFC 2283,
February 1998.
[BGP4+2] T. Bates, R. Chandra, D. Katz, Y. Rekhter,
"Multiprotocol Extensions for BGP-4", Internet Draft,
<draft-ietf-idr-bgp4-multiprotocol-v2-01.txt>,
August 1998.
[PIM-SM] D. Estrin, D. Farinacci, A. Helmy, D. Thaler, S. Deering,
M. Handley, V. Jacobson, C. Liu, P. Sharma, L. Wei,
"Protocol Independent Multicast-Sparse Mode (PIM-SM):
Protocol Specification", RFC 2362, June 1998.
[PIM-DM] S. Deering, D. Estrin, D. Farinacci, V. Jacobson, A. Helmy,
D. Meyer, L. Wei, "Protocol Independent Multicast
Version 2 Dense Mode Specification", Internet Draft,
<draft-ietf-pim-v2-dm-01.txt>, November 1998.
[MSDP] D. Farinacci, Y. Rekhter, P. Lothberg, H. Kilmer, J. Hall,
"Multicast Source Discovery Protocol (MSDP)",
<draft-farinacci-msdp-00.txt>, June 1998.
Authors' Addresses
Hugh LaMaster
Steve Shultz
NASA Ames Research Center
Mail Stop 233-21
Moffett Field, CA 94035-1000
email: hlamaster@arc.nasa.gov
shultz@arc.nasa.gov
David Meyer
John Meylor
Cisco Systems
San Jose, CA
email: dmm@cisco.com
jmeylor@cisco.com
8. Full Copyright Statement
Copyright (C) The Internet Society (1998). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it or
assist in its implementation may be prepared, copied, published and
distributed, in whole or in part, without restriction of any kind,
provided that the above copyright notice and this paragraph are included
on all such copies and derivative works. However, this document itself
may not be modified in any way, such as by removing the copyright notice
or references to the Internet Society or other Internet organizations,
except as needed for the purpose of developing Internet standards in
which case the procedures for copyrights defined in the Internet
Standards process must be followed, or as required to translate it into
languages other than English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an "AS
IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK
FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT
INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR
FITNESS FOR A PARTICULAR PURPOSE.
Table of Contents
1 Introduction
2 Requirements and Technology
3 The NASA Ames MIX
4 Topology, Architecture, and Special Considerations
5 Conclusions and Recommendations
6 Security Considerations
7 References
8 Full Copyright Statement