INTERNET-DRAFT                                             H. Berkowitz
                                                               Geotrain
Expiration Date: May 1998                                 November 1997


             To Be Multihomed: Requirements & Definitions
                   draft-berkowitz-multirqmt-00.txt


1. Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.''

To learn the current status of any Internet-Draft, please check the ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).

2. Abstract

As organizations find their Internet connectivity increasingly critical to their mission, they seek ways of making that connectivity more robust. The term "multi-homing" often is used to describe means of fault-tolerant connection. Unfortunately, this term covers a variety of mechanisms, including naming/directory services, routing, and physical connectivity. This memorandum presents a systematic way to define the requirement for resilience, and a taxonomy for describing mechanisms to achieve it. Multiple mechanisms, including DNS, BGP, and others, may be appropriate for specific situations.

3. Introduction

As the Internet becomes more ubiquitous, more and more enterprises connect to it. Some of those enterprises, such as Web software vendors, have no effective business if their connectivity fails. Other enterprises do not have mission-critical Internet applications, but become so dependent on routine email, news, web, and similar access that a loss of connectivity becomes a crisis. As this Internet dependence becomes more critical, prudent management suggests there be no single point of failure that can break all Internet connectivity.

The term "multihoming" has come into vogue to describe various means of enterprise-to-service-provider connectivity that avoid a single point of failure. Multihoming also can describe connectivity between Internet Service Providers and "upstream" Network Service Providers.

There are other motivations for complex connectivity from enterprises to the Internet. Mergers and acquisitions, where the joined enterprises each had their own Internet access, often mean complex connectivity, at least for a transition period. Consolidation of separate divisional networks also creates this situation. A frequent case arises when a large enterprise decides that Internet access should be available corporate-wide, but its research labs have had Internet access for years -- and it works, as opposed to the new corporate connection that at best is untried.

Many discussions of multihoming focus on the details of implementation, using such techniques as the Border Gateway Protocol (BGP) [RFC number of the Applicability Statement], multiple DNS entries for a server, etc. This document suggests that it is wise to look systematically at the requirements before selecting a means of resilient connectivity. One implementation technique is not appropriate for all requirements.
There are special issues in implementing solutions in the general Internet, because poor implementations can jeopardize the proper function of global routing or DNS. An incorrect BGP route advertisement injected into the global routing system is a problem whether it originates in an ISP or in an enterprise.

4. Goals

Requirements tend to be driven by one or more of several major goals for server availability and performance. Availability goals are realized with resiliency mechanisms, which avoid user-perceived failures caused by single failures in servers, routing systems, or media. Performance goals are realized by mechanisms that distribute the workload among multiple machines such that the load is equalized.

Like multi-homing, the terms load-balancing and load-sharing have many definitions. Paul Ferguson defines load-balancing as "a true '50/50' sharing of equal paths. This can be done by either (a) round-robin per-packet transmission, (b) binding pipes at the lower layers such that bits are either 'bit-striped' across all parallel paths (like the etherchannel stuff), or binding pipes so that SAR functions are done in a method such as multilink PPP. These are fundamentally the same.

"Load-sharing is quite different. It simply implies that no link is sitting idle -- that at least all links get utilized in some fashion, usually in closest-exit routing. The equity of utilization may be massively skewed. It may also resemble something along the lines of 60/40, which is reasonable."

In defining requirements, the servers themselves may either share or balance the load, there may be load-sharing or load-balancing routing paths to them, or the routed traffic may be carried over load-shared or load-balanced media. The servers of interest may be inside the enterprise, or outside it.

In this document, intranet servers are inside the enterprise and intended primarily for enterprise use. Multinet servers are inside the enterprise, but there is pre-authorized access by external partners. Internet servers are operated by the enterprise but intended to be accessible to the general Internet. Intranet clients have access only to machines on the intranet. Internet clients have general Internet access that may be mediated by a firewall.

In the terminology of RFC1775, "To Be 'On' the Internet," servers described here have "full" access or a subset of client access. Client servers may not directly respond to a specific IP packet from an arbitrary host, but a system such as a firewall MUST respond for them unless a security policy precludes that. Some valid security policies, for example, suppress ICMP Destination Administratively Prohibited responses, because such a response would reveal that there is an information resource being protected.

RFC1775 defines full access as "a permanent (full-time) Internet attachment running TCP/IP, primarily appropriate for allowing the Internet community to access application servers, operated by Internet service providers. Machines with Full access are directly visible to others attached to the Internet, such as through the Internet Protocol's ICMP Echo (ping) facility. The core of the Internet comprises those machines with Full access."

This definition is extended here to allow firewalls or screening routers always to be present. If a proxy or address translation service exists between the real machine and the Internet, if this service is available on a full-time basis, and if it consistently responds to requests sent to a DNS name of the server, the server is considered to have full-time access.
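Purely as an illustrative sketch, and not part of the RFC1775 definition, the fragment below shows one way such an empirical probe might be made when ICMP Echo is administratively filtered: the server is treated as answering at its DNS name if anything -- the server itself, a proxy, or an address translator -- accepts a transport connection there. The host name and port used are hypothetical placeholders.

   # Illustrative sketch only; the host name and port are hypothetical.
   import socket

   def answers_at_name(name, port=80, timeout=5.0):
       """True if something accepts a TCP connection at the DNS name,
       even where ICMP Echo is filtered by policy."""
       try:
           # create_connection() resolves the name and tries each address
           # it returns until one accepts the connection.
           socket.create_connection((name, port), timeout=timeout).close()
           return True
       except OSError:
           return False

   # answers_at_name("www.example.com")  -> True or False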
In this discussion, we generalize the definition beyond machines primarily appropriate for the Internet community as a whole, to include in-house and authorized partner machines that use the Internet for connectivity.

RFC1775 also defines "client machines," on which the user runs applications that employ Internet application protocols directly on their own computer platform, but which might not be running the underlying Internet protocols (TCP/IP), might not have full-time access (such as through dial-up), or might have constrained access (such as through a firewall). When active, Client users might be visible to the general Internet, but such visibility cannot be predicted. For example, this means that most Client access users will not be detected during an empirical probing of systems "on" the Internet at any given moment, such as through the ICMP Echo facility.

4.1 Specific server availability

The first goal involves well-defined applications that run on specific servers visible to the Internet at large. This will be termed "endpoint multihoming", emphasizing the need for resilience of connectivity to well-defined endpoints. Solutions here often involve DNS mechanisms.

There are both availability and performance goals here. Availability goals arise when there are multiple routing paths that can reach the server, protecting it from single routing failures. Other availability goals involve replicated servers, so that the client will reach a server regardless of single server failures.

Performance goals include balancing client requests over multiple servers, so that one or more servers do not become overloaded and provide poor service. Requests can be distributed among servers in a round-robin fashion, or more sophisticated distribution mechanisms can be employed. Such mechanisms can consider actual real-time workload on the server, the routing metric from the client to the server, known server capacity, etc.

4.2 General Internet connectivity from the enterprise

The second goal is high availability of general Internet connectivity for arbitrary enterprise users to the outside. This will be called "internetwork multihoming". Solutions here tend to involve routing mechanisms.

4.3 Use of Internet services to interconnect "intranet" enterprise campuses

The third goal involves the growing number of situations where Internet services are used to interconnect parts of an enterprise. This is "intranetwork multihoming". It will usually involve dedicated or virtual circuits, or some sort of tunneling mechanism.

4.4 Use of Internet services to connect to "multinet" partners

A fourth category involves use of the Internet to connect with strategic partners. True, this does deal with endpoints, but the emphasis is different than in the first case. In the first case, the emphasis is on connectivity from arbitrary points outside the enterprise to points within it. This case deals with pairs of well-known endpoints. These endpoints may be linked with dedicated or virtual circuits defined at the physical or data link layer. Tunneling or other virtual private networks may be relevant here as well. There will be coordination issues that do not exist for the third case, where all resources are under common control.

5. Planning and Budgeting

In each of these scenarios, organization managers need to assign some economic cost to outages. Typically, there will be an incident cost and an incremental cost based on the length or scope of the connectivity loss. Ideally, this cost is then weighted by the probability of outage.

A weighted exposure cost results when the outage cost is multiplied by the probability of the outage. Resiliency measures reduce the probability, but increase the cost of operation. Operational costs obviously include the costs of the redundant mechanisms themselves (i.e., the additional multihomed paths), but also the incremental costs of the personnel needed to administer the more complex mechanisms -- their training and salaries.
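A small, hedged example may make this arithmetic concrete. Every figure below is invented purely for illustration; real values must come from the organization's own outage history, tariffs, and staffing costs.

   # Illustrative only; all figures are hypothetical.
   incident_cost     = 10000.0   # fixed cost per outage (lost sales, staff time)
   hourly_cost       = 2000.0    # incremental cost per hour of lost connectivity
   expected_duration = 4.0       # hours, mean time to repair
   outages_per_year  = 3.0       # expected outages without added resiliency

   cost_per_outage   = incident_cost + hourly_cost * expected_duration
   exposure_per_year = cost_per_outage * outages_per_year

   # A resiliency measure (e.g., a second access path) reduces the expected
   # number of user-visible outages but adds recurring cost.
   residual_outages  = 0.5        # expected outages per year with the measure
   measure_cost      = 24000.0    # circuits, equipment, and staff per year

   benefit = (outages_per_year - residual_outages) * cost_per_outage
   print("Annual exposure without measure:", exposure_per_year)
   print("Annual benefit of measure:      ", benefit)
   print("Measure worthwhile?             ", benefit > measure_cost)

The comparison in the last line is exactly the weighting decision described above: the reduction in weighted exposure is set against the recurring cost of the resiliency measure.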
6. Issues

6.1 Performance vs. Robustness: the Cache Conundrum

The goals of many forms of "multi-homing" conflict with the goals of improving local performance. For example, DNS queries normally are cached in DNS servers, and in the requesting host. From the performance standpoint, this is a perfectly reasonable thing to do, reducing the need to send out queries. From the multihoming standpoint, it is far less desirable, as application-level multihoming may be based on rapid changes of the DNS master files. The binding of a given IP address to a DNS name can change rapidly.

6.2 Symmetry

Global Internet routing is not necessarily optimized for best end-to-end routing, but for efficient handling in the Autonomous Systems along the path. Many service providers use "closest exit" routing, where they hand traffic to the next-hop AS at the exit point closest from their own perspective. The return path, however, is not necessarily a mirror image of the path from the original source to the destination.

Especially when the enterprise network has multiple points of attachment to the Internet, either to a single ISP AS or to multiple ISPs, it becomes likely that the response to a given packet will not come back through the same point at which the packet left the enterprise. This is probably not avoidable, and troubleshooting procedures and traffic engineering have to consider this characteristic of multi-exit routing.

6.3 Security

ISPs may be reluctant to let user routing advertisements or DNS zone information flow directly into their routing or naming systems. Users should understand that BGP is not intended to be a plug-and-play mechanism; manual configuration often is considered an important part of maintaining integrity. Supplemental mechanisms may be used for additional control, such as registering policies in a registry [RPS, RA documents] or egress/ingress filtering [Ferguson draft].

Challenges may arise when client security mechanisms interact with fault tolerance mechanisms associated with servers. For example, if a server address changes to that of a backup server, a stateful packet screening firewall might not accept a valid return. Similarly, unless servers back one another up in a full mirroring mode, if one end of a TCP-based application connection fails, the user will need to reconnect. As long as another server is ready to accept that connection, there may not be major user impact, and the goal of high availability is realized. High availability and user-transparent high availability are not synonymous.

7. Application/Transport/Name Multihoming

[****Folks -- I am not a DNS expert. I need help and/or a coauthor here. Alternatively, may I suggest someone might want to write a detailed DNS multihoming RFC that parallels Tony & Yakov's document on BGP multihoming?]

While many people look at the multihoming problem as one of routing, various solutions may be based more on DNS than on routing.

The basic idea here is that arbitrary clients will first request access to a resource by its DNS name, and certain DNS servers will resolve the same name to different addresses based on conditions of which the DNS is aware, or using some statistical load-distribution mechanism.
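As a deliberately simplified sketch of that idea, the fragment below rotates a configured answer list on each query, in the spirit of round-robin DNS. It is not a DNS implementation, and the name and addresses are hypothetical documentation values.

   # Toy stand-in for a DNS server that rotates its answer set per query.
   class RoundRobinResolver:
       def __init__(self, zone):
           # zone maps a name to the list of addresses of its replica servers
           self._zone = {name: list(addrs) for name, addrs in zone.items()}
           self._next = {name: 0 for name in zone}

       def resolve(self, name):
           # Return every address for the name, rotated so that successive
           # queries see a different address first.
           addrs = self._zone[name]
           i = self._next[name]
           self._next[name] = (i + 1) % len(addrs)
           return addrs[i:] + addrs[:i]

   zone = {"www.example.com": ["192.0.2.1", "198.51.100.1", "203.0.113.1"]}
   resolver = RoundRobinResolver(zone)
   print(resolver.resolve("www.example.com"))  # ['192.0.2.1', '198.51.100.1', '203.0.113.1']
   print(resolver.resolve("www.example.com"))  # ['198.51.100.1', '203.0.113.1', '192.0.2.1']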
There are some general DNS issues here. DNS was not really designed to do this. A key issue is that of DNS caching. Caching and frequent changes in name resolution are opposing goals. Traditional DNS schemes emphasize performance over resiliency. [RFC1034]:

"The meaning of the TTL field is a time limit on how long an RR can be kept in a cache. This limit does not apply to authoritative data in zones; it is also timed out, but by the refreshing policies for the zone. The TTL is assigned by the administrator for the zone where the data originates. While short TTLs can be used to minimize caching, and a zero TTL prohibits caching, the realities of Internet performance suggest that these times should be on the order of days for the typical host. If a change can be anticipated, the TTL can be reduced prior to the change to minimize inconsistency during the change, and then increased back to its former value following the change."

[discuss limitations/behavior of basic round robin]

Dynamic DNS may be a long-term solution here. In the short term, setting very short TTL values may be appropriate. Remember that the name normally is resolved when an application session first is established, and the decisions are made over a longer time base than per-packet routing decisions.

7.1 Servers in Multiple Address Spaces

[Kent England] Have you ever had a case where a multi-homed site used address overlays, one set of addresses from within ISP#1's CIDR block and another set of addresses from within ISP#2's CIDR block? I would call this application-level multi-homing as opposed to network-level multihoming, with a single set of servers (web, mail, ftp) with overlay addresses using redundant access paths, controlled via DNS. Seems to me it should be workable without BGP and allow finer-grained load sharing (or balancing?) than BGP?

[Paul Vixie] If you want to load balance, you can use multiple A records, and it works until one of the providers goes down. Then, only some requests get through (unless the client is bright about trying all addresses, which some are).
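The following hedged sketch shows the "bright client" behavior noted above: the client obtains every address bound to a name and tries each in turn, so a dead provider path costs it one connection timeout rather than the whole session. The host name and port in the example are hypothetical.

   # Sketch of a client that tries every address a name resolves to;
   # illustrative only.
   import socket

   def connect_any(name, port, timeout=5.0):
       """Try each address for the name in resolver order; return the
       first connected socket, or raise the last error seen."""
       last_error = None
       for family, socktype, proto, _, sockaddr in socket.getaddrinfo(
               name, port, type=socket.SOCK_STREAM):
           s = socket.socket(family, socktype, proto)
           s.settimeout(timeout)
           try:
               s.connect(sockaddr)
               return s                  # this address answered; use it
           except OSError as err:
               last_error = err
               s.close()                 # this path is down; try the next one
       raise last_error or OSError("no usable address for " + name)

   # Hypothetical use:
   # sock = connect_any("www.example.com", 80)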
7.2 Coordinated DNS

[This is the Cisco Distributed Director strategy]

7.3 Other methods/software?

8. Network/Routing Multihoming

A common concern of enterprise financial managers is that multihoming strategies involve expensive links to ISPs, but, in some of these scenarios, the alternate links are used only as backups and are idle much of the time. Detailed analysis may reveal, however, that the cost of forcing these links to be used at all times exceeds the potential savings.

The intention here is to focus on requirements rather than on specifics of the routing implementation, several approaches to which are discussed in RFC1998 and draft-bates-multihoming-01.txt. Operational as well as technical considerations apply here. While the Border Gateway Protocol could convey certain information between user and provider, many ISPs will be unwilling to risk the operational integrity of their global routing by making the user network part of their internal BGP routing systems. ISPs may also be reluctant to accept BGP advertisements from organizations that do not have frequent operational experience with this complex protocol.

8.1 Single-homed (R1)

The enterprise generally does not have its own ASN; all its advertisements are made through its ISP. The enterprise uses default routes to the ISP. The customer is primarily concerned with protecting against link or router failures, rather than failures in the ISP routing system.

8.1.1 Single-homed, single-link (R1.1)

There is a single active data link between the customer and provider. Variations could include switched backup over analog or ISDN services. Another alternative might be use of alternate frame relay or other PVCs to an alternate ISP POP.

8.1.2 Single-homed, balanced link (R1.2)

In this configuration, multiple parallel data links exist from a single customer router to a single ISP router. There is protection against link failures. The single-customer-router constraint allows this router to do round-robin packet-level load balancing across the multiple links, for resiliency and possibly additional bandwidth. The ability of a router to do such load balancing is implementation-specific, and may be a significant drain on the router's processor.

8.1.3 Single-homed, multi-link (R1.3)

Here, we have separate paths from multiple customer routers to multiple ISP routers at different POPs. Default routes generated at each of the customer gateways are injected into the enterprise routing system, and the combination of internal and external metrics is considered by internal routers in selecting the external gateway. This often is attractive for enterprises that want resiliency but wish to avoid the complexity of BGP.

8.1.4 Special Cases

While the customer in this configuration is still single-homed, an AS upstream from the ISP has a routing policy that makes it necessary to distinguish routes originating in the customer from those originating in the ISP. In such cases, the enterprise may need to run BGP, or have the ISP run it on its behalf, to generate advertisements of the needed specificity. Since the same basic topologies discussed above apply, we can qualify them as R1.1B, R1.2B, and R1.3B.

It MAY be possible for the customer to avoid using BGP, if its adjacent ISP will set a BGP community attribute, understood by the upstream AS, on the customer prefixes [RFC1998]. Doing so results in the cases R1.1C, R1.2C, and R1.3C.

8.2 Multi-homed Routing

The enterprise connects to more than one ISP, and desires to protect against problems in the ISP routing system. It will accept additional complexity and router requirements to get this. The enterprise may also have differing service agreements for Internet access for different divisions.

8.2.1 Multi-homed, primary/backup, single link (R2.1)

The enterprise connects to two or more ISPs from a single router, but has a strict policy that only one ISP at a time will be used for default. In an OSPF environment, this would be done by advertising defaults toward both ISPs, but with different Type 2 external metrics. The default toward the primary ISP would have the lower metric. BGP is not necessary in this case. This easily can be extended to multi-link.

8.2.2 Multi-homed, differing internal policies (R2.2)

In this example, assume OSPF interior routing. The main default for the enterprise comes from one or more ASBRs in Area 0, all routing to the same ISP. One or more organizations brought into the corporate network have pre-existing Internet access agreements with an ISP other than the corporate ISP, and wish to continue using this for their "divisional" Internet access.
This is frequent when a corporation decides to have general Internet access, but its research arm has long had its own Internet connectivity. Mergers and acquisitions also produce this case.

In this situation, one or more additional ASBRs are placed in the OSPF area(s) associated with the special case, and these ASBRs advertise default. Filters at the Area Border Router block the divisional ASBR's default from being advertised into Area 0, and the corporate default from being advertised into the division. Note that these filters do not block OSPF LSAs, but instead block the local propagation of selected default and external routes into the Routing Information Base (i.e., the main routing table) of a specific router.

8.2.3 Multi-homed, "load shared" with primary/backup (R2.3)

[Thanks to Paul Ferguson for the distinction between load balancing and load sharing.]

While there still is a primary/backup policy, there is an attempt to make active use of both the primary and backup providers. The enterprise runs BGP, but does not take full Internet routing. It takes partial routing from the backup provider, and prefers the backup provider's path for destinations in the backup provider's AS, and perhaps for destinations directly connected to that AS. For all other destinations, the primary provider is the preferred default. A less preferred default is defined to the second ISP, but this default is generally advertised only if connectivity is lost to the primary ISP.

8.2.4 Multi-homed, global routing aware (R2.4)

Multiple customer routers receive a full routing table and, using appropriate filtering and aggregation, advertise different destinations (i.e., not just default) internally. This requires BGP and, unless dealing with a limited number of special cases, requires significantly more resources inside the organization.

8.3 Transit

While we usually think of transit in terms of ISPs, some enterprises may provide Internet connectivity to strategic partners. They do not offer Internet connectivity on a general basis.

8.3.1 Full iBGP mesh (R3.1)

Connectivity and performance requirements are such that a full iBGP mesh is practical.

8.3.2 Scalable iBGP required (R3.2)

The limits of the iBGP full mesh have been reached, and confederations, route reflectors, etc., are needed for growth.

9. Addressing Refinements and Issues

It is arguable that addressing used to support multihoming is a routing deployment issue, beyond the scope of this document. The rationale for including it here is that addressing MAY affect application behavior. If the enterprise runs applications that embed network layer addresses in higher-level data fields, solutions that employ address translation, at the packet or virtual connection level, MAY NOT work. Use of such applications inherently is a requirement on the eventual multihoming solution.

Consideration also needs to be given to application caches in addition to those of DNS. Firewall proxy servers are a good example where multiple addresses associated with a given destination may not be supported.

Internal addressing alternatives include:

   - RFC1918 internal, NAT
   - RFC1918 internal, PAT
   - Registered internal, Provider Assigned (PA)
   - Registered internal, Provider Independent (PI)

10. Transmission Considerations in Multihoming

"Multihoming" is not logically complete until all single points of failure are considered. With the current emphasis on routing and naming solutions, the lowly physical layer often is ignored, until a physical layer failure dooms a lovely and sophisticated routing system. Physical layer diversity can involve significant cost and delay.
Nevertheless, it should be considered for mission-critical connectivity. The principal transmission impairment, the backhoe, can be viewed at http://www.cat.com/products/equip/bhl/bhl.htm

10.1 Local Loop

From a typical server room, analog and digital signals physically flow to a wiring closet, where they join a riser cable. The riser cable joins with other riser cables in a cable vault, from which a cable leaves the building and goes to the end switching office of the local telecommunications provider. Most buildings have a single cable vault, possibly with multiple cables following a single physical route back to the end office. A single error by construction excavators can cut multiple cables on a single path. A failure in carrier systems can isolate a single end office. Highly robust systems have physical connectivity to two or more POPs reached through two or more end offices.

Alternatives here can become creative. On a campus, it can be feasible to use some type of existing ductwork to run additional cables to another building that has a physically diverse path to the end office. Direct wire burial, fiber optic cables run in the air between buildings, etc., are all possible. In a non-campus environment, it is possible, in many urban areas, to find alternate means of running physical media to other buildings with alternate paths to end offices. Electrical power utilities may have empty ducts which they will lease, and through which privately owned fiber can be run.

10.2 Provider Core

As demonstrated by a rash of fiber cuts in early 1997, carriers lease bandwidth from one another, so a cut to one carrier-owned facility may affect connectivity in several carriers. This reality makes some traditional diverse-media strategies questionable. Many organizations consciously obtain WAN connectivity from multiple carriers, with the notion that a failure in one carrier will not affect another. This is not a valid assumption.

If the goal is to obtain diversity/resiliency among WAN circuits, it may be best to deal with a single service provider. The contract with this provider should require physical diversity among facilities, so the provider's engineering staff will be aware of requirements not to put multiple circuits into the same physical facility, whether owned by the carrier or leased from other carriers.

11. Security Considerations

12. Acknowledgments

13. References

[RFC1775] Crocker, D., "To Be 'On' the Internet", RFC 1775, March 1995.

[RFC1930] Hawkinson, J., and T. Bates, "Guidelines for creation, selection, and registration of an Autonomous System (AS)", BCP 6, RFC 1930, March 1996.

[RFC1034] Mockapetris, P., "Domain Names - Concepts and Facilities", STD 13, RFC 1034, November 1987.

[RFC----] BGP-4 Applicability Statement.

[RFC1998] Chen, E., and T. Bates, "An Application of the BGP Community Attribute in Multi-home Routing", RFC 1998, August 1996.

[RFC2071] Ferguson, P., and H. Berkowitz, "Network Renumbering Overview: Why would I want it and what is it anyway?", RFC 2071, January 1997.

[RFC2050] Hubbard, K., Kosters, M., Conrad, D., Karrenberg, D., and J. Postel, "INTERNET REGISTRY IP ALLOCATION GUIDELINES", BCP 12, RFC 2050, November 1996.

[RFC1631] Egevang, K., and P. Francis, "The IP Network Address Translator (NAT)", RFC 1631, May 1994.

[RFC1918] Rekhter, Y., Moskowitz, R., Karrenberg, D., de Groot, G-J., and E. Lear, "Address Allocation for Private Internets", RFC 1918, February 1996.

[RFC1900] Carpenter, B., and Y. Rekhter, "Renumbering Needs Work", RFC 1900, February 1996.

[RPS] Alaettinoglu, C., Bates, T., Gerich, E., Terpstra, M., and C. Villamizar, "Routing Policy Specification Language", Work in Progress.

[RFC1812] Baker, F., "Requirements for IP Version 4 Routers", RFC 1812, June 1995.

14. Author's Address

Howard C. Berkowitz
Geotrain Corporation (formerly Protocol Interface & PSC International)
1600 Spring Hill Road, Suite 310
Vienna VA 22182

Phone: +1 703 998 5819
EMail: hcb@clark.net