Internet DRAFT - draft-campbell-dime-load-considerations

draft-campbell-dime-load-considerations







Internet Engineering Task Force                              B. Campbell
Internet-Draft                                           S. Donovan, Ed.
Intended status: Informational                                    Oracle
Expires: September 7, 2015                                   JJ. Trottin
                                                          Alcatel-Lucent
                                                           March 6, 2015


       Architectural Considerations for Diameter Load Information
               draft-campbell-dime-load-considerations-01

Abstract

   RFC 7068 describes requirements for Overload Control in Diameter.
   This includes a requirement to allow Diameter nodes to send "load"
   information, even when the node is not overloaded.  The Diameter
   Overload Information Conveyance (DOIC) solution describes a mechanism
   meeting most of the requirements, but does not currently include the
   ability to send load information.  This document explores some
   architectural considerations for a mechanism to send Diameter load
   information.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 7, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents



Campbell, et al.        Expires September 7, 2015               [Page 1]

Internet-Draft              Abbreviated Title                 March 2015


   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Differences between Load and Overload information . . . . . .   3
   3.  How is Load Information Used? . . . . . . . . . . . . . . . .   4
   4.  Piggy-Backing vs a Dedicated Application. . . . . . . . . . .   5
   5.  Which Nodes Exchange Load Information?  . . . . . . . . . . .   6
   6.  Scope of Load Information . . . . . . . . . . . . . . . . . .   7
   7.  Frequency of Sending Load Information . . . . . . . . . . . .   8
   8.  Load Information Semantics  . . . . . . . . . . . . . . . . .   9
   9.  Is Negotiation of Support Needed? . . . . . . . . . . . . . .  10
   10. Topology Scenarios  . . . . . . . . . . . . . . . . . . . . .  10
     10.1.  No Agent . . . . . . . . . . . . . . . . . . . . . . . .  11
     10.2.  Single Agent . . . . . . . . . . . . . . . . . . . . . .  11
     10.3.  Multiple Agents  . . . . . . . . . . . . . . . . . . . .  11
     10.4.  Linked Agents  . . . . . . . . . . . . . . . . . . . . .  12
     10.5.  Shared Server Pools  . . . . . . . . . . . . . . . . . .  13
     10.6.  Agent Chains . . . . . . . . . . . . . . . . . . . . . .  14
     10.7.  Fully Meshed Layers  . . . . . . . . . . . . . . . . . .  14
     10.8.  Partitions . . . . . . . . . . . . . . . . . . . . . . .  15
     10.9.  Active-Standby Nodes . . . . . . . . . . . . . . . . . .  15
     10.10. Addition and removal of Nodes  . . . . . . . . . . . . .  15
   11. Security Considerations . . . . . . . . . . . . . . . . . . .  15
   12. IANA Considerations . . . . . . . . . . . . . . . . . . . . .  16
   13. References  . . . . . . . . . . . . . . . . . . . . . . . . .  16
     13.1.  Normative References . . . . . . . . . . . . . . . . . .  16
     13.2.  Informative References . . . . . . . . . . . . . . . . .  16
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  17

1.  Introduction

   [RFC7068] describes requirements for Overload Control in Diameter
   [RFC6733].  At the time of this writing, the DIME working group is
   working on the Diameter Overload Information Conveyance (DOIC)
   mechanism [I-D.ietf-dime-ovli] .  As currently specified, DOIC
   fulfills some, but not all, of the requirements.

   In particular, DOIC does not fulfill Req 24, which requires a
   mechanism where Diameter nodes can indicate their current load, even
   if they are not currently overloaded.  DOIC also does not fulfill Req
   23, which requires that nodes that divert traffic away from




Campbell, et al.        Expires September 7, 2015               [Page 2]

Internet-Draft              Abbreviated Title                 March 2015


   overloaded nodes be provided with sufficient information to select
   targets that are most likely to have sufficient capacity.

   There are several other requirements in RFC 7068 that mention both
   overload and load information that are only partially fulfilled by
   DOIC.

   The DIME working group explicitly chose not to fulfill these
   requirements in DOIC due to several reasons.  A principal reason was
   that the working group did not agree on a general approach for
   conveying load information.  It chose to progress the rest of DOIC,
   and defer load information conveyance to a DOIC extension or a
   separate mechanism.

   This document describes some high level architectural decisions that
   the working group will need to consider in order to solve the load-
   related requirements from RFC 7068.

   At the time of this writing, there have been several attempts to
   create mechanisms for conveyance of both load and overload control
   information that were not adopted by the DIME working group.  While
   these drafts are not expected to progress, they may be instructive
   when considering these decisions.

   o  [I-D.tschofenig-dime-dlba] proposed a dedicated Diameter
      application for exchanging load balancing information.

   o  [I-D.roach-dime-overload-ctrl] described a strictly peer-to-peer
      exchange of both load and overload information in new AVPs piggy-
      backed on existing Diameter messages.

   o  [I-D.korhonen-dime-ovl] described a dedicated Diameter application
      for exchanging both load and overload information.

2.  Differences between Load and Overload information

   Previous discussions of how to solve the load-related requirements in
   [RFC7068] have shown that people do not have an agreed-upon concept
   of how "load" information differs from "overload" information.  The
   two concepts are highly interrelated, and so far the working group
   has not defined a bright line between what constitutes load
   information and what constitutes overload information.

   In the opinion of the authors, there are two primary differences.
   First, a Diameter node always has a load.  At any given time that
   load maybe effectively zero, effectively fully loaded, or somewhere
   in between.  In contrast, overload is an exceptional condition.  A
   node only has overload information when it in an overloaded state.



Campbell, et al.        Expires September 7, 2015               [Page 3]

Internet-Draft              Abbreviated Title                 March 2015


   Furthermore, the relationship between a node's load level and
   overload state at any given time may be vague.  For example, a node
   may normally operate at a "fully loaded" level, but still not be
   considered overloaded.  Another node may declare itself to be
   "overloaded" even though it might not be fully "loaded".

   Second, Overload information, in the form of a DOIC Overload Report
   (OLR) [I-D.ietf-dime-ovli] indicates an explicit request for action
   on the part of the reacting node.  That is, the OLR requests that the
   reacting node reduce the offered load -- the actual traffic sent to
   the reporting node after overload abatement and routing decisions are
   made -- by an indicated amount or to an indicated level.
   Effectively, DOIC provides a contract between the reporting node and
   the reacting node.

   In contrast, load is informational.  That is, load information can be
   considered a hint to the recipient node.  That node may use the load
   information for load balancing purposes, as an input to certain
   overload abatement techniques, to make inferences about the
   likelihood that the sending node becomes overloaded in the immediate
   future, or for other purposes.

   None of this prevents a Diameter node from deciding to reduce the
   offered load based on load information.  The fundamental difference
   is that an overload report requires that reduction.  It is also
   reasonable for a Diameter node to decide to increase the offered load
   based on load information.

3.  How is Load Information Used?

   [RFC7068] contemplates two primary uses for load information.  Req 23
   discusses how load information might be used when performing
   diversion as an overload abatement technique, as described in
   [I-D.ietf-dime-ovli].  When a reacting node diverts traffic away from
   an overloaded node, it needs load information for the other
   candidates for that traffic in order to effectively load balance the
   diverted load between potential candidates.  Otherwise, diversion has
   a greater potential to drive other nodes into overload.

   Req 24 discusses how Diameter load information might be used when no
   overload condition currently exists.  Diameter nodes can use the load
   information to make decisions to try to avoid overload conditions in
   the first place.  Normal load-balancing falls into this category.  A
   node might also take other proactive steps to reduce offered load
   based on load information, so that the loaded node never goes into
   overload in the first place.





Campbell, et al.        Expires September 7, 2015               [Page 4]

Internet-Draft              Abbreviated Title                 March 2015


   If the loaded nodes are Diameter servers (or clients in the case of
   server-to-client transactions), both of these uses are most
   effectively accomplished by a Diameter node that performs server
   selection.  Typically, server selection is performed by a node (a
   client or an agent) that is an immediate peer of the server.
   However, there are scenarios (see Section 10) where a client or proxy
   that is not the immediate peer to the selected servers performs
   server selection.  In this case, the client or proxy enforces the
   server selection by inserting a Destination-Host AVP.

      For example, a Diameter node (e.g. client) can use a redirect
      agent to get candidate destination host addresses.  The redirect
      agent might return several destination host addresses, from which
      the Diameter node selects one.  The Diameter node can use load
      information received from these hosts to make the selection.

   Just as load information can be used as part of server selection, it
   can also be used as input to the selection of the next-hop peer to
   which a request is to be routed.

   One area that requires thought is how load information is used, if at
   all, in the presence of an overload report from the same Diameter
   node.  It might be that the load information from that Diameter node
   is ignored for the duration of the time that the overload report is
   in effect.  It might also be possible that the load information can
   aid in the routing of non-abated requests targeted for the overloaded
   Diameter node.

4.  Piggy-Backing vs a Dedicated Application.

   [I-D.roach-dime-overload-ctrl] imbeds load and overload information
   onto messages of existing applications.  This is known as a "piggy-
   back" approach.  Such an approach has the advantage of not requiring
   new messages to carry load information.  It has an additional
   advantage of scaling with load; that is, the more the transaction
   load, the more opportunities to send load information.

   DOIC [I-D.ietf-dime-ovli] also uses a piggy-backed approach to send
   OLRs.  Given the potentially tight connection between load and
   overload information, there may be advantages to maintaining
   consistency with DOIC.

   [I-D.tschofenig-dime-dlba] used a dedicated application to carry load
   information.  This application has quasi-subscription semantics,
   where a client requests updates according to a cadence.  The server
   can send unsolicited updates if the load level changes between
   updates in the cadence.




Campbell, et al.        Expires September 7, 2015               [Page 5]

Internet-Draft              Abbreviated Title                 March 2015


   [I-D.korhonen-dime-ovl] also used a dedicated application, but
   allowed nodes to send unsolicited reports containing load and
   overload information.  The mechanism has an issue that the sender of
   load information may not know which other nodes need the information.
   It may be possible to infer that information from other application
   messages handled by the sender.

   Another potential approach is that of a dedicated Diameter
   application with a slightly different subscription semantic than that
   of [I-D.tschofenig-dime-dlba].  In such an application, a node that
   consumes load information sends a Diameter request to the source of
   the load information.  This request indicates that the consumer
   wishes to receive load information for some period of time.  The load
   source would send periodic Diameter requests indicating the current
   load level, until such time that the subscription period expired, or
   the subscribe explicitly unsubscribed.  After the initial
   notification, the sender would only send updates when the load level
   changed.

5.  Which Nodes Exchange Load Information?

   Section 10 illustrates a number of Diameter network topologies where
   load information may be useful.  However, there are potentially
   limitless configurations where load information might be used to make
   peer and server selection choices.  Nodes may be unaware of the
   topology beyond their immediate peers, which may limit the utility of
   load information for nodes beyond that peer.

   There may in fact be scenarios where a peer-selection decision is
   impacted by the load of non-adjacent nodes, or where a node needs to
   force selection of a particular non-adjacent server.  While explicit
   knowledge of the load of such non-adjacent nodes may be useful in
   such decisions, the working group should consider whether this
   utility is worth the added complexity.

      For instance, one approach would be to support two types of load
      reports, endpoint load reports and peer load reports.  In this
      scenario, load reports would likely require an AVP indicating the
      Diameter node to which the report applies.  This would be needed
      to differentiate between endpoint load reports and next hop load
      reports.  This would imply that a single message will likely have
      two load reports, one for the endpoint and one for the next hop.
      This would also add complexity in agents, sometimes needing to
      strip next hop load reports and sometimes not.

   Previous load related efforts have made different assumptions about
   which Diameter nodes exchange load information.




Campbell, et al.        Expires September 7, 2015               [Page 6]

Internet-Draft              Abbreviated Title                 March 2015


   [I-D.roach-dime-overload-ctrl] operated in a strictly peer-to-peer
   mode.  Each node would only learn the load (and overload) information
   from its immediate peers.

   [I-D.korhonen-dime-ovl] and [I-D.tschofenig-dime-dlba] are each
   effectively any-to-any.  That is, they each allowed any node to send
   load information to any other node that supported the dedicated
   overload or load application, respectively.

   In the latter case, load is effectively sent between clients and
   servers of the dedicated application, but those roles may not match
   the client and server roles for the "main" Diameter applications in
   use.  For example, a pair of adjacent diameter agents might be
   "client" and "server" for the dedicated "load" application,
   effectively creating a peer-to-peer relationship similar to that of
   [I-D.roach-dime-overload-ctrl].

   Each approach has advantages.  Peer-to-peer transmission covers the
   case when server selection is done by the servers immediate peers.
   Additionally, selection of non-terminal nodes is generally done on a
   peer-to-peer basis.  If the loaded node is an agent, for example, the
   load information is only useful to immediate peers.  Peer-to-peer
   transmission is the easiest to negotiate.  (See Section 9)

   Any-to-Any transmission offers more flexibility, and could
   potentially cover the case where server selection is done by nodes
   that are not peers to the candidate servers.

6.  Scope of Load Information

   Load information could refer to several different scopes:

   o  Load of a Node -- The load information refers to the load for an
      entire Diameter host, that is a Client, Agent, or Server described
      by a Diameter Identity.

   o  Load of an Application -- The load for a specific Diameter node
      that supports multiple Diameter applications might differ between
      applications.

   o  Load of a set of nodes -- The load would likely be the aggregated
      load of the nodes in the set.  This would likely require a
      separate Diameter identity be assigned to the set of nodes and the
      load information would be associated with that Diameter identity.

   o  Aggregate Load -- Different paths via different agents may exist
      between a node making a peer selection decision and the final




Campbell, et al.        Expires September 7, 2015               [Page 7]

Internet-Draft              Abbreviated Title                 March 2015


      destination of the request.  The least loaded destination may only
      be reachable via certain peers.

   o  Load of an agent plus load of a Diameter endpoint -- Different
      paths via different Diameter agents may exist between the node
      doing the server selection and the targeted Diameter endpoint.
      The load information on the Diameter endpoint might be used for
      server selection and the load information on the agent might be
      used for selecting the next hop in the route to the Diameter
      endpoint.

   The "scope" of load information defines what the load indication
   applies to.  For example, load could apply to a whole Diameter node,
   or a node could report different load for different application.  It
   might be possible to have a load value for a whole realm, or a group
   of nodes.

   [I-D.roach-dime-overload-ctrl] has a very expressive concept of
   scope, which applies both to load and overload information.  It
   defines the scopes of "Destination-Realm", "Application-ID",
   "Destination-Host", "Host", "Connection", "Session", and "Session-
   Group".  Scopes can be combined.

   [I-D.tschofenig-dime-dlba] does not have an explicit concept of
   scope.  Load information describes the load of a server for all
   Diameter purposes.

   [I-D.korhonen-dime-ovl] defines several scopes for overload
   information.  However, load information applies to the a whole node.

   One view is that the load level of a Diameter node will usually apply
   to the whole node.  In this case, the working group should consider a
   single "whole node" scope for load information.  Alternatively, a
   "per-connection" scope could simulate "whole node" scope without
   requiring the recipient to pay attention to whether multiple
   transport connections terminate at the same peer.

   Other scopes might also be considered based on the analysis of the
   use cases identified for the use of load information.

7.  Frequency of Sending Load Information

   While it is true that a node always has a discrete load, a
   determination needs to be made as to the frequency with which load
   information is sent.






Campbell, et al.        Expires September 7, 2015               [Page 8]

Internet-Draft              Abbreviated Title                 March 2015


   This interacts with the method for transporting load information --
   piggy-backed versus a dedicated application -- discussed in
   Section 5.

   With a piggy-backed approach the following alternatives exist:

   1.  Send load information in every message.

   2.  Send load information when it changes by some amount.  For
       instance, only send a new load report when the load value has
       changed by some percentage.

   3.  Send load information every interval of time.  With this
       approach, load information would be sent every some number of
       seconds.

   With alternatives 2 and 3 there would need to be a mechanism for the
   sender of the load information to ensure that all consumers of the
   load information receive the periodic load information.  This is more
   straightforward if the load information is sent only to peers.  It
   becomes more difficult if the load information is sent to non
   adjacent nodes.  This might require option one if the load mechanism
   supports sending of load information to non adjacent nodes.

   If a dedicated application is used for transporting of load
   information then part of the application definition would need to
   define the frequency of sending load information.  Options 2 and 3 in
   the above list would be the likely alternatives.

8.  Load Information Semantics

   Both [I-D.tschofenig-dime-dlba] and [I-D.korhonen-dime-ovl] define
   load level to be a range between zero and some maximum value, where
   zero means no load at all and the max value means fully loaded.  The
   former uses a range of 0-10, while the later uses 0-100.

   [I-D.roach-dime-overload-ctrl] treats load information as a strictly
   relative weighting factor.  The weight is only meaningful when load-
   balancing across multiple destinations.  That is, a maximum load
   value does not necessarily imply that the node is cannot handle more
   traffic.  The load level scale is zero to 65535.  That scale was
   chosen to match the resolution of the weight field from a DNS SRV
   record, [RFC2782]








Campbell, et al.        Expires September 7, 2015               [Page 9]

Internet-Draft              Abbreviated Title                 March 2015


9.  Is Negotiation of Support Needed?

   The working group should discuss whether a load conveyance mechanism
   requires negotiation or declaration of support.  Several
   considerations apply to this discussion.

   If load information is treated as a hint, it can be safely ignored by
   nodes that don't understand it.  However, security considerations may
   apply if load information is accidentally leaked across a non-
   supporting node to a node that is not authorized to receive it.

   If load information is conveyed using a dedicated Diameter
   application, the normal mechanisms for negotiation support for
   Diameter applications apply.  However, the Diameter Capabilities
   Exchange [RFC6733] mechanism is inherently peer-to-peer.  If there is
   a need to convey load information across a node that does not
   understand the mechanism, the standard Diameter mechanism would
   involve probing for support by sending load requests and watching for
   error answers with a result code of DIAMETER_APPLICATION_UNSUPPORTED.
   If the probe request also includes load information, there is again a
   potential for leaking load information to unauthorized parties.

   If load information was treated in a strictly peer-to-peer fashion,
   there would be no need to probe to see if non-adjacent nodes support
   the mechanism.  However, there would still be a need to control
   whether a non-supporting node would leak load information.  Such a
   leak could be prevented if adjacent peers declared support, and never
   sent load information to a peer that did not declare support.

   A peer-to-peer mechanism would also need a way to make sure that, if
   load information leaked across a non-supporting node, the receiving
   node would not mistakenly think the information came from the non-
   supporting node.  This could be mitigated with a mechanism to declare
   support as in the previous paragraph, or with a mechanism to identify
   the origin of the load information.  In the latter case, the
   receiving node would treat any load information as invalid if the
   origin of that information did not match the identity of the peer
   node.

10.  Topology Scenarios

   This section presents a number of Diameter topology scenarios, and
   discusses how load information might be used in each scenario.
   Nothing in this section should be construed to mean that a given
   scenario is in scope for this effort, or even a good idea.  Some
   scenarios might be considered as not relevant in practice and
   subsequently discarded.




Campbell, et al.        Expires September 7, 2015              [Page 10]

Internet-Draft              Abbreviated Title                 March 2015


10.1.  No Agent

   Figure 1 shows a simple client-server scenario, where a client picks
   from a set of candidate servers available for a particular realm and
   application.  The client selects the server for a given transaction
   using the load information received from each server.

     ------S1
    /
   C
    \
     ------S2


                  Figure 1: Basic Client Server Scenario

      Open Issue: Will a Diameter node include potential peers that it
      is not currently connected to as part of the candidate set?  It is
      unlikely the client would have load information from peers that it
      is not currently connected to.

      Note: The use of dynamic connections needs to be considered.

10.2.  Single Agent

   Figure 2 shows a client that sends requests to an agent.  The agent
   selects the request destination from a set of candidate servers,
   using load information received from each server.  The client does
   not need to receive load information, since it does not select
   between multiple agents.

          ------S1
         /
   C----A
         \
          ------S2


                      Figure 2: Simple Agent Scenario

10.3.  Multiple Agents

   Figure 3 shows a client selecting between multiple agents, and each
   agent selecting from multiple servers.  The client selects an agent
   based on the load information received from each agent.  Each agent
   selects a server based on the load information received from its
   servers.




Campbell, et al.        Expires September 7, 2015              [Page 11]

Internet-Draft              Abbreviated Title                 March 2015


   This scenario adds a complication that one set of servers may be more
   loaded than the other set.  If, for example, S4 was the least loaded
   server, C would need to know to select agent A2 to reach S4.  This
   might require C to receive load information from the servers as well
   as the agents.  Alternatively, each agent might use the load of its
   servers as an input into calculating its own load, in effect
   aggregating upstream load.

   Similarly, if C sends a host-routed request [I-D.ietf-dime-ovli], it
   needs to know which agent can deliver requests to the selected
   server.  Without some special, potentially proprietary, knowledge of
   the topology upstream of A1 and A2, C would select the agent based on
   the normal peer selection procedures for the realm and application,
   and perhaps consider the load information from A1 and A2.  If C sends
   a request to A1 that contains a Destination-Host AVP with a value of
   S4, A1 will not be able to deliver the request.

           -----S3
          /
     ---A1------S1
    /
   C
    \
     ---A2------S2
          \
           ---- S4


                   Figure 3: Multiple Agents and Servers

10.4.  Linked Agents

   Figure 4 shows a scenario similar to that of Figure 3, except that
   the agents are linked, so that A1 can forward a request to A2, and
   vice-versa.  Each agent could receive load information from the
   linked agent, as well as its connected servers.

   This somewhat simplifies the complication from Figure 3, due to the
   fact that C does not necessarily need to choose a particular agent to
   reach a particular server.  But it creates a similar question of how,
   for example, A1 might know that S4 was less loaded than S1 or S3.
   Additionally, it creates the opportunity for sub-optimal request
   paths.  For example [C,A1,A2,S4] vs. [C,A2,S4].

   A likely application for linked agents is when each agent prefers to
   route only to directly connected servers and only forwards requests
   to another agent under exceptional circumstances.  For example, A1
   might not forward requests to A2 unless both S1 and S3 are



Campbell, et al.        Expires September 7, 2015              [Page 12]

Internet-Draft              Abbreviated Title                 March 2015


   overloaded.  In this case, A1 might use the load information from S1
   and S3 to select between those, and only consider the load
   information from A2 (and other connected agents) if it needs to
   divert requests to different agents.

            -----S3
           /
      ---A1------S1
    /    |
   C     |
    \    |
      ---A2------S2
           \
            ---- S4


                          Figure 4: Linked Agents

   Figure 5 is a variant of Figure 4.  In this case, C1 sends all
   traffic through A1 and C2 sends all traffic through A2.  By default,
   A1 will load balance traffic between S1 and S3 and A2 will load
   balance traffic between S2 and S4.

   Now, if S1 S3 are significantly more loaded than S2 S4, A1 may route
   some C1 traffic to A2.  This is non optimal path but allows a better
   load balancing between the servers.  To achieve this, A1 needs to
   receive some load info from A2 about S2/S4 load.

            -----S3
           /
   C1----A1------S1
         |
         |
         |
   C2----A2------S2
           \
            ---- S4


                          Figure 5: Linked Agents

10.5.  Shared Server Pools

   Figure 6 is similar to Figure 4, except that instead of a link
   between agents, each agent is linked to all servers.  (The links to
   each set of servers should be interpreted as a link to each server.
   The links are not shown separately due to the limitations of ASCII
   art.)



Campbell, et al.        Expires September 7, 2015              [Page 13]

Internet-Draft              Abbreviated Title                 March 2015


   In this scenario, each agent can select among all of the servers,
   based on the load information from the servers.  The client need only
   be concerned with the load information of the agents.


     ---A1---S[1], S[2]...S[p]
    /     \ /
   C       x
    \     / \
     ---A2---S[p+1], S[p+2] ...S[n]


                       Figure 6: Shared Server Pools

10.6.  Agent Chains

   The scenario in Figure 7 is similar to that of Figure 3, except that,
   instead of the client possibly needing to select an agent that can
   route requests to the least loaded server, in this case A1 and A2
   need to make similar decisions when selecting between A3 or A4.  As
   the former scenario, this could be mitigated if A3 and A4 aggregate
   upstream loads into the load information they report downstream.

     ---A1---A3----S[1], S[2]...S[p]
    /   | \ /
   C    |  x
    \   | / \
     ---A2---A4----S[p+1], S[p+2] ...S[n]


                          Figure 7: Agent Chains

10.7.  Fully Meshed Layers

   Figure 8 extends the scenario in Figure 6 by adding an extra layer of
   agents.  But since each layer of nodes can reach any node in the next
   layer, each node only needs to consider the load of its next-hop
   peer.

     ---A1---A3---S[1], S[2]...S[p]
    /   | \ / |\ /
   C    |  x  | x
    \   | / \ |/ \
     ---A2---A4---S[p+1], S[p+2] ...S[n]


                            Figure 8: Full Mesh




Campbell, et al.        Expires September 7, 2015              [Page 14]

Internet-Draft              Abbreviated Title                 March 2015


10.8.  Partitions

   A Diameter network with multiple is said to be "partitioned" when
   only a subset of available servers can server a particular realm-
   routed request.  For example, one group of servers may handle users
   whose names start with "A" through "M", and another group may handle
   "N" through "Z".

   In such a partitioned network, nodes cannot load-balance requests
   across partitions, since not all servers can handle the request.  A
   client, or an intermediate agent, may still be able to load-balance
   between servers inside a partition.

10.9.  Active-Standby Nodes

   The previous scenarios assume that traffic can be load balanced among
   all peers that are eligible to handle a request.  That is, the peers
   operate in an "active-active" configuration.  In an "active-standby"
   configuration, traffic would be load-balanced among active peers.
   Requests would only be sent to peers in a "standby" state if the
   active peers became unavailable.  For example, requests might be
   diverted to a stand-by peer if one or more active peers becomes
   overloaded.

10.10.  Addition and removal of Nodes

   When a Diameter node is added, the new node will start by advertising
   its load.  Downstream nodes will need to factor the new load
   information into load balancing decisions.  The downstream nodes
   should attempt to ensure a smooth increase of the traffic to the new
   node, avoiding an immediate spike of traffic to the new node.  It
   should be determined if this use case is in the scope of the load
   control mechanism.

   When removing a node in a controlled way (e.g. for maintenance
   purpose, so outside a failure case), it might be appropriate to
   progressively reduce the traffic to this node by routing traffic to
   other nodes.  Simple load information (load percentage) would be not
   sufficient.  It should be determined if this use case is in the scope
   of the load control mechanism.

11.  Security Considerations

   Load information may be sensitive information in some cases.
   Depending on the mechanism. an unauthorized recipient might be able
   to infer the topology of a Diameter network from load information.
   Load information might be useful in identifying targets for Denial of
   Service (DoS) attacks, where a node known to be already heavily



Campbell, et al.        Expires September 7, 2015              [Page 15]

Internet-Draft              Abbreviated Title                 March 2015


   loaded might be a tempting target.  Load information might also be
   useful as feedback about the success of an ongoing DoS attack.

   Any load information conveyance mechanism will need to allow
   operators to avoid sending load information to nodes that are not
   authorized to receive it.  Since Diameter currently only offers
   authentication of nodes at the transport level, any solution that
   sends load information to non-peer nodes might require a transitive-
   trust model.

12.  IANA Considerations

   This document makes no requests of IANA.

13.  References

13.1.  Normative References

   [I-D.ietf-dime-ovli]
              Korhonen, J., Donovan, S., Campbell, B., and L. Morand,
              "Diameter Overload Indication Conveyance", draft-ietf-
              dime-ovli-03 (work in progress), July 2014.

   [RFC6733]  Fajardo, V., Arkko, J., Loughney, J., and G. Zorn,
              "Diameter Base Protocol", RFC 6733, October 2012.

   [RFC7068]  McMurry, E. and B. Campbell, "Diameter Overload Control
              Requirements", RFC 7068, November 2013.

13.2.  Informative References

   [I-D.korhonen-dime-ovl]
              Korhonen, J. and H. Tschofenig, "The Diameter Overload
              Control Application (DOCA)", draft-korhonen-dime-ovl-01
              (work in progress), February 2013.

   [I-D.roach-dime-overload-ctrl]
              Roach, A. and E. McMurry, "A Mechanism for Diameter
              Overload Control", draft-roach-dime-overload-ctrl-03 (work
              in progress), May 2013.

   [I-D.tschofenig-dime-dlba]
              Tschofenig, H., "The Diameter Load Balancing Application
              (DLBA)", draft-tschofenig-dime-dlba-00 (work in progress),
              July 2013.






Campbell, et al.        Expires September 7, 2015              [Page 16]

Internet-Draft              Abbreviated Title                 March 2015


   [RFC2782]  Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for
              specifying the location of services (DNS SRV)", RFC 2782,
              February 2000.

Authors' Addresses

   Ben Campbell
   Oracle
   7460 Warren Parkway # 300
   Frisco, Texas  75034
   USA

   Email: ben@nostrum.com


   Steve Donovan (editor)
   Oracle
   7460 Warren Parkway # 300
   Frisco, Texas  75034
   United States

   Email: srdonovan@usdonovans.com


   Jean-Jacques Trottin
   Alcatel-Lucent
   Route de Villejust
   91620 Nozay
   France

   Email: jean-jacques.trottin@alcatel-lucent.com




















Campbell, et al.        Expires September 7, 2015              [Page 17]