SIPCORE                                                        D. Worley
Internet-Draft                                                   Ariadne
Intended status: Standards Track                       February 17, 2017
Expires: August 21, 2017


            TBD: Happy Earballs: Success with Dual-Stack SIP
                   draft-worley-sip-happy-earballs-00

Abstract

   TBD: The Session Initiation Protocol (SIP) supports multiple
   transports running over both IPv4 and IPv6.  In more and
   more cases, a SIP user agent (UA) is connected to network interfaces
   with multiple address families.  In these cases sending a message
   from a dual stack client to a dual stack server may suffer from the
   issues described in [RFC6555] ("Happy Eyeballs"): the UA attempts to
   send the message using IPv6, but IPv6 connectivity is not working to
   the server.  This can cause significant delays in the process of
   sending the message to the server.  This negatively affects the
   user's experience.

   TBD: This document builds on [RFC6555] by modifying the procedures
   specified in [RFC3263] and related specifications to require that a
   client ensure that communication targets are accessible before
   sending messages to them, to allow a client to contact targets out of
   the order required by other specifications, and to require a client
   to properly distribute the message load among targets over time.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 21, 2017.


Worley                   Expires August 21, 2017                [Page 1]

Internet-Draft             TBD: Happy Earballs             February 2017


Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   5
   3.  Structure of This Document  . . . . . . . . . . . . . . . . .   7
     3.1.  Scope of Applicability  . . . . . . . . . . . . . . . . .   8
   4.  Baseline Procedures . . . . . . . . . . . . . . . . . . . . .   8
     4.1.  Target Ordering . . . . . . . . . . . . . . . . . . . . .  11
       4.1.1.  Prioritization Node . . . . . . . . . . . . . . . . .  11
       4.1.2.  Unordered Node  . . . . . . . . . . . . . . . . . . .  12
       4.1.3.  Load-Balancing Node . . . . . . . . . . . . . . . . .  15
   5.  Procedure Modifications . . . . . . . . . . . . . . . . . . .  15
     5.1.  Permitted to Reorder Targets  . . . . . . . . . . . . . .  15
     5.2.  Must Preserve Traffic Distribution  . . . . . . . . . . .  16
     5.3.  Address Family Preference . . . . . . . . . . . . . . . .  17
     5.4.  Address Selection . . . . . . . . . . . . . . . . . . . .  18
     5.5.  Vias  . . . . . . . . . . . . . . . . . . . . . . . . . .  18
     5.6.  DNS Caching . . . . . . . . . . . . . . . . . . . . . . .  19
     5.7.  Unused Flows  . . . . . . . . . . . . . . . . . . . . . .  19
     5.8.  Debugging and Troubleshooting . . . . . . . . . . . . . .  19
   6.  Consequences of the New Requirements  . . . . . . . . . . . .  19
   7.  Examples  . . . . . . . . . . . . . . . . . . . . . . . . . .  23
     7.1.  Two Unordered Targets, Both Cached  . . . . . . . . . . .  23
     7.2.  Two Unordered Targets, One Cached . . . . . . . . . . . .  24
     7.3.  Two Unordered Targets, Neither Cached . . . . . . . . . .  24
     7.4.  Two Prioritized Targets, Both Cached  . . . . . . . . . .  25
     7.5.  Two Prioritized Targets, the Second Cached  . . . . . . .  26
     7.6.  Two Prioritized Targets, the First Cached . . . . . . . .  26
     7.7.  Three Targets . . . . . . . . . . . . . . . . . . . . . .  26
   8.  Heuristics  . . . . . . . . . . . . . . . . . . . . . . . . .  27
     8.1.  A Simplified Method . . . . . . . . . . . . . . . . . . .  28
   9.  Security Considerations . . . . . . . . . . . . . . . . . . .  31
   10. IANA Considerations . . . . . . . . . . . . . . . . . . . . .  32
   11. History . . . . . . . . . . . . . . . . . . . . . . . . . . .  32
     11.1.  Changes from draft-worley-sip-he-connection-01 to draft-
            worley-sip-happy-earballs-00 . . . . . . . . . . . . . .  32
     11.2.  Changes from draft-worley-sip-he-connection-00 to draft-
            worley-sip-he-connection-01  . . . . . . . . . . . . . .  32
     11.3.  Changes from draft-johansson-sip-he-connection-01 to
            draft-worley-sip-he-connection-00  . . . . . . . . . . .  32
   12. References  . . . . . . . . . . . . . . . . . . . . . . . . .  33
     12.1.  Normative References . . . . . . . . . . . . . . . . . .  33
     12.2.  Informative References . . . . . . . . . . . . . . . . .  34
   Appendix A.  Implementing Load Balancing  . . . . . . . . . . . .  35
   Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . .  35
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  36

1.  Introduction

                            Earballs -- n., another word for ears.
                             Made famous by the animated American TV
                             spy comedy, "Archer".

                             "Ow, my earballs!" -- Cheryl Tunt, "Archer"

                             -- from "Urban Dictionary"

   The Session Initiation Protocol (SIP) [RFC3261] and the documents
   that extended it provide support for both IPv4 and IPv6.  However,
   this support has problems with environments that are characteristic
   of the transitional migratory phase from IPv4 to IPv6 networks.
   During this phase, many server and client implementations run on
   dual-stack hosts.  In such environments, a dual-stack host will
   likely suffer greater connection delay, and by extension an inferior
   user experience, than an IPv4-only host.  The difficulty stems from
   the reality that a device cannot predict whether apparent IPv6
   connectivity to another device is usable; both devices may have IPv6
   addresses and yet some transit network between the two may not
   transport IPv6.  SIP requires a device that transmits a request to
   one destination address (e.g., the apparently useful IPv6 address) to
   wait for a response for a substantial period (usually 32 seconds)
   before transmitting the request to another destination address (the
   less-preferred IPv4 address).  The result is that apparent IPv6
   connectivity that is not functional can cause substantial delays in
   processing SIP requests.  Especially when the requests are call
   setups (INVITE requests), this creates a very poor user experience.

   TBD: The need to remedy this diminished performance of dual-stack
   hosts led to the development of the "Happy Eyeballs" [RFC6555]
   algorithm, which has since been implemented in many protocols and
   applications.


   TBD: The concepts in this document are elaborated from those
   developed in [RFC6555], and so some background information in RFC
   6555 is not repeated here.  The reader is encouraged to read the
   available documentation regarding implementations of RFC 6555, as
   well as study Open Source implementations, in order to learn from the
   experience accumulated since the publishing of RFC 6555 in 2012.

   TBD: A SIP client uses DNS to find a server based on a SIP URI.  This
   process is described in [RFC3263] and updated in [RFC7984].  Using
   this process, a list of "targets" is constructed, where each target
   consists of an IP address, a port number, and a protocol (e.g., TCP,
   UDP, TLS) by which to contact that address/port.  The process
   proceeds by constructing a sequence of host names, possibly by
   looking up NAPTR and/or SRV DNS records, and then for each host name
   looking up DNS address records (for all address families supported by
   the client) to generate the list of IP addresses for targets that are
   derived from that host name.  The addresses for each host name are
   ordered using the client's destination selection rules [RFC6724].
   The sorted targets for all the host names are then concatenated into
   the sequence of targets to which the client will attempt to send the
   SIP message.

   TBD: Under the existing procedures, the client contacts the targets
   in order until one is contacted successfully.  To contact a target,
   the client establishes a transport connection (if necessary), sends
   the message using the transport (possibly resending the message
   several times), and then (for requests) waits for a response (either
   provisional or final).  The process ends successfully if the client
   receives a response.  The process ends unsuccessfully if the client
   receives a permanent error from the transport layer or if a SIP timer
   (Timer B or Timer F in [RFC3261]) expires.  Timeouts generally
   default to 32 seconds.

   TBD: If the user has to wait for even one timeout, this will
   seriously degrade the user experience.  Thus, it is desirable to
   minimize the number of times the client has timeouts when sending
   requests.

   TBD: If the target list contains both IPv6 addresses and IPv4
   addresses, this procedure can degrade the user's experience in common
   situations.  Typically, this problem arises when the client has an
   IPv6 interface, the server's preferred address is an IPv6 address,
   but the transit networks between the client and server do not carry
   IPv6.  This can cause the client to attempt to send a SIP request for
   32 seconds before it times out that target and continues with an IPv4
   target.  This problem parallels a problem that was widely seen in web
   browsers that was cured by specifying that web browsers should use a
   "Happy Eyeballs" algorithm [RFC6555] to determine the order in which
   to contact target addresses.

   TBD: This document specifies an amendment to these procedures, by
   which the subsequences of targets derived from individual host names
   may be contacted in a different order than is specified by the
   destination selection rules.  As in [RFC6555], the algorithm that the
   client uses is not specified by this document, but this document
   places requirements on the algorithm that improve the user's
   experience without unduly burdening the Internet infrastructure.  By
   analogy with the name "Happy Eyeballs" for similar algorithms in web
   browsers, we label these algorithms "Happy Earballs" [UD].

2.  Terminology

   The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   baseline:  the prior specifications of the behavior of a client for
      sending a message to a goal.  The baseline specifications are
      modified by this document.

   cache:  (verb) to store temporarily information regarding the
      response time of a target so as to accelerate future message
      transmission; (noun) the collection of information so stored

   client:  the device that must send a message

   flow:  a group of transmissions to a target which are considered
      related.  For connection-oriented protocols, a flow is the data
      carried by a connection.  For connectionless protocols, it is all
      messages sent to a particular target (5-tuple).  For protocols
      with security associations, it is all messages sent within a
      particular security association.

   goal:  the identification of a particular server.  May be a URI, a
      TSAP, or information provided by the context of the message.

   initial:  a target which has no target prioritized before it
      (considered relative to all targets in the target set, or some
      subset or rearrangement of the target set, depending on the
      context)

   Limit(t):  a function converting one time value into another.  If
      RTT(T1) > Limit(RTT(T0)), then target T1 responds "too slowly"
      relative to the response time of target T0, and T1 is considered
      non-responsive.  Depends on two parameters, "m" and "f".
      Limit(infinity) is considered to be infinity.  TBD: What is a
      better name for this?

   normal:  a target which is not slow (relative to a particular goal)

   NSAP:  "Network Service Access Point", the identification of a
      network interface, which comprises an address family and an
      address

   probe:  a transport operation that attempts to determine if a target
      is responsive, without transmitting a message.  Since a probe does
      not send a message, if the transmission fails, it does not commit
      the client to waiting a timeout period before sending the message
      to another target.

   quick:  a target whose cached RTT() is less than Limit(0), and thus
      is never slow for any goal

   RTT(T):  for a target T, the round-trip response time of T.  There is
      a special value of "infinity" if T does not respond at all.
      Collectively, these values are called "RTT() values".

   responsive:  a property of a target relative to the set of targets
      for a goal: the response time of the target is sufficiently short
      when compared with the response time of other targets.  See TBD
      Section 5 for the complete definition.

   send:  attempted transmission of (a possibly modified copy of) a
      message to a target.  Contrasted with "successful send", which is
      when the message is received by the server or when the client
      detects that the message is received by the server.  Does not
      include "probe" transport operations.

   server:  the (conceptual) device to which a message is to be sent.
      May consist of multiple physical devices.

   T1:  the value of that name used in the procedures of [RFC3261],
      which is commonly the round-trip time estimate of the relevant
      network, and defaults to 500ms

   slow:  a target T0 which appears to be slow due to cached RTT()
      information, i.e., there is another target T1 of the goal for
      which RTT(T0) > Limit(RTT(T1)) for the cached RTT() values.  This
      includes the case where RTT(T0) is infinity, i.e., T0 does not
      respond at all.  Opposite of "normal".

   target:  the complete specification of a transport to be used to send
      a message from the client to the server.  A target is commonly
      conceptualized as "protocol/address/port" (which is a TSAP), but
      the target also includes the TSAP that will be used as the source
      of the communication, and so "5-tuple" is more accurate.  In many
      cases, the source TSAP is determined by the destination TSAP, so
      it is not mentioned.  Perforce, the transport protocol and address
      family of the source and destination TSAPs are the same.

   timeout:  (noun) the period of time after sending a message which a
      client must wait before it is permitted to send the message to
      another target without receiving positive indication of the
      failure of the first transmission.  For sending requests, either
      Timer B or Timer F.  (verb) the event when the timeout period has
      expired.  [RFC3261].

   TSAP:  "Transport Service Access Point", the identification of an
      endpoint of a transport flow, which usually comprises a transport
      protocol, an NSAP (or network address), and a port number

   traffic:  the messages for a particular goal that are successfully
      sent to a particular target or set of targets; the number of such
      messages sent over a period of time; or the fraction of such
      messages relative to all messages for the goal

   Note: While this document uses the term "dual-stack" based on RFC
   6555 and earlier terminology, its scope includes contexts with more
   than two interfaces and with more than two address families.
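
   The definitions of "slow", "normal", and "quick" above can be
   illustrated with a short Python sketch.  It is illustrative only:
   this section does not give the formula for Limit(t), so the sketch
   takes it as a caller-supplied function.  The names classify(),
   is_quick(), and example_limit(), and the form and constants used in
   example_limit(), are assumptions made for the example and are not
   part of this specification.

```python
import math
from typing import Callable, Dict

INFINITY = math.inf  # RTT() value of a target that does not respond at all


def classify(cached_rtt: Dict[str, float],
             limit: Callable[[float], float]) -> Dict[str, str]:
    """Label each target of one goal as "slow" or "normal".

    A target T0 is slow if some other target T1 of the same goal has
    RTT(T0) > Limit(RTT(T1)); otherwise it is normal.
    """
    labels = {}
    for t0, rtt0 in cached_rtt.items():
        slow = any(rtt0 > limit(rtt1)
                   for t1, rtt1 in cached_rtt.items() if t1 != t0)
        labels[t0] = "slow" if slow else "normal"
    return labels


def is_quick(rtt: float, limit: Callable[[float], float]) -> bool:
    """A quick target has a cached RTT() below Limit(0)."""
    return rtt < limit(0.0)


def example_limit(t: float) -> float:
    """Hypothetical Limit(t); the form and constants are assumptions."""
    if math.isinf(t):
        return INFINITY  # Limit(infinity) is considered to be infinity
    return 0.1 + 2.0 * t
```

   With cached values of infinity for an IPv6 target and 0.05 seconds
   for an IPv4 target, the IPv6 target is classified as slow and the
   IPv4 target as normal under this hypothetical Limit().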

3.  Structure of This Document

   This document modifies the procedures with which a client sends a
   message to a server.  It assumes that the context of the message
   provides a "goal", which is the specification of the device or
   collection of devices which are the server, and that there are
   existing "baseline" specifications which translate the goal into a
   set of "targets", and in what order(s) the client may send (possibly
   modified copies of) the message to the targets, until one of the send
   operations is successful.

   This document relaxes the requirements on the client regarding the
   order(s) in which the message is sent to the targets, that is, it
   permits additional orders, so that the client is less likely to have
   to wait for a timeout.  On the whole, when network connectivity is
   imperfect, this allows clients to transmit the messages to servers
   more quickly than they would using the unmodified baseline
   specifications.

   However, this document also places additional restrictions on the
   client's sending behavior to ensure that the overall traffic
   distribution among the targets converges over time to the
   distribution that would have resulted from obeying the baseline
   specifications.

   Following that, this document discusses some consequences of the new
   requirements, including what new orders of targets are permitted,
   what behaviors minimize the time needed to successfully send a
   message, techniques for probing a target (that is, determining if it
   is responsive without sending the message, and thus possibly
   committing to waiting for a timeout period), and suitable approaches
   for caching information about targets.

   This document also requires certain behaviors that ensure that the
   use of IPv6 is not disadvantaged in mixed IPv4/IPv6 networks.  TBD:
   don't forget to write these requirements

   This document also contains a number of miscellaneous requirements to
   optimize the behavior of clients.

3.1.  Scope of Applicability

   This document modifies any SIP target selection processes that are
   defined now or may be defined in the future, excepting those that
   explicitly exempt themselves.

   This document does not affect communications specified to be carried
   only by a single WebSocket transport, as in those contexts there is
   only one transport target (the WebSocket connection), and hence there
   is no target selection process.

   A client MUST NOT consider the set of the target URIs of a "forking"
   operation to be a single goal to which the processes of this document
   apply.  Instead, the modifications MUST be applied to each of those
   URIs as separate goals.  This is because the decision of whether to
   send a request to a later forking target may be affected by the SIP
   response to an earlier transmission ([RFC3261] section 16).  However,
   a forking proxy may, as part of its policy, apply some or all of
   these procedures to the entirety of a forking operation.

4.  Baseline Procedures

   The situation that this document addresses is when a SIP device is
   required to send a message (which may be either a request or a
   response).  This document uses the term "client" for the device which
   must send the message.  The client is given a "goal", which is the
   specification of the "server", which is the (possibly composite)
   device to which the message is to be sent.  (Both of these usages are
   broader than the usage in [RFC3261].)  The purpose of the client is
   to successfully send the message to the server.

   (Note that in the case of a request, when the message is sent to a
   target, a Via header field will be added to the message, and that the
   added Via header field will be different for each target.  This
   document considers all of these versions of the message to be copies
   of the original message to be sent.)

   If the message is a request, the goal is usually the hostport part of
   the URI in either the first Route header field or the request-line.
   If the message is a response, the goal is specified by the first via-
   param in the first Via header field.  If the message is to be sent to
   an outbound proxy as specified by a DHCP option ([RFC3319] or
   [RFC3361]), then the goal is the ordered list of addresses or domain
   names provided by the DHCP option.  In other situations, the goal may
   be specified by other means.

   Baseline specifications (e.g., [RFC3263], [RFC3319], [RFC3361],
   [RFC6724], [RFC7984]) prescribe the construction of a set of
   "targets", which are potential transport destinations to which the
   message can be sent.  Which specifications apply is determined by the
   context of the message.  Targets are commonly conceptualized as
   protocol/address/port combinations, but in general they are the pairs
   of source and destination TSAPs that provide the full specification
   of a transport flow.

   For example, the sending of an initial REGISTER message can involve
   six steps of expanding the goal into a list of targets:

      Selecting one of the SIP domain names from the list provided by a
      DHCP option.

      NAPTR translation from a SIP domain name to a server domain name.
      [RFC3263]

      Selecting a transport protocol (e.g., UDP or TCP), if there is no
      transport parameter in the URI.

      SRV translation of a server domain name to server interface names.
      [RFC2782]

      A/AAAA translation of a server interface name to server addresses.

      Selecting a source address to use to communicate with a server
      address.  [RFC6724]


   The process of deriving a set of targets from a goal can be
   conceptualized as constructing a tree, with the root node being the
   goal and the leaf nodes being the targets (whether or not an
   implementation constructs such a representation).  Each non-leaf node
   is expanded into zero or more child nodes by the application of the
   appropriate baseline specification.

   For a particular node, the relevant baseline specification may
   prescribe relationships between the traffic volume sent to the
   subsets of targets that are descended from its children.  E.g., a
   standard may prescribe prioritization, such that if any target
   descended from a higher-priority child is responsive, no traffic
   should be sent to any target descended from a lower-priority child.
   (SRV records and DHCP options can specify prioritization.)
   Similarly, a standard may prescribe load balancing, such that if
   there are responsive targets descended from two children, the ratio
   of traffic to the two subsets of targets descended from the two
   children must be a particular (non-zero positive) number.  (SRV
   records can specify load balancing.)  Alternatively, a node may place
   no restrictions on the traffic to the subsets of targets descended
   from its children.

   As always, the construction of the tree and the traffic restrictions
   incorporated into it may be modified by the local policy.  In this
   document, we assume that all modifications are made to the tree that
   summarizes the requirements of the baseline specifications.  This
   makes it easier to determine the interaction of local policy with
   the procedure modifications of this document.  This assumption
   does not limit the generality of what local policy may do, since the
   local policy can remove any ordering restrictions from the tree, thus
   permitting almost any behavior by the modified procedures.

   The targets as generated by the specified processes MAY be subsetted
   by deleting any targets that the client cannot access for reasons
   such as the client does not implement the protocol, or it does not
   have a network interface that supports the protocol, or it does not
   have a network interface that can communicate with the address.
   Removing these targets at an early stage of processing does not
   affect the on-the-wire behavior of either the baseline processes or
   the modified processes, since sends to such targets fail immediately.
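
   The subsetting step described above can be sketched as a simple
   filter.  This is a minimal illustration, not a required
   implementation: the Target type here records only the destination
   TSAP (the source TSAP is omitted for brevity), and the names Target
   and accessible() are hypothetical.

```python
import ipaddress
from typing import List, NamedTuple, Set


class Target(NamedTuple):
    """A destination TSAP: transport protocol, address, and port."""
    protocol: str   # e.g. "udp", "tcp", "tls"
    address: str    # IPv4 or IPv6 literal
    port: int


def accessible(targets: List[Target],
               protocols: Set[str],
               families: Set[int]) -> List[Target]:
    """Drop targets the client cannot access, preserving their order.

    `protocols` is the set of transports the client implements and
    `families` the IP versions (4 and/or 6) it has an interface for.
    """
    kept = []
    for t in targets:
        version = ipaddress.ip_address(t.address).version
        if t.protocol in protocols and version in families:
            kept.append(t)
    return kept
```

   Because the filter preserves the relative order of the remaining
   targets, it can be applied before or after the target order is
   generated.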

   What constitutes failure of a send depends on the situation, and may
   be a transport protocol failure, the absence of a timely 100 Trying
   response, or a 503 response ([RFC3261] section 21.5.4 and [RFC3263]
   section 4.3).  For any particular message, either the overall sending
   process fails or the message is successfully sent to exactly one
   target.


   In the worst-case situation, the process may require waiting for one
   or more transaction timeouts (e.g., Timer B or Timer F in [RFC3261])
   before successfully transmitting the message to a target.  As the
   timeouts are typically 32 seconds, such a wait severely impacts the
   user experience.
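
   The 32-second figure follows from the [RFC3261] defaults: Timer B and
   Timer F are each 64*T1, and T1 defaults to 500 ms.  The arithmetic
   can be sketched as follows; the function name worst_case_delay() is
   hypothetical, and the sketch ignores connection setup time and any
   transport-layer failures that are reported before the timer expires.

```python
T1_SECONDS = 0.5                    # default T1 from RFC 3261
TIMER_B_SECONDS = 64 * T1_SECONDS   # INVITE transaction timeout
TIMER_F_SECONDS = 64 * T1_SECONDS   # non-INVITE transaction timeout


def worst_case_delay(unresponsive_before_success: int,
                     timeout: float = TIMER_B_SECONDS) -> float:
    """Seconds spent waiting on timeouts before the first responsive target."""
    return unresponsive_before_success * timeout
```

   For example, two unresponsive targets ahead of the first responsive
   one cost 64 seconds of waiting under these defaults.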

4.1.  Target Ordering

   The baseline specifications assume that the client will effectively
   generate an order in which to contact the targets and then
   sequence through the list, sending the message to each target
   until one of the sends is deemed to be successful.  (Each send may
   include retransmissions of the message.)  This is because at any
   stage, the client's next action is determined only by the goal and
   whether sending to previous targets has failed -- the first target in
   the order is the target that the client will choose first (which
   depends only on the goal), the second target in the order is the
   target that the client will choose if and when the send to the first
   target fails (which depends only on the goal and the identity of the
   first target), etc.

   (In mathematical terms, the target order is a total ordering of the
   targets that is compatible with the partial ordering of the targets
   specified by the traffic restrictions.)
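
   The sequential behavior assumed by the baseline specifications can be
   sketched as a simple loop.  This is an illustration under stated
   assumptions: try_send is a hypothetical callable standing in for one
   complete send to a target (including retransmissions and any timeout
   wait), and the name send_in_order() is not defined by any
   specification.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def send_in_order(targets: Iterable[T],
                  try_send: Callable[[T], bool]) -> Optional[T]:
    """Send to each target in turn; return the first that succeeds.

    `try_send` is a hypothetical callable that performs one complete
    send and returns True on success.  Returns None if the send fails
    for every target.
    """
    for target in targets:
        if try_send(target):
            return target
    return None
```

   The next target is tried only after the send to the previous target
   has failed, which is why each unresponsive target ahead of a
   responsive one can add a full timeout to the total delay.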

4.1.1.  Prioritization Node

   The order will be compatible with the traffic restrictions imposed by
   the specifications on the targets.  For example, if the children of a
   node are prioritized, all of the targets descended from a higher-
   priority child must precede all of the targets descended from a
   lower-priority child.  Suppose the tree of targets has one interior
   node that specifies prioritization of three targets.  This can result
   from these DNS records:

       _sip._udp.example.com.    SRV    1 1 5060 sip1.example.com.
       _sip._udp.example.com.    SRV    2 1 5060 sip2.example.com.
       _sip._udp.example.com.    SRV    3 1 5060 sip3.example.com.
       sip1.example.com.         A      192.0.2.1
       sip2.example.com.         A      192.0.2.2
       sip3.example.com.         A      192.0.2.3

   We show the tree with the targets from left to right from highest
   priority to lowest priority:


            |
       -priority--
       |    |    |
       A    B    C

   We can then represent the traffic restrictions in a graph, in which a
   restriction that requires sending to target A before sending to
   target B is shown by a line joining A on the left to B on the right.
   We add fictitious "start" and "finish" nodes, represented by "*":

       *----A----B----C----*

   There is only one allowed target order:

       A B C
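
   As an illustration, the single allowed order for the DNS records
   above can be derived by sorting the SRV records by priority value
   (lowest value first, per [RFC2782]) and expanding each host name to
   its addresses.  This is a minimal sketch: the function name
   prioritized_targets() is hypothetical, the DNS data is supplied as
   plain Python values rather than being looked up, and SRV weights are
   ignored because each priority value here has only one record.

```python
from typing import Dict, List, Tuple

# (priority, weight, port, host name), as in the SRV records above
SrvRecord = Tuple[int, int, int, str]


def prioritized_targets(srv: List[SrvRecord],
                        addresses: Dict[str, List[str]]
                        ) -> List[Tuple[str, int]]:
    """Return (address, port) pairs, lowest SRV priority value first."""
    ordered = []
    for _priority, _weight, port, host in sorted(srv, key=lambda r: r[0]):
        for address in addresses.get(host, []):
            ordered.append((address, port))
    return ordered
```

   Applied to the three SRV records and three A records above, this
   yields 192.0.2.1, 192.0.2.2, and 192.0.2.3 (each on port 5060), in
   that order, corresponding to targets A, B, and C.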

4.1.2.  Unordered Node

   If the children of a node have no traffic restrictions, there is no
   collective relationship between the targets descended from its
   children, and the targets descended from different children may
   appear in any order, and can even be interleaved.  We show the tree
   of a simple example:

       sip.example.com.         A      192.0.2.1
       sip.example.com.         A      192.0.2.2
       sip.example.com.         A      192.0.2.3

            |
       -unordered-
       |    |    |
       A    B    C

   We then represent the lack of traffic restrictions by a graph which
   has no lines between targets:

         A
        / \
       *-B-*
        \ /
         C

   There are six allowed target orders:


       A B C

       A C B

       B A C

       B C A

       C A B

       C B A

   A prioritized pair of hosts each with an unordered pair of targets
   results in this tree:

                 |
           ---priority---
           |            |
       unordered    unordered
       |       |    |       |
       A       B    C       D

   with this graph, which we simplify by adding a fictitious target
   which must follow both A and B and must precede both C and D:

         A   C
        / \ / \
       *   *   *
        \ / \ /
         B   D

   There are four allowed target orders:

       A B C D

       A B D C

       B A C D

       B A D C

   An unordered pair of hosts each with a prioritized pair of targets
   results in this tree:


                |
          --unordered--
          |           |
       priority    priority
       |      |    |      |
       A      B    C      D

   with this graph:

         A----B
        /      \
       *        *
        \      /
         C----D

   There are six allowed target orders:

       A B C D

       A C B D

       C A B D

       A C D B

       C A D B

       C D A B

   When the client is allowed to select any of the available source
   addresses for a send, each source address (combined with the
   destination address) generates a separate target.  Of the targets,
   the one selected by the default source address selection rules is
   preferred, and the remainder are unordered.  This results in a tree
   with this form:

            |
       --priority--
       |          |
       |      -unordered-
       |      |    |    |
       A      B    C    D

   with this graph:


                   B
                  / \
       *----A----*-C-*
                  \ /
                   D

   There are six allowed target orders:

       A B C D

       A B D C

       A C B D

       A C D B

       A D B C

       A D C B
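
   The allowed orders in the examples above can be reproduced
   mechanically.  The following Python sketch is illustrative only and
   is not part of this specification.  It assumes a hypothetical tree
   encoding in which a leaf is a target name and an interior node is a
   (kind, children) pair whose kind is "priority" or "unordered"; the
   function names are likewise invented for this example.  It checks
   every permutation, so it is suitable only for small target sets.

```python
from itertools import permutations

def targets(node):
    """Return the targets (leaves) under a node, left to right."""
    if isinstance(node, str):
        return [node]
    _kind, children = node
    return [t for child in children for t in targets(child)]

def precedence(node):
    """Return the (earlier, later) pairs imposed by priority nodes."""
    if isinstance(node, str):
        return set()
    kind, children = node
    pairs = set()
    for child in children:
        pairs |= precedence(child)
    if kind == "priority":
        # Every target of an earlier child precedes every target of a
        # later child.  Unordered nodes add no pairs of their own.
        for i, earlier in enumerate(children):
            for later in children[i + 1:]:
                pairs |= {(a, b) for a in targets(earlier)
                          for b in targets(later)}
    return pairs

def allowed_orders(tree):
    """List every target order that satisfies all precedence pairs."""
    pairs = precedence(tree)
    orders = []
    for order in permutations(targets(tree)):
        position = {t: i for i, t in enumerate(order)}
        if all(position[a] < position[b] for a, b in pairs):
            orders.append(" ".join(order))
    return orders
```

   Applied to the four example trees of this section, in the order
   they are presented, this sketch yields 6, 4, 6, and 6 orders.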

4.1.3.  Load-Balancing Node

   If the children of a node are load-balanced, the subsets of targets
   descended from the children must be ordered in a suitable way for
   each instance of sending a message, so that some messages are sent to
   each target.  The practical difficulty is to ensure that the right
   proportion of traffic is sent to the descendants of each child node
   without having to maintain long-term records of the amount of traffic
   that has been sent to each child's descendants.

   A simple way to do this is to generate a new randomized ordering of
   the children for each new message to be processed.  A randomization
   algorithm that achieves the correct traffic distribution is described
   in Appendix A.  For each instance, once the children of the node are
   ordered, they are handled as described above for the children of a
   prioritized node.
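
   As an illustration of generating a fresh randomized ordering for
   each message, the following Python sketch repeatedly draws one child
   with probability proportional to its weight among the children not
   yet chosen, in the style of the SRV weight selection of RFC 2782.
   It is a sketch under that assumption and is not necessarily the
   algorithm of Appendix A; the (name, weight) representation and the
   function name are hypothetical.

```python
import random

def weighted_order(children, rng=random):
    """Order (name, weight) children by repeated weighted selection."""
    remaining = list(children)
    order = []
    while remaining:
        total = sum(weight for _name, weight in remaining)
        if total == 0:
            # Only zero-weight children remain: order them arbitrarily.
            rng.shuffle(remaining)
            order.extend(name for name, _weight in remaining)
            break
        pick = rng.random() * total
        running = 0
        for index, (name, weight) in enumerate(remaining):
            running += weight
            if pick < running:
                order.append(name)
                del remaining[index]
                break
    return order
```

   With weights 3 and 1, the first child is placed first in roughly
   three quarters of the orderings, without the client keeping any
   record of past traffic.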

5.  Procedure Modifications

   The following modifications are specified for all baseline
   specifications:

5.1.  Permitted to Reorder Targets

   A client MAY send the message to the targets in an order that is not
   permitted by the baseline specifications.


5.2.  Must Preserve Traffic Distribution

   To state the next requirement, we must define what it means to say
   that a target is "responsive".  Intuitively, a target is responsive
   if its response time to a message is not "too much" longer than the
   response time of any other target for the goal.  The responsiveness
   of a target is always defined relative to the set of targets for a
   particular goal; hence, a target may be responsive for one goal at
   the same time that it is not responsive for another goal.

   We define RTT(T) to mean the round-trip response time of a target T:
   the time it takes to receive confirmation of the receipt of a message
   sent to the target.  If the target does not respond at all, we
   consider RTT(T) to be "infinity", which is larger than any number.
   Note that RTT(T) is a fact about reality at some instant, not a
   measure of the client's current knowledge about reality at some
   instant.

   We define a function "Limit" that converts one time value into
   another: if RTT(T1) > Limit(RTT(T0)), then target T1 responds "too
   slowly" relative to the response time of target T0, and T1 is
   considered non-responsive.  We define Limit(t) = m*t + f, where "m"
   and "f" are parameters defined below.

   The parameter "m" limits the range of response times that we will
   allow among targets we consider responsive.  We set m to be 2.  (TBD:
   Is this a good choice?)

   The parameter "f" is the length of time that we consider to be
   insignificant when comparing the response time of targets.  We set f
   to be 2*T1, where T1 here denotes the SIP timer of that name used in
   the procedures of [RFC3261] (not a target), which is commonly the
   round-trip time estimate of the relevant network, and defaults to
   500ms.  (TBD: Is this a good choice?)

   The result is that Limit(t) is "twice t, plus a little more to
   account for the inherent delays in the network".

   A target T0 is defined to be "responsive" if

      its response time, RTT(T0), is less than the relevant timeout
      (often Timer B or Timer F), and

      for any other target T1 of the goal, RTT(T0) < Limit(RTT(T1)).
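
   The definitions above can be written out as a short Python sketch.
   It is illustrative only: the names are hypothetical, times are in
   seconds, the SIP timer T1 is taken at its 500ms default, and the
   relevant timeout is assumed to be 64*T1 (the default for Timer B and
   Timer F in [RFC3261]).  Note that the input is the true RTT() of
   each target, which a client cannot observe directly.

```python
import math

M = 2             # parameter "m" from the text
SIP_T1 = 0.5      # SIP timer T1 default, in seconds
F = 2 * SIP_T1    # parameter "f" from the text

def limit(t):
    """Limit(t) = m*t + f."""
    return M * t + F

def is_responsive(target, rtt, timeout=64 * SIP_T1):
    """Apply the two conditions for a target to be responsive.

    rtt maps every target of the goal to its RTT() in seconds, with
    math.inf for a target that does not respond at all.
    """
    if not rtt[target] < timeout:
        return False
    return all(rtt[target] < limit(other_rtt)
               for other, other_rtt in rtt.items() if other != target)
```

   For example, with RTT() values of 0.1 and 5.0 seconds for two
   targets of a goal, the first is responsive and the second is not,
   because 5.0 is not less than Limit(0.1).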

   TBD: The following wording must be set so that the client can move
   non-responsive targets to any place in the order (or at least, any
   later place in the order) without violating this condition.  It is
   not clear this has been accomplished yet.  Later: I think we've
   accomplished this now.

   We are now ready to specify the major new constraint on the client's
   behavior: The client's procedures MUST, over time, distribute the
   traffic for any particular goal among the responsive targets in the
   same proportions as are required by the baseline specifications.
   Specifically,

      If the set of targets for a particular goal does not change over a
      period of time,

      for any two subsets of targets, both of which contain at least one
      target which is responsive for the entire period of time, and

      if the baseline specifications prescribe a proportion between the
      traffic to the two subsets, then

      the proportion between the traffic to the responsive subsets of
      the two subsets MUST converge to the proportion specified by the
      baseline specifications.

5.3.  Address Family Preference

   Unless overridden by user configuration or by network configuration:
   If the host has a policy of preferring one address family, the client
   MUST prefer it.  If the host's policy is unknown or not obtainable,
   the client MUST prefer IPv6 over IPv4.  This usually means that the
   client must give preference to IPv6 over IPv4.

   This preference MUST have the following effect: Consider the
   "initial" targets, which are the targets which the baseline
   specifications do not prioritize after any other targets.  The client
   must additionally prioritize the initial targets which are of the
   preferred address family before the other initial targets.
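
   A minimal Python sketch of this effect follows.  It assumes that the
   initial targets are available as a list of IP address literals; the
   function name and that representation are hypothetical.  The first
   group returned is prioritized before the second, and the baseline
   relationships within each group are left unchanged.

```python
import ipaddress

def split_initial_by_family(initial_targets, preferred_version=6):
    """Split initial targets into (preferred, other) address families."""
    preferred = []
    other = []
    for address in initial_targets:
        if ipaddress.ip_address(address).version == preferred_version:
            preferred.append(address)
        else:
            other.append(address)
    return preferred, other
```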

   TBD: Is this sufficient?  We don't require address family preference
   to affect non-initial targets.  Alternatively, if the server has a
   lot of IPv6 addresses, none of which are responsive, the only way to
   quickly send to an IPv4 address is to send probes to all of the
   initial IPv6 addresses and one (formerly initial) IPv4 address.
   This is recommended in the probing heuristics, but might require a
   lot of probes.


5.4.  Address Selection

   Clients SHOULD provide a mechanism by which the address selection
   configuration [RFC6724] can be customized for the client
   independently of any other application.

   Clients SHOULD implement the destination address selection mechanism
   specified in [RFC6724].  Note that this mechanism provides a priority
   order among the set of A/AAAA records for a single server host name,
   whereas [RFC3263] assumes that such sets of A/AAAA records are
   unordered.

   Clients SHOULD implement rule 5.5 of section 5 of [RFC6724],
   preferring to use a source address with a prefix assigned by the
   selected next-hop.  This requires that the IPv6 stack remembers which
   next-hops advertised which prefixes.

   Clients SHOULD by default use the source address selection mechanism
   specified in [RFC6724], which chooses one source TSAP for any
   particular destination TSAP.

   Clients SHOULD also be configurable to use an alternative mechanism,
   in which for any destination TSAP, targets are generated for each
   source TSAP that could possibly communicate with the destination
   TSAP, with the source TSAP selected by [RFC6724] prioritized over the
   other source TSAPs and the other source TSAPs being unordered among
   themselves.

   The alternative policy is useful in situations where the source
   address selection table prioritizes an interface which does not
   forward SIP traffic to the destination address.  (For example,
   when the source address selection table routes almost all
   destinations to an organizational VPN which has restricted
   connectivity.)

5.5.  Vias

   A client MUST provide Vias in requests that properly route from the
   server to the client, regardless of the presence of NATs in the
   transport path.  This is necessary even when the request is sent
   via a connection-oriented transport, because the connection may be
   terminated before the response is sent back to the client and the
   server may need to reestablish a connection.  In general, the client
   SHOULD provide the "rport" parameter on the via-param [RFC3581].

   Additionally, to assist tracing and diagnosis, a client SHOULD
   provide the source TSAP that it used in the via-param.  TBD: Is this
   too strict?  Is it useful?


5.6.  DNS Caching

   The information a client uses to determine a target set must be up-
   to-date.  In particular, DNS information MUST NOT be retained longer
   than the TTL as it was last retrieved from DNS, and information
   computed from DNS information MUST NOT be retained longer than the
   TTL of any DNS information used to compute it.  TBD: Should we allow
   a client to cache target set computations somewhat longer than the
   TTL to minimize disruption and DNS traffic?  Phone calls typically
   take 3 minutes, so we could allow 5 minutes grace and thus ensure
   that target sets rarely have to be recomputed during a call.
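
   The retention rule (without any grace period) amounts to taking the
   earliest expiry among the DNS records that contributed to a target
   set.  The Python sketch below illustrates this under the assumption
   that the client keeps, for each contributing record, the time it was
   retrieved and its TTL, both in seconds; the names are hypothetical.

```python
def target_set_expiry(records):
    """Return when information computed from the records must be dropped.

    records is an iterable of (retrieved_at, ttl) pairs, one for each
    DNS record used to compute the target set.
    """
    return min(retrieved_at + ttl for retrieved_at, ttl in records)

def is_fresh(records, now):
    """Conservatively treat the target set as stale once the expiry
    time has been reached."""
    return now < target_set_expiry(records)
```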

5.7.  Unused Flows

   Flows that are created as probes but not subsequently used (either to
   send the message or to maintain a SIP Outbound flow) SHOULD be
   terminated, even though they could -- in some cases -- be put to
   reasonable use.  This includes flows that use connection-oriented
   protocols as well as non-connection-oriented flows with security
   associations.  Minimizing the number of unused connections reduces
   the load on the server and on stateful middleboxes.  Also, if the
   abandoned connection is IPv4, this reduces IPv4 address sharing
   contention.

5.8.  Debugging and Troubleshooting

   Happy Earballs is aimed at ensuring a reliable user experience
   regardless of connectivity problems affecting any single transport.
   However, this naturally means that applications employing these
   techniques are by default less useful for diagnosing issues with a
   particular address family.  To assist in that regard, an
   implementation MAY provide a mechanism to disable its Happy
   Earballs behavior via a user setting, and to provide data useful for
   debugging (e.g., a log or way to review current preferences).

6.  Consequences of the New Requirements

   In this section we explore some of the consequences of these
   requirements and describe possible approaches for designing clients
   that satisfy the modified requirements and provide shorter
   transmission latency.

   A client may send the message to the targets in an order that is not
   permitted by the baseline specifications, but it may not omit any
   targets from its ordering.  Thus, the client is required to send it
   to all the targets before it may declare failure of the send process.
   TBD: Would it be better to 408 the message faster?  For example, if
   the client has cached information which indicates that a target is
   unreachable, the client may move that target to the end of the order,
   but if sending to all other targets is unsuccessful, the client must
   send to that target before declaring failure.

   A client may cache measured RTT() values for targets and use this
   information to optimize target orderings.  Because a single target
   may appear in the target set for multiple goals, the client should
   cache RTT(T) for targets (rather than judgments of responsiveness),
   and then when sending to a goal use that value to determine whether
   the target is responsive relative to that particular goal.

   A client may determine a target T1 to be "slow" (relative to a given
   goal) if its cached RTT(T1) is greater than Limit(RTT(T2)) for the
   cached RTT(T2) for some other target T2 in the goal's target set.  A
   target that is not slow is "normal".  Note that a target being slow
   is determined by the client via a combination of the information in
   the cache and the state of the network at the moments that the cached
   information was recorded.  In effect, the client thinks a slow
   target is non-responsive, but the target may or may not actually be
   non-responsive at that moment, depending on whether the cached RTT()
   values agree with current reality.

   If there is an upper bound on the length of time that a client
   retains cached RTT() values, then the client may assume that any slow
   target is non-responsive, in that it may place the target after the
   normal targets in the order.  For this reason, we assume in this
   document that the client puts an upper bound on the length of time
   that RTT() values remain in the cache, after which they are either
   deleted or replaced by values based on more recent observations of
   the target's behavior.

   The upper bound on cache entry lifetime should be on the order
   of 10 minutes.  (This parallels [RFC6555].)
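
   One possible shape for such a cache is sketched below in Python.  It
   is an illustration, not a required design: the class and method
   names are hypothetical, a target is assumed to be keyed by a
   (source address, destination address) pair, and the caller supplies
   the current time in seconds so that the behavior is easy to test.

```python
class RttCache:
    """Cache of measured RTT() values with a bounded lifetime."""

    def __init__(self, max_age=600.0):
        self.max_age = max_age      # about 10 minutes, in seconds
        self._entries = {}          # target -> (rtt, measured_at)

    def record(self, target, rtt, now):
        """Store a measurement, replacing any older value."""
        self._entries[target] = (rtt, now)

    def lookup(self, target, now):
        """Return the cached RTT, or None if absent or too old."""
        entry = self._entries.get(target)
        if entry is None:
            return None
        rtt, measured_at = entry
        if now - measured_at > self.max_age:
            del self._entries[target]
            return None
        return rtt

    def forget_source(self, source_address):
        """Drop all entries whose target uses the given source address."""
        stale = [t for t in self._entries if t[0] == source_address]
        for target in stale:
            del self._entries[target]
```

   Storing RTT() values rather than slow/normal judgments lets the
   same entry be evaluated against the target set of each goal.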

   The client may act on its cached RTT() values in this way because it
   will not violate the traffic distribution requirement: (1) If a
   target is responsive over a long period, the eventual delete/refresh
   of cached values from before that period ensures that the client will
   eventually see the target as normal.  (2) If a target is not
   continuously responsive over a long period, the traffic distribution
   requirement places no restriction on whether the client sends traffic
   to it or not, and repositioning it after all normal targets does not
   affect the traffic distribution among the normal targets.

   Different RTT() values may be cached for different lengths of time.
   When the state of a source address changes, or
   the state of the interface it is assigned to changes, or when the
   network it is connected to is re-initialized, cached RTT() values for
   targets with that source address should be deleted.  Interfaces can
   determine network re-initialization by a variety of mechanisms (e.g.,
   [RFC4436] and [RFC6059]).

   When a client processes a message, the ordering of targets that it
   sends to must be an ordering permitted by the baseline
   specifications, with the exception that slow targets may be moved
   after all normal targets.  Note that any randomization of target
   groups to implement load balancing will be reflected among the normal
   targets in the client's ordering.

   Since the client's goal is to deliver the message as quickly as
   possible, a client should always move slow targets to the end of the
   order, after the normal targets.  Note that if a probe transmission
   during message processing discovers a target to be slow, the target
   can be moved at that time to after all normal targets.
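
   The reordering step can be sketched in Python as follows.  The
   sketch assumes that the client already holds one ordering permitted
   by the baseline specifications and a mapping of cached RTT() values
   in seconds; the names are hypothetical, and Limit() uses m = 2 and
   f = 1 second (2*T1 at the 500ms default).  A target without a cached
   value is never classed as slow.

```python
def limit(t, m=2, f=1.0):
    """Limit(t) = m*t + f, in seconds."""
    return m * t + f

def move_slow_to_end(order, cached_rtt):
    """Return order with slow targets moved after all normal targets.

    The relative order of the normal targets, and of the slow targets,
    is preserved.
    """
    known = [cached_rtt[t] for t in order if t in cached_rtt]
    if not known:
        return list(order)
    # A target is slow if its cached RTT exceeds Limit() of some other
    # target's cached RTT; comparing against the smallest is enough.
    threshold = limit(min(known))
    slow = [t for t in order
            if t in cached_rtt and cached_rtt[t] > threshold]
    normal = [t for t in order if t not in slow]
    return normal + slow
```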

   A client obtains RTT(T) for a target T whenever it sends to the
   target.  But it can also obtain RTT(T) by a probe, which is any
   transmission to T which requires a response but does not involve
   sending the message (and hence does not commit the client to possibly
   waiting for a timeout before sending to another target).  A client
   may send probes to several targets simultaneously.

   Probe operations include:

   o  establishing a transport connection to the target (without sending
      the message as data on the connection)

   o  sending a keep-alive, as specified by the transport protocol

   o  sending a CR-LF-CR-LF keep-alive on a SIP Outbound flow [RFC5626]

   o  sending a STUN keep-alive message on a SIP Outbound flow [RFC5626]

   o  sending an OPTIONS request with "Max-Forwards: 0".

   (Note that a probe using an OPTIONS request can be used with any
   protocol.  If the OPTIONS reaches the target, the target is required
   to respond with either a 200 or 483 response [RFC3261] without
   forwarding it to another entity.  Conveniently, a server can respond
   to such a request statelessly, so such requests are low-overhead.
   (Although the SIP Outbound keep-alive methods have even lower
   overhead.))
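
   As an illustration of the last probe type, the following Python
   sketch assembles the text of an OPTIONS request with
   "Max-Forwards: 0".  It is a sketch only: the function name and the
   "probe" user part are hypothetical, the host arguments are assumed
   to be host names or IPv4 literals (an IPv6 literal would need
   brackets), and actually sending the request and matching the
   response are left out.

```python
import secrets

def build_options_probe(target_host, local_host, local_port,
                        transport="UDP"):
    """Return the text of an OPTIONS request usable as a probe."""
    branch = "z9hG4bK" + secrets.token_hex(8)   # RFC 3261 magic cookie
    tag = secrets.token_hex(4)
    call_id = secrets.token_hex(8) + "@" + local_host
    lines = [
        f"OPTIONS sip:{target_host} SIP/2.0",
        # "rport" asks the server to reply to the source it observed.
        f"Via: SIP/2.0/{transport} {local_host}:{local_port}"
        f";branch={branch};rport",
        "Max-Forwards: 0",
        f"From: <sip:probe@{local_host}>;tag={tag}",
        f"To: <sip:{target_host}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"
```

   The elapsed time between sending this request and receiving any
   response to it gives a value for RTT(T).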

   Similarly, if a client has a connection to a target T, and the
   connection has been idle for long enough, the client will not have a
   cached RTT(T) for T, reflecting the fact that the connection may have
   failed without the client's knowledge.  The client can refresh the
   cached RTT(T) by performing a probe operation within the connection.

   A flow or connection that is established to a target should be
   preferred over establishing a new flow or connection to that target
   for sending either a probe or a message.  TBD: However, we want to
   broaden this to cover all flows that are to the same actual host, but
   how do we define that condition?  Conversely, this mustn't override
   prioritization.

   If the client initiates a probe of a target T, it may be able to
   decide that T is slow without waiting to determine the actual value
   of RTT(T), which may take as long as the timeout period, because the
   true value of RTT(T) is always at least as large as the elapsed time
   since the probe was sent, and the determination of slowness depends
   on whether RTT(T) exceeds Limit(RTT(Tfastest)) (where Tfastest is the
   target with smallest RTT() value of any target in the target set).
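
   This early decision is a single comparison, sketched here in Python.
   The function name is hypothetical, times are in seconds, and Limit()
   is assumed to use m = 2 and f = 1 second.

```python
def probe_shows_slow(sent_at, now, fastest_rtt, m=2, f=1.0):
    """Return True once an unanswered probe proves the target slow.

    The time elapsed since the probe was sent is a lower bound on
    RTT(T), so the target is slow as soon as that time exceeds
    Limit() of the smallest RTT in the target set.
    """
    elapsed = now - sent_at
    return elapsed > m * fastest_rtt + f
```

   With a fastest RTT of 0.1 seconds, for instance, a probe that has
   gone unanswered for 1.3 seconds already marks its target as slow,
   long before the timeout period expires.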

   If the client establishes a connection to a target without
   simultaneously sending the message, the connection establishment is a
   probe of the target, and after initiating the connection but before
   sending the message, if the probe reveals that the target is slow,
   the client may move that target later in the ordering, and turn its
   attention to another target.

   In order to minimize the chance that the client must wait for a
   timeout before sending to another target, the client may send probes
   to targets, and the RTT() values revealed by those probes can change
   what target the client will send to next.  Because of this, the
   client's procedures do not simply convert the tree of targets into an
   ordering of the targets, which the client then follows; information
   discovered during the sequence of sends can affect the order in which
   targets are sent to.

   If the client maintains maximum flexibility, instead of representing
   the target tree and its traffic restrictions as a single order, it
   represents them as the graph described above.  At any time, the
   client can send probes to any or all of the targets (presumably, ones
   for which it does not have cached RTT() values).  If there is no
   outstanding sent message (no message has been sent which has not
   timed out), the client can choose one target to send to, a target for
   which all targets connected to it to the left in the graph have
   already been sent to.  The design space of Happy Earballs solutions
   consists of choosing which target should next be sent to and choosing
   when to send probes.


   There is little value in sending a probe to a target unless all
   targets of higher priority (1) have been sent the message (and
   failed), (2) have been sent a probe, or (3) have a cached RTT()
   value.

   A target T0 is "quick" if its cached RTT(T0) is less than Limit(0),
   that is, if T0 will be normal regardless of the RTT() values of any
   other target.  If an allowed next target has a cached RTT() value,
   and it is "quick", then it is never slow for any goal.

   If the client has reason to believe that it will soon be asked to
   send a message to a goal with a target T, and if the cached RTT(T) is
   likely to expire before then, it may decide to refresh the cached
   value by probing T.

   Similarly, a client may be in a situation where it has advance notice
   that it is likely to need to send a message to a particular target,
   for instance, if the user of a UA begins dialing an outgoing call
   which will be routed through a particular outgoing proxy.  In such a
   situation, the client should consider preemptively probing the
   target.

   Note that the use of probes increases the non-message traffic to the
   targets, and thus has a cost.  A client minimizes the expected
   transmission time by initially probing all of the targets, but that
   strategy maximizes the additional traffic.  A client should weigh the
   tradeoff between improved user experience and increased traffic.  In
   particular, the client should be aware of which messages require
   rapid service for good user experience (e.g., INVITE and BYE) and
   which do not (e.g., REGISTER and re-SUBSCRIBE).

   A client should avoid sending to a target which does not have a
   cached RTT() value (unless it is the last remaining target), because
   the target might be non-responsive, forcing the client to wait for a
   timeout.  Instead, the client should probe the target first.

7.  Examples

   In this section, we show some ways that clients can handle situations
   involving various combinations of targets with particular properties
   in order to provide a good user experience.  In this section, we will
   annotate graphs by adding to targets attributes like "(cached)" (has
   RTT() cached), "(slow)", etc.

7.1.  Two Unordered Targets, Both Cached

   Suppose there are two unordered targets, both of which have cached
   RTT() values:


         A (cached)
        /          \
       *            *
        \          /
         B (cached)

   As is shown by the graph, the client could send to either target
   first, because no target has a preceding target, and both have cached
   RTT() values.  Optimally, the client will send to the target with the
   smallest RTT() value, which we will assume is A.

   In the unlikely case that sending to A fails, it can be deleted from
   the graph to show the remaining possibilities:

       *            *
        \          /
         B (cached)

   From this graph, the client must choose B to send to.

7.2.  Two Unordered Targets, One Cached

   Suppose there are two unordered targets, only one of which has a
   cached RTT() value:

         A (cached)
        /          \
       *            *
        \          /
             B

   As is shown by the graph, the client is allowed to send to either
   target first, because no target has a preceding target.  If RTT(A) is
   small enough, the client may choose to send to A immediately.  But it
   might be worth the client's effort to send a probe to B, and if the
   probe returns quickly enough, the client may choose to send to B
   first.

7.3.  Two Unordered Targets, Neither Cached

   Suppose there are two unordered targets, neither of which has a
   cached RTT() value:

         A
        / \
       *   *
        \ /
         B


   As is shown by the graph, the client could send to either target
   first.  But the client does not know whether either target is
   responsive, and thus sending to either of them risks waiting for a
   timeout.  Instead, the client should send probes to both targets.
   When the first probe returns (say, the probe to A), the graph is
   changed to indicate that that target is cached:

         A (cached)
        /          \
       *            *
        \          /
             B

   After this state change, the client will send to A.

   In the unlikely case that sending to A fails, A is deleted from the
   graph, and the client can send to B without further delay (since B is
   the only remaining target).

   This situation parallels the standard "Happy Eyeballs" situation in
   HTTP, where the client has two (or more) unordered addresses for the
   server, one IPv4 and one IPv6.  The client requests connections with
   both addresses simultaneously, and the first connection that succeeds
   is used to send the HTTP request.  [RFC6555]

7.4.  Two Prioritized Targets, Both Cached

   Suppose there are two prioritized targets, both of which have cached
   RTT() values:

       *----A (cached)----B (cached)----*

   As is shown by the graph, the baseline specification is that the
   client must send to A first, and then if that fails, send to B.  If
   the RTT() values make both targets normal, the client must follow
   that sequence.

   However, it's possible that A is slow because RTT(A) > Limit(RTT(B)),
   in which case A can be moved after the normal targets (that is, B):

       *----B (cached)----*----A (cached) (slow)----*

   At this point, there are no targets before B and B has a cached
   RTT(), so the client sends to B.


7.5.  Two Prioritized Targets, the Second Cached

   Suppose there are two prioritized targets, but only the second has a
   cached RTT() value:

       *----A----B (cached)----*

   The client should send a probe to A.  If the probe response is fast
   enough, the client is required to send to A.  But if Limit(RTT(B))
   (or the relevant timeout) elapses without a response, the client
   knows that A is slow and can move it to after the normal targets:

       *----B (cached)----*----A (cached) (slow)----*

   At that point, the client sends to B.

7.6.  Two Prioritized Targets, the First Cached

   Suppose there are two prioritized targets, but only the first has a
   cached RTT() value:

       *----A (cached)----B----*

   If RTT(A) is <= Limit(0), the client must send to A (since it's
   quick), and so the client should send to A immediately.  But if
   RTT(A) is large enough that there is a reasonable chance that
   Limit(RTT(B)) is smaller than RTT(A), which would make A slow, the
   client can send a probe to B first.  If the probe response returns
   quickly enough, the client then knows that A is slow, can postpone
   it, and send to B.  If the probe does not return quickly enough to
   make A slow, the client sends to A.

7.7.  Three Targets

   A more complex case could arise when the client must choose between
   three source addresses for a destination address.  One, the address
   selected by the source address selection rules, is prioritized and
   the other two are unordered:

                   B
                  / \
       *----A----*   *
                  \ /
                   C

   Combining the heuristics shown in the previous examples, we can see
   that the client should start by probing A, and unless it is being
   conservative regarding probes, it should simultaneously probe B and
   C.  As the probe responses arrive, RTT() values are measured.  If any
   of the targets are revealed as slow, they should be moved to the end
   of the order.  Note that a target can be revealed as slow even if the
   probe has not yet returned, as the elapsed time since the probe was
   sent is a lower bound on RTT().

   As the probe responses come in, the client watches to see when A (the
   only target that can be sent to first) becomes verified as reachable,
   which changes the graph to:

                            B
                           / \
       *----A (cached)----*   *
                           \ /
                            C

   At that point, the client should send to A.

   If A is revealed as slow, it is moved to the end of the order,
   leaving B and C available to be sent to.  (If A is sent to but that
   fails, then A is removed from the graph entirely, leaving a similar
   graph but without A.)

         B
        / \
       *   *----A (cached)----*
        \ /
         C

   When there are two targets that might be sent to, the client uses
   heuristics like the ones discussed in Section 7.1 to choose between
   them.

8.  Heuristics

   Generally, clients will operate on heuristics like the following.
   These heuristics operate on a dynamic data structure, a directed
   acyclic graph, which implements the graphs discussed above.

      The graph is initialized with the targets and prioritizations
      (traffic restrictions) specified by the baseline specifications
      for the goal.

      The client may be sending the message to only one target at any
      one time, which must be an initial target and must have a cached
      RTT() value (unless it is the only remaining target).


      If a send to a target fails, the target is removed from the graph.
      This may cause other targets to become initial.

      If a target is revealed to be slow, either because its RTT()
      exceeds Limit(RTT(T)) for some other target T, or because a probe
      to it has been outstanding for longer than Limit(RTT(T)), it is
      moved to the end (right side) of the graph.  This may cause other
      targets to become initial.

      If an initial target has a cached RTT(), the client sends to it
      immediately.  If there are multiple targets with cached RTT(), the
      client sends to the one with the lowest RTT().

      If no initial target has a cached RTT(), the client should probe
      some or all of the initial targets.

      The client may also probe some of the targets that are not
      initial, but should never probe a target unless all prior targets
      either have cached RTT() values or probes in progress.

      The client should balance the value of information to be obtained
      by probing targets with the cost of doing so.

      When the client selects targets to be probed, the probed targets,
      considered with the targets with cached RTT(), should be selected
      to have maximum diversity of network paths, covering the range of
      interfaces and address families in the target set.

      If a target T is initial but has an "unusually" large RTT(T)
      value, the client may postpone sending to the target in order to
      send probes to other targets which might respond faster.
      (Suitable targets may be either other initial targets, or targets
      which are prioritized after only T, and which would become initial
      if T was found to be slow.)
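
   Several of these heuristics can be combined into a single decision
   step, sketched below in Python.  This is one possible reading and
   not a complete client: the names and data layout are hypothetical,
   Limit() uses m = 2 and f = 1 second, only initial targets are
   probed, and the sketch does not track probes in flight or detect
   slowness from the time a probe has been outstanding.

```python
def limit(t, m=2, f=1.0):
    """Limit(t) = m*t + f, in seconds."""
    return m * t + f

def next_action(preds, cached_rtt, removed=frozenset()):
    """Choose the next step for one goal.

    preds maps each target to the set of targets prioritized before
    it, cached_rtt maps targets to cached RTT() values, and removed
    holds targets to which a send has already failed.  Returns
    ("send", target), ("probe", [targets]) or ("fail", None).
    """
    remaining = [t for t in preds if t not in removed]
    if not remaining:
        return ("fail", None)
    if len(remaining) == 1:
        # The last remaining target is sent to even without a cache.
        return ("send", remaining[0])
    known = [cached_rtt[t] for t in remaining if t in cached_rtt]
    slow = set()
    if known:
        threshold = limit(min(known))
        slow = {t for t in remaining
                if t in cached_rtt and cached_rtt[t] > threshold}
    # Slow targets sit at the end of the graph, so they neither block
    # other targets nor are chosen while a normal target remains.
    normal = [t for t in remaining if t not in slow]
    normal_set = set(normal)
    initial = [t for t in normal if not (preds[t] & normal_set)]
    ready = [t for t in initial if t in cached_rtt]
    if ready:
        return ("send", min(ready, key=cached_rtt.get))
    return ("probe", initial)
```

   For the graph of Section 7.4 with RTT(A) = 5.0 and RTT(B) = 0.1,
   this step classes A as slow and returns ("send", "B"); for the graph
   of Section 7.3 with nothing cached, it returns a probe of both A
   and B.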

8.1.  A Simplified Method

   Instead of maintaining a directed acyclic graph to control the
   client's operation, the client can replace the graph with a sequence
   of sets of targets based on their "rank".  The rank of a target is
   defined as:

      The rank of an initial target is 0.

      The rank of a non-initial target is 1 more than the highest rank
      of any target that it is prioritized after.


   Thus, a target is only prioritized after targets with lower ranks.
   As processing progresses, all targets in the lowest still-non-empty
   rank are initial, and all targets in higher ranks are non-initial.
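
   As a non-normative illustration, the rank definition can be sketched
   in Python as follows.  The names ranks_from_graph() and rank_lists(),
   and the encoding of the graph as a dictionary mapping each target to
   the set of targets it is prioritized after, are assumptions of this
   sketch.

```python
from typing import Dict, List, Set


def ranks_from_graph(after: Dict[str, Set[str]]) -> Dict[str, int]:
    """Compute ranks; after[t] is the set of targets t is prioritized after."""
    memo: Dict[str, int] = {}

    def rank(target: str) -> int:
        if target not in memo:
            preds = after.get(target, set())
            # Initial targets (no predecessors) have rank 0; otherwise the
            # rank is 1 more than the highest rank among the predecessors.
            memo[target] = 1 + max(rank(p) for p in preds) if preds else 0
        return memo[target]

    return {t: rank(t) for t in after}


def rank_lists(ranks: Dict[str, int]) -> List[List[str]]:
    """Group targets into a list of lists indexed by rank."""
    lists: List[List[str]] = [
        [] for _ in range(max(ranks.values(), default=-1) + 1)
    ]
    for target, r in ranks.items():
        lists[r].append(target)
    return lists
```

   Given a graph in which target "B" is prioritized after target "A"
   and target "C" is prioritized after nothing, the sketch assigns rank
   0 to "A" and "C" and rank 1 to "B".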

   For example, consider a server with two addresses, IPv6 and IPv4,
   with IPv6 prioritized via SRV records.  Both addresses accept both
   TCP and UDP traffic:

       _sip._udp.example.com.    SRV    1 1 5060 sip1.example.com
       _sip._udp.example.com.    SRV    2 1 5060 sip2.example.com
       _sip._tcp.example.com.    SRV    1 1 5060 sip1.example.com
       _sip._tcp.example.com.    SRV    2 1 5060 sip2.example.com
       sip1.example.com.         AAAA   2001:DB8::1
       sip2.example.com.         A      192.0.2.1

   The tree of targets is:

                                       |
                      -------------unordered-----------
                      |                               |
              -----priority-----              -----priority-----
              |                |              |                |
       TCP 2001:DB8::1  TCP 192.0.2.1  UDP 2001:DB8::1  UDP 192.0.2.1

   The graph, annotating each target with its rank, is:

         (0) TCP 2001:DB8::1----(1) TCP 192.0.2.1
        /                                        \
       *                                          *
        \                                        /
         (0) UDP 2001:DB8::1----(1) UDP 192.0.2.1

   This can be turned into a list of lists:

       rank 0: TCP 2001:DB8::1, UDP 2001:DB8::1

       rank 1: TCP 192.0.2.1, UDP 192.0.2.1

   The rank representation is functionally equivalent to the following
   graph, which is the original graph with additional lines, showing
   that the rank representation constrains the client's behavior more
   than the original graph does:

         (0) TCP 2001:DB8::1   (1) TCP 192.0.2.1
        /                   \ /                 \
       *                     *                   *
        \                   / \                 /
         (0) UDP 2001:DB8::1   (1) UDP 192.0.2.1


   The rank lists can be built without first constructing the graph by
   walking the target tree from left to right (highest priority to
   lowest priority), with each node passing downward MRdown, the minimum
   rank any descendant target is allowed, and each node passing upward
   MRup, the minimum rank allowed for any target prioritized after that
   node.

   The root node's MRdown is 0.

   For an unordered node:

      Each child's MRdown is the node's MRdown.

      The node's MRup is the maximum of the node's MRdown and all of its
      children's MRup values.

   For a prioritized node (with the children ordered by priority):

      The first child's MRdown is the node's MRdown.

      A later child's MRdown is the preceding child's MRup.

      The node's MRup is the final child's MRup, or if there are no
      children, the node's MRdown.

   For a load-balancing node, the children are first prioritized
   randomly ([RFC2782] and Appendix A), then processed as for a
   prioritized node.

   For a target node:

      The target has rank MRdown.

      The node's MRup is MRdown + 1.

   Here is the preceding example's tree, with each node annotated with
   its MRdown on the left of the node label and its MRup on the right of
   the node label:

                                      |
                     ---------(0) unordered (2)-------
                     |                               |
             -(0) priority (2)-              -(0) priority (2)-
             |                |              |                |
  (0) TCP 2001:DB8::1 (1)     |   (0) UDP 2001:DB8::1 (1)     |
                              |                               |
                   (1) TCP 192.0.2.1 (2)           (1) UDP 192.0.2.1 (2)


   Note that the target tree does not have to be explicitly constructed;
   it can be implicitly walked by a series of function calls, with the
   functions passing MRdown and MRup values between themselves, and each
   target being inserted into the rank list-of-lists as it is generated.
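
   As a non-normative illustration, the walk described above might be
   sketched in Python as follows.  The encoding of tree nodes as
   (kind, body) tuples and the names walk(), mr_down, and ranks are
   assumptions of this sketch.  Load-balancing nodes are omitted; as
   stated above, their children would first be ordered randomly and the
   node would then be processed as a prioritized node.

```python
from typing import List


def walk(node: tuple, mr_down: int, ranks: List[List[str]]) -> int:
    """Assign ranks to all targets under `node`; return the node's MRup.

    node is ("target", name), ("unordered", [children]) or
    ("priority", [children in priority order]).
    """
    kind, body = node
    if kind == "target":
        # The target has rank MRdown; its MRup is MRdown + 1.
        while len(ranks) <= mr_down:
            ranks.append([])
        ranks[mr_down].append(body)
        return mr_down + 1
    if kind == "unordered":
        # Every child receives the node's MRdown; MRup is the maximum of
        # the node's MRdown and the children's MRup values.
        return max([mr_down] + [walk(c, mr_down, ranks) for c in body])
    if kind == "priority":
        # Each child's MRdown is the preceding child's MRup.
        mr = mr_down
        for child in body:
            mr = walk(child, mr, ranks)
        return mr
    raise ValueError("unknown node kind: %r" % (kind,))


# The first example tree in this section.
tree = ("unordered", [
    ("priority", [("target", "TCP 2001:DB8::1"),
                  ("target", "TCP 192.0.2.1")]),
    ("priority", [("target", "UDP 2001:DB8::1"),
                  ("target", "UDP 192.0.2.1")]),
])
ranks: List[List[str]] = []
root_mr_up = walk(tree, 0, ranks)
```

   For this tree, the sketch yields a root MRup of 2 and the same two
   rank lists shown earlier in this section.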

   The address family preference rule (Section 5.3) can be implemented
   within the rank representation by first constructing the ranks based
   on the baseline specifications, and then splitting rank 0 into two
   ordered sub-ranks, 0.0 and 0.1, with 0.0 containing all rank 0
   targets of the preferred address family and rank 0.1 containing all
   other rank 0 targets.

   An example of address family preference processing is the ordinary
   case of two prioritized servers each with an IPv6 and IPv4 address:

       _sip._udp.example.com.    SRV    1 1 5060 sip1.example.com
       _sip._udp.example.com.    SRV    2 1 5060 sip2.example.com
       sip1.example.com.         AAAA   2001:DB8::1
       sip1.example.com.         A      192.0.2.1
       sip2.example.com.         AAAA   2001:DB8::2
       sip2.example.com.         A      192.0.2.2

                                       |
                      ----------(0) priority (2)--------
                      |                                |
             -(0) unordered (1)-              -(1) unordered (2)-
             |                 |              |                 |
    (0) 2001:DB8::1 (1)        |   (1) 2001:DB8::2 (2)          |
                               |                                |
                      (0) 192.0.2.1 (1)                (1) 192.0.2.2 (2)

   After splitting rank 0 based on the address family preference, the
   ranks are:

       rank 0.0: 2001:DB8::1

       rank 0.1: 192.0.2.1

       rank 1: 2001:DB8::2, 192.0.2.2
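
   As a non-normative illustration, the splitting of rank 0 might be
   sketched in Python as follows.  The function name split_rank0(), the
   representation of targets as bare address strings, and the choice to
   return sub-ranks 0.0 and 0.1 as consecutive list entries are
   assumptions of this sketch.  An empty sub-rank is dropped.

```python
import ipaddress
from typing import List


def split_rank0(ranks: List[List[str]],
                preferred_version: int = 6) -> List[List[str]]:
    """Split rank 0 into preferred-family and other-family sub-ranks."""
    if not ranks:
        return []

    def is_preferred(addr: str) -> bool:
        return ipaddress.ip_address(addr).version == preferred_version

    sub0 = [a for a in ranks[0] if is_preferred(a)]      # rank 0.0
    sub1 = [a for a in ranks[0] if not is_preferred(a)]  # rank 0.1
    # Higher ranks are left unchanged.
    return [r for r in (sub0, sub1) if r] + ranks[1:]


result = split_rank0([["2001:DB8::1", "192.0.2.1"],
                      ["2001:DB8::2", "192.0.2.2"]])
```

   Applied to the example above with IPv6 preferred, the sketch
   produces three entries: the IPv6 address of the first server, then
   its IPv4 address, then both addresses of the second server.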

9.  Security Considerations

   This document changes the order in which a client will send to
   targets but does not change the set of targets that it will send to.
   There are no known SIP systems whose security depends on the order in
   which a client sends to targets.  Given that network connectivity is
   unreliable, it is unlikely that the security of any SIP system
   depends on the ordering of targets.


   The specific security vulnerabilities, attacks and threat models of
   the various protocols mentioned in this document (SIP, DNS, SRV
   records, etc.) are well-documented in their respective
   specifications, and their effect on the security of SIP systems is
   unchanged.

10.  IANA Considerations

   This document does not require any actions by IANA.

11.  History

   Note to RFC Editor: Upon publication, remove this section.

11.1.  Changes from draft-worley-sip-he-connection-01 to draft-worley-
       sip-happy-earballs-00

   Complete overhaul.

   Changed "EarBalls" to "Earballs".

11.2.  Changes from draft-worley-sip-he-connection-00 to draft-worley-
       sip-he-connection-01

   Minor changes.

   Add note that WebSocket is out of scope, because there is only one
   possible transport in WebSocket.

11.3.  Changes from draft-johansson-sip-he-connection-01 to draft-
       worley-sip-he-connection-00

   This version has a different name for technical reasons.  It is, in
   reality, the successor to draft-johansson-sip-he-connection-01.

   Move Acknowledgments after References, as that is the style the
   Editor prefers.

   Updated Security Considerations: This increment of the H.E. work does
   not make normative changes in existing SIP.

   Copy a lot of text from RFC 6555, as this I-D is parallel to RFC
   6555.

   Changed "hostname" to "host name", as the latter form is more common
   in RFCs by a moderate margin.


   Revised some of the introduction text to parallel the introduction of
   RFC 7984.

   Changed name of algorithm to "Happy EarBalls", added reference to
   Urban Dictionary.

   Many expansions of the discussion and revisions of the wording.

12.  References

12.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <http://www.rfc-editor.org/info/rfc2119>.

   [RFC2782]  Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for
              specifying the location of services (DNS SRV)", RFC 2782,
              DOI 10.17487/RFC2782, February 2000,
              <http://www.rfc-editor.org/info/rfc2782>.

   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
              A., Peterson, J., Sparks, R., Handley, M., and E.
              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
              DOI 10.17487/RFC3261, June 2002,
              <http://www.rfc-editor.org/info/rfc3261>.

   [RFC3263]  Rosenberg, J. and H. Schulzrinne, "Session Initiation
              Protocol (SIP): Locating SIP Servers", RFC 3263,
              DOI 10.17487/RFC3263, June 2002,
              <http://www.rfc-editor.org/info/rfc3263>.

   [RFC3581]  Rosenberg, J. and H. Schulzrinne, "An Extension to the
              Session Initiation Protocol (SIP) for Symmetric Response
              Routing", RFC 3581, DOI 10.17487/RFC3581, August 2003,
              <http://www.rfc-editor.org/info/rfc3581>.

   [RFC6555]  Wing, D. and A. Yourtchenko, "Happy Eyeballs: Success with
              Dual-Stack Hosts", RFC 6555, DOI 10.17487/RFC6555, April
              2012, <http://www.rfc-editor.org/info/rfc6555>.

   [RFC6724]  Thaler, D., Ed., Draves, R., Matsumoto, A., and T. Chown,
              "Default Address Selection for Internet Protocol Version 6
              (IPv6)", RFC 6724, DOI 10.17487/RFC6724, September 2012,
              <http://www.rfc-editor.org/info/rfc6724>.


   [RFC7984]  Johansson, O., Salgueiro, G., Gurbani, V., and D. Worley,
              Ed., "Locating Session Initiation Protocol (SIP) Servers
              in a Dual-Stack IP Network", RFC 7984,
              DOI 10.17487/RFC7984, September 2016,
              <http://www.rfc-editor.org/info/rfc7984>.

12.2.  Informative References

   [I-D.johansson-sip-he-connection]
              Johansson, O., Salgueiro, G., and D. Worley, "Setting up a
              SIP (Session Initiation Protocol) connection in a dual
              stack network using connection oriented transports",
              draft-johansson-sip-he-connection-01 (work in progress),
              October 2016.

   [RFC3319]  Schulzrinne, H. and B. Volz, "Dynamic Host Configuration
              Protocol (DHCPv6) Options for Session Initiation Protocol
              (SIP) Servers", RFC 3319, DOI 10.17487/RFC3319, July 2003,
              <http://www.rfc-editor.org/info/rfc3319>.

   [RFC3361]  Schulzrinne, H., "Dynamic Host Configuration Protocol
              (DHCP-for-IPv4) Option for Session Initiation Protocol
              (SIP) Servers", RFC 3361, DOI 10.17487/RFC3361, August
              2002, <http://www.rfc-editor.org/info/rfc3361>.

   [RFC4213]  Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms
              for IPv6 Hosts and Routers", RFC 4213,
              DOI 10.17487/RFC4213, October 2005,
              <http://www.rfc-editor.org/info/rfc4213>.

   [RFC4436]  Aboba, B., Carlson, J., and S. Cheshire, "Detecting
              Network Attachment in IPv4 (DNAv4)", RFC 4436,
              DOI 10.17487/RFC4436, March 2006,
              <http://www.rfc-editor.org/info/rfc4436>.

   [RFC5626]  Jennings, C., Ed., Mahy, R., Ed., and F. Audet, Ed.,
              "Managing Client-Initiated Connections in the Session
              Initiation Protocol (SIP)", RFC 5626,
              DOI 10.17487/RFC5626, October 2009,
              <http://www.rfc-editor.org/info/rfc5626>.

   [RFC6059]  Krishnan, S. and G. Daley, "Simple Procedures for
              Detecting Network Attachment in IPv6", RFC 6059,
              DOI 10.17487/RFC6059, November 2010,
              <http://www.rfc-editor.org/info/rfc6059>.


   [UD]       "The Jews Who Stole Christmas", "Urban Dictionary, entry
              'Earballs'", December 2011,
              <http://www.urbandictionary.com/define.php?term=Earballs>.

Appendix A.  Implementing Load Balancing

   Load-balancing is specified by the "weight" field of DNS SRV records.
   The defining algorithm is specified in [RFC2782].  The same result
   can be obtained with a simpler algorithm: For each server, calculate
   a "score": If its weight is 0, its score is "infinity" (in practice,
   100 suffices).  If its weight is non-zero, its score is calculated by
   choosing a uniformly distributed random number between 0 and 1,
   taking the negative of its logarithm, and dividing the result by the
   weight.  (Thus, the score is always a positive number, and it has an
   exponential distribution whose rate parameter is the weight.)
   Then, sort the servers into order of increasing scores, so that the
   servers with the smallest scores are used first.

   This alternative algorithm is analyzed and sample implementations are
   provided in the files in the directory
   sipXrouter/sipXtackLib/doc/developer/scores in the GitHub Sipfoundry
   project (https://github.com/sipfoundry/sipXrouter), among other
   repositories.
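
   As a non-normative illustration, the score-based ordering might be
   sketched in Python as follows.  The function name order_by_weight()
   and the representation of servers as (name, weight) pairs are
   assumptions of this sketch; it is not taken from the sample
   implementations referenced above.  It uses a true infinity for
   zero-weight servers.

```python
import math
import random
from typing import List, Optional, Tuple


def order_by_weight(servers: List[Tuple[str, int]],
                    rng: Optional[random.Random] = None) -> List[str]:
    """Return server names sorted by increasing random score."""
    rng = rng or random.Random()

    def score(weight: int) -> float:
        if weight == 0:
            return math.inf
        # 1 - random() lies in (0, 1], so log() is never given 0.
        return -math.log(1.0 - rng.random()) / weight

    scored = [(score(weight), name) for name, weight in servers]
    scored.sort(key=lambda pair: pair[0])
    return [name for _, name in scored]
```

   Over many invocations, a server with a larger weight tends to draw
   a smaller score and so is placed earlier more often, while a server
   with weight 0 is placed after all servers with non-zero weight.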

Acknowledgments

   TBD:

   The authors would like to acknowledge the support and contribution of
   the SIP Forum IPv6 Working Group.  This document is based on a lot of
   tests and discussions at SIPit events, organized by the SIP Forum.

   The foundation of this document is the work done by Olle Johansson
   and Gonzalo Salgueiro in earlier documents, including
   [I-D.johansson-sip-he-connection].  In turn, the foundation of that
   work is [RFC6555], whose authors are Dan Wing and Andrew Yourtchenko.

   Scott O. Bradner suggested that the formula for determining
   responsiveness should contain a constant term.

   Roman Shpount described the need for configuration to override the
   source address selection mechanism.

   Tolga Asveren suggested requiring "rport".


Author's Address

   Dale R. Worley
   Ariadne Internet Services
   738 Main St.
   Waltham, MA  02451
   US

   Email: worley@ariadne.com

