Network Working Group                                         S. Schuetz
Internet-Draft                                                 L. Eggert
Expires: December 2, 2006                                            NEC
                                                                 W. Eddy
                                                                 Verizon
                                                                Y. Swami
                                                                   K. Le
                                                                   Nokia
                                                            May 31, 2006


      TCP Response to Lower-Layer Connectivity-Change Indications
                     draft-schuetz-tcpm-tcp-rlci-00

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.
   This document may not be modified, and derivative works of it may not
   be created, except to publish it as an RFC and to translate it into
   languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 2, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract


Schuetz, et al.         Expires December 2, 2006                [Page 1]

Internet-Draft  TCP Response to Connectivity Indications        May 2006


   When connectivity characteristics between two hosts change abruptly,
   TCP can experience significant delays before resuming transmission in
   an efficient manner or TCP can behave unfairly to competing traffic.
   This document describes TCP extensions that improve transmission
   behavior in response to advisory, lower-layer connectivity-change
   indications.  The proposed TCP extensions modify the local behavior
   of TCP and introduce a new TCP option to signal local connectivity-
   change indications to remote peers.  Performance gains result from a
   more efficient transmission behavior and are not due to an increased
   aggressiveness.


Table of Contents

   1.  Introduction
   2.  Motivation and Overview
   3.  Background: Classification of Connectivity Disruptions
     3.1.  Short Connectivity Disruptions
     3.2.  Long Connectivity Disruptions
   4.  Connectivity-Change Indications
   5.  TCP Response to Connectivity-Change Indications
     5.1.  Connectivity-Change Indication TCP Option
     5.2.  Re-Probing Path Characteristics
     5.3.  Speculative Retransmission
   6.  Security Considerations
   7.  Conclusion
   8.  IANA Considerations
   9.  Acknowledgments
   10. References
     10.1. Normative References
     10.2. Informative References
   Editorial Comments
   Appendix A.  Document Revision History
   Authors' Addresses
   Intellectual Property and Copyright Statements


1.  Introduction

   Several current components of the Transmission Control Protocol (TCP)
   [RFC0793] assume that end-to-end paths between hosts are relatively
   stable over the lifetime of a connection.  Although the TCP
   congestion control algorithms [RFC2581] adapt to changes in path
   connectivity characteristics between two hosts over time, they cannot
   adapt well if significant changes occur on time-scales of a few
   round-trip times or less.  This is due to the granularity of TCP's
   sampling mechanisms.  Significant changes to path connectivity
   include loss or reestablishment of connectivity, and drastic, abrupt
   changes to the round-trip time (RTT) or available bandwidth.
   Connectivity changes that occur on short time-scales are becoming
   more common, due to host mobility or intermittent network attachment.

   This document describes a set of complementary TCP extensions that
   improve behavior when path characteristics change on short time-
   scales.  TCP implementations that support the proposed extensions
   respond to receiving generic, technology-independent, per-connection
   "path characteristics have changed" (or short: "connectivity-change")
   indications from lower layers.  A connectivity-change indication
   signals that the connectivity characteristics of the end-to-end path
   between the local node and its peer have changed in an undefined way.
   The response mechanisms proposed for TCP act on this information in a
   conservative fashion.  The specific response depends on the state of
   a connection.

   It is important to note that TCP and other transport protocols
   already react to information and signals from lower layers; the
   proposed connectivity-change indications thus extend an established
   interface between layers in the protocol stack.  TCP measures the
   end-to-end path to implicitly derive network-layer information.  TCP
   also directly reacts to network-layer signals delivered via ICMP, for
   example, "Port Unreachable" or the now-deprecated "Source Quench"
   [RFC1122].  Explicit Congestion Notification (ECN) [RFC3168] and
   Quick-Start [I-D.ietf-tsvwg-quickstart] are other sources of network-
   layer information for which response mechanisms for TCP have been
   proposed.  Connectivity-change indications are yet another source of
   lower-layer information that TCP can use to improve its operation.

   A second important point to note is that the proposed TCP response
   mechanisms to connectivity-change indications are purely optional
   efficiency improvements.  In the absence of connectivity-change
   indications, a TCP that implements the proposed changes behaves
   identically to an unmodified TCP.  When lower layers provide
   connectivity-change indications that trigger the proposed
   enhancements, they enhance TCP operation based on the explicit lower-
   layer information that is signaled.  The proposed response mechanisms
   do not increase the aggressiveness of TCP.

   Note that the IAB has recently described architectural issues of
   "link indications" [I-D.iab-link-indications].  The authors feel that
   this term is not quite accurate in this environment, because
   transport mechanisms should remain link-technology-agnostic.
   However, transport protocols have always acted on network-layer
   information and signals, such as measured path characteristics or
   ICMP-signaled conditions.  Because of the growing proliferation of
   shim layers between the traditional network and transport layers,
   this document uses the term "lower-layer indication" to remain
   independent of specific network or shim layers.

   Note that it is currently an open question whether additional
   lower-layer indications can provide further information to transport
   protocols.  Also, this document focuses on response mechanisms for
   TCP only, although other transport protocols may benefit from similar
   response mechanisms that react to these indications.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].


2.  Motivation and Overview

   Several proposed network-layer extensions support host mobility,
   including Mobile IPv4 [RFC3344], Mobile IPv6 [RFC3775] and HIP
   [I-D.ietf-hip-mm].  Typically, they shield transport-layer protocols
   from mobility events and enable them to sustain established
   connections across mobility events.  However, the path
   characteristics that established connections experience after a
   mobility event may have changed drastically and on short time-scales.
   Congestion control, RTT and path-MTU state gathered over an old path
   before the move generally have no meaning when transmitting along a
   new path.

   TCP already forces a slow-start restart in some cases where the
   network state becomes unknown, such as after an idle period or heavy
   losses.  One mechanism proposed in this document introduces a similar
   slow-start restart in response to connectivity-change indications
   that are received while a connection is in steady-state.  Note that
   this behavior is more conservative than the standard TCP response;
   any performance gains with the proposed mechanisms are due to
   avoiding overload of the new path.

   A second proposed extension improves TCP operation in the presence of
   temporary connectivity disruptions.  These disruptions can occur
   independently of mobility events and may, for example, be due to
   insufficient wireless access coverage or nomadic computer use.
   Connectivity disruptions can severely decrease TCP performance.  The
   main reason for this decrease is TCP's retransmission behavior after
   a connectivity disruption [SCHUETZ-CCR], i.e., periodic
   retransmission attempts in exponentially increasing intervals, which
   can unnecessarily delay retransmissions after connectivity returns.
   In the extreme case, TCP connections can even abort, if the
   disruption is longer than the TCP "user timeout."  (Connection aborts
   are out of scope for this document but can be prevented by the TCP
   User Timeout Option [I-D.ietf-tcpm-tcp-uto].)

   This second response mechanism is also triggered by a connectivity-
   change indication, but it applies when a connection is stalled in
   exponential back-off.  It improves TCP retransmission
   behavior after connectivity is restored through an immediate
   speculative retransmission attempt [anchor3].  Similar to the first
   extension, this modification increases TCP performance through a more
   intelligent transmission behavior that uses periods of connectivity
   more efficiently.  It does not cause significant amounts of
   additional traffic and does not change TCP's congestion control
   algorithms.

   Finally, this draft proposes a third mechanism, which is a new TCP
   option that signals connectivity-change indications received or
   detected by a host to its remote peers in open TCP connections.  This
   is useful, because connectivity indications typically require
   appropriate responses at both peers, but may only be received or
   detected by one peer.  Response to a connectivity-change indication
   is independent of its source (locally notified or remotely signaled)
   and depends only on the specific indication and the state of the
   connection for which it was received.


3.  Background: Classification of Connectivity Disruptions

   Connectivity disruptions occur in many different situations.  They
   can be due to wireless interference, movement out of a wireless
   coverage area, switching between access networks, or simply due to
   unplugging an Ethernet cable.  Depending on the situation in which
   they occur, the implications of connectivity disruptions are
   different and must be handled appropriately.  This section attempts
   to classify different types of connectivity disruptions and discusses
   their implications and impact on TCP.

   Two main properties of connectivity disruptions affect how TCP reacts
   to them: their duration and whether the path characteristics have
   significantly changed after they end.  This document distinguishes
   between "short" and "long" disruptions and "changed" and "unchanged"
   path characteristics.  Note that these two categories are orthogonal
   to each other, i.e., all four combinations exist.

   Connectivity disruptions are "short" for a given TCP connection, if
   connectivity returns before the RTO fires for the first time.  In
   this case, standard TCP recovers lost data segments through Fast
   Retransmit and lost ACKs through successfully delivered later ACKs.
   Section 3.1 briefly describes this case.

   Connectivity disruptions are "long" for a given TCP connection, if
   the RTO fires at least once before connectivity returns.  In this
   case, TCP can be inefficient in its retransmission scheme, as
   described in Section 3.2.

   Whether or not path characteristics change when connectivity returns
   is a second important factor for TCP's retransmission scheme.
   Standard TCP implicitly assumes that path characteristics remain
   unchanged for short disruptions by performing Fast Retransmit based
   on path parameters collected before the disruption.  For long
   disruptions, standard TCP is more conservative and performs slow-
   start, re-probing the path characteristics from scratch.  However,
   the standard behavior can be inefficient.

   These implicit assumptions can cause standard TCP to misbehave or
   perform inefficiently in some scenarios.  Figure 1 illustrates the
   standard TCP behavior.

                   +----------------------+----------------------+
          Short    | Fast Retransmit      | Fast Retransmit      |
          Duration | using collected path | using collected path |
          < RTO    | characteristics      | characteristics      |
                   +----------------------+----------------------+
          Long     |                      |                      |
          Duration | Slow-start           | Slow-start           |
          >= RTO   |                      |                      |
                   +----------------------+----------------------+
                       Unchanged Path         Changed Path
                       Characteristics        Characteristics


   Figure 1: Standard TCP behavior.
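
   The classification in Figure 1 can be restated as a small lookup.
   The following Python sketch is illustrative only; the function name
   and its parameters are assumptions made for this example and do not
   correspond to any existing TCP implementation.

```python
def standard_tcp_response(disruption_s: float,
                          first_rto_fires_s: float) -> str:
    """Return standard TCP's recovery behavior as summarized in Figure 1.

    disruption_s: duration of the connectivity disruption, in seconds.
    first_rto_fires_s: time from the start of the disruption until the
        RTO fires for the first time, in seconds (hypothetical input).

    Whether the path characteristics changed is deliberately not an
    input: standard TCP cannot observe it and behaves the same way in
    both columns of Figure 1.
    """
    if disruption_s < first_rto_fires_s:
        # "Short" disruption: connectivity returned before the RTO fired.
        return "fast-retransmit"
    # "Long" disruption: the RTO fired at least once.
    return "slow-start"
```

   The absence of a "path changed" parameter is the point of the
   sketch: without a lower-layer indication, the sender has no input
   that distinguishes the two columns.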

3.1.  Short Connectivity Disruptions

   One common cause of short connectivity disruptions that result in a
   change of the end-to-end path characteristics is transparent network
   layer mobility, via protocols such as Mobile IP, NEMO, or HIP.
   Although changes in the point of network attachment happen
   unbeknownst to the transport layer, these events may change many
   aspects of the path which established TCP connections base their
   behavior upon.

   Consider a Mobile IP scenario as shown in Figure 2.  At time T, a
   mobile node MN is attached to access network Net-1, connected to the
   Internet through access router AR-1 and has the care-of address
   <Net-1, MN>.  A TCP connection is established between MN and a
   correspondent node CN.  While MN is attached to AR-1, packets between
   CN and <Net-1, MN> are routed using PATH-1 (via Cloud-1 and AR-1).
   Assume that at some time T+1, MN moves and then attaches to Net-2,
   which is reachable through AR-2 with the care-of address <Net-2, MN>.
   While MN is attached to AR-2, all packets between CN and <Net-2, MN>
   are routed using PATH-2 (through Cloud-2 and AR-2).


                      <---------PATH-1---------->

                       /---------\   +------+
                       |         |   |      | Net-1
                   +---+ Cloud-1 +---+ AR-1 +-----> MN (time=T)
                   |   |         |   |      |
                   |   \----+----/   +---+--+        |
                   |        |                        |
         CN <------+        | PATH-3                 |
                   |        |                        |
                   |   /----V----\   +-------+       V
                   |   |         |   |       |
                   +---+ Cloud-2 +---+ AR-2  +-----> MN (time=T+1)
                       |         |   |       | Net-2
                       \---------/   +-------+

                      <--------PATH-2----------->

   Figure 2: Mobility example.

   During a transitional disconnected period, MN may be disconnected
   from Net-1 and not yet attached to Net-2.  Consequently, AR-1 may not
   be able to deliver packets to MN.  This could result in a burst of
   packet losses.  There are several suggested means of supporting
   "fast" or "seamless" handovers, which involve adding machinery to the
   ARs to buffer and redirect packets originally sent to Net-1 towards
   Net-2, rather than dropping them (e.g., [KOODLI]).

   As long as MN remains in Net-1, standard congestion control
   algorithms [RFC2581] are sufficient.  But once it moves from Net-1 to
   Net-2, two different scenarios are possible depending on network
   topology:

   o  In the first scenario, with standard Mobile IPv4, all packets
      destined to <Net-1, MN> are dropped by AR-1 once the mobile node
      has moved.  Since the latency involved in establishing a new
      tunnel to the home agent (HA) is on the order of the RTT (2*RTT
      in case of Mobile IPv6), roughly an entire window's worth of data
      and ACKs will be dropped by AR-1.  Because of this burst loss,
      the CN and MN are likely to incur expensive retransmission
      timeouts.

   o  In the second scenario, with a fast handover mechanism in place,
      losses are suppressed through buffering and tunneling between
      routers AR-1 and AR-2.  The exact means of buffering and
      forwarding between the ARs is not guaranteed to occur in a manner
      consistent with the available bandwidth of PATH-3, nor to conform
      to TCP's clocking expectations.  This can cause TCP's behavior
      over PATH-2 to be based on the unrelated properties of PATH-1 and
      PATH-3.

   After attaching to Net-2, reception of stale ACKs (for data sent on
   PATH-1) will cause MN to incorrectly inflate its congestion window.
   These stale ACKs do not provide any indication of the congestion
   along PATH-2 and should consequently be ignored.  CN's congestion
   window becomes similarly inflated by ACKs that MN sends for data
   segments redirected over PATH-3.  If the congestion windows from
   PATH-1 are already too big for PATH-2, this can overload Net-2 or
   PATH-2, causing packet loss and timeouts.

   On the other hand, if the available bandwidth along PATH-2 is greater
   than along PATH-1, and if the sender is in congestion avoidance, it
   may need many RTTs before reaching a reasonable throughput.  This is
   due to the relatively slow bandwidth increase during congestion
   avoidance caused by a stale SS_THRESH.  (See [ES05] for details.)
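
   To give a feel for the difference in ramp-up time, the following
   Python sketch counts round trips under a deliberately idealized
   model: the window is measured in segments, there is no loss and no
   delayed ACKs, congestion avoidance adds one segment per RTT, and
   slow-start doubles the window every RTT.  The function names and
   the example numbers are assumptions made for illustration.

```python
def rtts_congestion_avoidance(cwnd: int, target: int) -> int:
    """RTTs needed to grow cwnd to target at +1 segment per RTT."""
    return max(0, target - cwnd)


def rtts_slow_start(cwnd: int, target: int) -> int:
    """RTTs needed to grow cwnd to target when doubling every RTT."""
    if cwnd <= 0:
        raise ValueError("cwnd must be at least one segment")
    rtts = 0
    while cwnd < target:
        cwnd *= 2   # idealized slow-start: window doubles each RTT
        rtts += 1
    return rtts
```

   For a sender with a window of 10 segments on a new path that could
   carry 100, this model yields 90 RTTs in congestion avoidance versus
   4 RTTs in slow-start.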

3.2.  Long Connectivity Disruptions

   For long disruptions, standard TCP performs slow-start after
   connectivity returns, because the retransmission timeout (RTO) has
   expired.  This is a conservative strategy that avoids overloading the
   new path.  However, TCP's general exponential back-off retransmission
   strategy can time these slow-starts such that performance decreases.

   When a long connectivity disruption occurs along the path between a
   host and its peer while the host is transmitting data, it stops
   receiving ACKs.  After the RTO expires, the host attempts to
   retransmit the first unacknowledged segment.  TCP implementations
   that follow the recommended RTO management proposed in [RFC2988]
   double the RTO after each retransmission attempt until it exceeds 60
   seconds.  This scheme causes a host to attempt to retransmit across
   established connections roughly once a minute.  (More frequently
   during the first minute or two of the connectivity disruption, while
   the RTO is still being backed off.)

   When the long connectivity disruption ends, standard TCP
   implementations still wait until the RTO expires before attempting
   retransmission.  Figure 3 illustrates this behavior.  Depending on
   when connectivity becomes available again, this can waste up to a
   minute of connection time for TCPs that implement the recommended RTO
   management described in [RFC2988].  For TCP implementations that do
   not implement [RFC2988], even longer connection times may be lost.
   For example, Linux uses 120 seconds as the maximum RTO by default.

          Sequence
          number      X = Successfully transmitted segment
           ^          O = Lost segment
           |     :                     :              : X
           |     :                     :              :X
           |     OO O  O    O        O :              X
           |    X:                     :              :
           |   X :                     :<------------>:
           |  X  :                     :    Wasted    :
           | X   :                     :  connection  :
           |X    :                     :     time     :
           +-----:---------------------:--------------:-------->
                 :                     :              :       Time
            Connectivity          Connectivity       TCP
               gone                  back         retransmit

   Figure 3: Standard TCP behavior in the presence of disrupted
   connectivity.
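
   The wasted connection time in Figure 3 can be estimated with a
   simple model of the back-off schedule.  The Python sketch below is
   an approximation under stated assumptions: time zero is the moment
   the retransmission timer is first armed at the start of the
   disruption, the RTO doubles after every attempt, and it is clamped
   at a maximum value.  The function names and the 1-second initial
   RTO used in the example are illustrative choices, not values taken
   from this document.

```python
from typing import List


def retransmit_times(initial_rto: float, max_rto: float,
                     horizon: float) -> List[float]:
    """Times (in seconds) at which the RTO fires, up to horizon."""
    times = []
    t, rto = 0.0, initial_rto
    while t + rto <= horizon:
        t += rto
        times.append(t)
        rto = min(rto * 2, max_rto)   # exponential back-off, clamped
    return times


def wasted_time(initial_rto: float, max_rto: float,
                connectivity_back: float) -> float:
    """Delay between connectivity returning and the next RTO expiry."""
    t, rto = 0.0, initial_rto
    while True:
        t += rto
        if t >= connectivity_back:
            return t - connectivity_back
        rto = min(rto * 2, max_rto)
```

   With an initial RTO of 1 second and a 60-second maximum, the timer
   in this model fires at 1, 3, 7, 15, 31, 63 and 123 seconds.  If
   connectivity returns at 70 seconds, the next retransmission attempt
   is still 53 seconds away.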

   This retransmission behavior is not efficient, especially in
   scenarios where connected periods are short and connectivity
   disruptions are frequent [DRIVE-THRU].  Experiments show that TCP
   performance across a path with frequent disruptions is significantly
   worse, compared to a similar path without disruptions [SCHUETZ-CCR].

   In the ideal case, TCP would attempt a retransmission as soon as
   connectivity to its peer was re-established.  Figure 4 illustrates
   the ideal behavior.


          Sequence
          number      X = Successfully transmitted segment
           ^          O = Lost segment
           |     :                     : X            :
           |     :                     :X             :
           |     OO O  O    O        O X              :
           |    X:                     :              :
           |   X :                     :<------------>:
           |  X  :                     :  Efficiency  :
           | X   :                     :  improvement :
           |X    :                     :              :
           +-----:---------------------:--------------:-------->
                 :                     :              :       Time
            Connectivity          Connectivity      Next
               gone             back = immediate  scheduled
                                 TCP retransmit   retransmit

   Figure 4: Ideal TCP behavior in the presence of disrupted
   connectivity

   The ideal behavior is difficult to achieve for arbitrary connectivity
   disruptions.  One obviously problematic approach would use higher-
   frequency retransmission attempts to enable earlier detection of
   whether connectivity has returned.  This can generate significant
   amounts of extra traffic.  Other proposals attempt to trigger faster
   retransmissions by retransmitting buffered or newly-crafted segments
   from inside the network [SCOTT][I-D.dawkins-trigtran-
   linkup][DUKEHEND][RFC3819].

   Note that scenarios exist where path characteristics remain unchanged
   after long connectivity disruptions.  In this case, even an
   intelligently scheduled slow-start is inefficient, because TCP could
   safely resume transmitting at the old rate instead of slow-starting.
   Although originally developed to avoid line-rate bursts, techniques
   for the well-known "slow-start after idle" case [I-D.ietf-tcpimpl-
   restart] may be useful to further improve performance after a
   disruption ends.  This document does not currently describe this
   additional optimization.


4.  Connectivity-Change Indications

   The focus of this document is on specifying TCP response mechanisms
   to lower-layer "path characteristics have changed" indications.  This
   section briefly describes how different network- and shim-layer
   mechanisms underneath the transport layer can provide these
   "connectivity-change" indications to TCP.  This description is
   included for clarification only; the details of providing
   connectivity indications are out of scope of this document.

   Connectivity-change indications may be generated after lower layers
   detect a connectivity-change event, for example, because:

   o  the IP address of the outbound interface of a connection has
      changed, e.g., due to DHCP [RFC2131] or IPv6 router advertisements
      [RFC2460]

   o  link-layer connectivity at the outbound interface of a connection
      has changed, e.g., link-layer "link up" event

   o  the outbound interface of a connection has changed, due to routing
      changes or link-layer connectivity changes at other interfaces
      (including tunnel establishments or teardowns, e.g., in response
      to IKE events [RFC4306])

   o  a Mobile IP binding update has completed [RFC3775]

   o  a HIP readdressing update has completed [I-D.ietf-hip-mm]

   o  a path-change signal from the network has arrived (possible in
      theory, depends on network capabilities)

   o  other notifications as defined by the IETF's Detecting Network
      Attachment (DNA) working group [I-D.ietf-dna-link-information]


5.  TCP Response to Connectivity-Change Indications

   A TCP connection can receive connectivity-change indications either
   from its local stack or through a new "connectivity-change TCP
   option" from its peer, as described in Section 5.1.  In either case,
   TCP implementations that implement the proposed changes re-probe path
   characteristics or perform a speculative retransmission, depending on
   whether the connection is currently stalled in exponential back-off
   or not.  A connection is "stalled in exponential back-off" if there
   is at least one unrecovered RTO, i.e., a segment has already been
   retransmitted due to an RTO but has not yet been ACKed.
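
   The choice between the two responses can be sketched as a single
   decision.  The Python fragment below is illustrative only; the
   function name, its parameter and the returned labels are
   assumptions for this example.

```python
def select_response(unrecovered_rto_count: int) -> str:
    """Pick the response to a connectivity-change indication.

    unrecovered_rto_count: number of segments that were retransmitted
        because of an RTO and are still unacknowledged (hypothetical
        bookkeeping kept by the sender).
    """
    stalled_in_backoff = unrecovered_rto_count >= 1
    if stalled_in_backoff:
        # Connection is waiting out a backed-off RTO.
        return "speculative-retransmission"
    # Connection is not stalled: path state may be stale.
    return "re-probe-path-characteristics"
```

   The same decision applies whether the indication came from the
   local stack or from the peer.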

   TCP implementations that implement the proposed changes MUST maintain
   three new variables per connection: MY_CCI_COUNT, REMOTE_CCI_COUNT
   and CCI_STATE.  The variables MY_CCI_COUNT and REMOTE_CCI_COUNT count
   locally and remotely received connectivity-change indications,
   respectively.  The variable CCI_STATE stores the current state of the
   connectivity-change indication processing.  CCI_STATE can have one of
   the following values:


   o  CCI_IDLE: The host is currently not processing any connectivity-
      change indications.

   o  CCI_INITIATOR: The host is currently processing a connectivity-
      change indication received from the local stack and propagated the
      indication to its peer through a connectivity-change TCP option.

   o  CCI_RESPONDER: The host is currently processing a connectivity-
      change indication received from its peer via a connectivity-change
      TCP option.
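
   One possible representation of the three per-connection variables
   is shown below as a Python sketch.  The class names are assumptions
   for this example.  The counters are left without default values on
   purpose, so that the caller supplies whatever starting value the
   specification requires.

```python
from dataclasses import dataclass
from enum import Enum


class CciState(Enum):
    """Values of CCI_STATE as listed above."""
    CCI_IDLE = "idle"            # no indication being processed
    CCI_INITIATOR = "initiator"  # local indication, signaled to the peer
    CCI_RESPONDER = "responder"  # indication received from the peer


@dataclass
class CciVars:
    """Per-connection variables for connectivity-change processing."""
    my_cci_count: int       # MY_CCI_COUNT: locally received indications
    remote_cci_count: int   # REMOTE_CCI_COUNT: remotely received ones
    cci_state: CciState = CciState.CCI_IDLE
```

   A connection control block would hold one such record per
   connection.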

   In the following, this document first introduces the operation of the
   new connectivity-change TCP option in Section 5.1, and afterwards
   describes the two mechanisms to improve TCP performance in response
   to connectivity-change events - namely re-probing path
   characteristics and speculative retransmission - in Section 5.2 and
   Section 5.3.

5.1.  Connectivity-Change Indication TCP Option

   Connectivity-change indications are generally asymmetric, i.e., they
   may occur on one peer host but not the other.  The basic idea behind
   the connectivity-change TCP option is to signal connectivity-change
   indications that the local stack has received to the peer, in order
   to allow it to respond appropriately.  Figure 5 shows the option.

   However, if there is strong evidence that a connectivity-change
   indication received from the local stack is symmetric, i.e., it
   occurs on both communicating peers, the host MAY decide not to signal
   the connectivity-change indication to the remote peer.  In this case,
   the signaling overhead can be avoided, because the remote peer will
   already react to the connectivity-change indication that it receives
   from its local stack.  For instance, when a HIP identifier becomes
   rebound to a new locator, both local and remote peers can be
   simultaneously notified about the connectivity-change by their local
   stacks, when the HIP UPDATE procedure completes [I-D.ietf-hip-mm].

                                            1     1      2      2
          0                8                6     8      1      4
          +----------------+----------------+-----+------+------+
          |      KIND      |     LENGTH     | RES | CNTR | ECNT |
          +----------------+----------------+-----+------+------+

   Figure 5: Format of the connectivity-change indication TCP option.

      KIND: (8 Bits) TCP Option Type.  Value set to 25 for experimental
      purposes.


      LENGTH: (8 Bits) TCP Option Length.  Value = 3.

      RES: (2 Bits) Reserved bits.  Sender SHOULD set them to zero.
      Receiver MUST ignore these bits.

      CNTR: (3 Bits) The local connectivity-change indication counter
      value of the host sending this option.  This value is incremented
      once for every connectivity-change indication that the local stack
      delivers to the connection.

      ECNT: (3 Bits) The echoed value of CNTR.  On reception of a
      connectivity-change indication TCP option, a host copies the
      received CNTR value to the ECNT field of its response.
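
   The layout in Figure 5 can be packed into three octets as in the
   following Python sketch.  It assumes the conventional
   most-significant-bit-first ordering of RFC bit diagrams, so that
   RES occupies the top two bits of the third octet, followed by CNTR
   and ECNT.  The function and constant names are hypothetical.

```python
from typing import Tuple

CCI_KIND = 25     # experimental option kind given in the text
CCI_LENGTH = 3    # total option length in octets


def encode_cci_option(cntr: int, ecnt: int) -> bytes:
    """Build the 3-octet option from the two 3-bit counters."""
    if not (0 <= cntr <= 7 and 0 <= ecnt <= 7):
        raise ValueError("CNTR and ECNT are 3-bit fields (0..7)")
    third = (cntr << 3) | ecnt    # RES (top two bits) is sent as zero
    return bytes([CCI_KIND, CCI_LENGTH, third])


def decode_cci_option(data: bytes) -> Tuple[int, int]:
    """Return (CNTR, ECNT) from a received option."""
    if len(data) != CCI_LENGTH or data[0] != CCI_KIND \
            or data[1] != CCI_LENGTH:
        raise ValueError("not a connectivity-change indication option")
    third = data[2]
    cntr = (third >> 3) & 0x7     # RES bits are ignored on receipt
    ecnt = third & 0x7
    return cntr, ecnt
```

   Masking on receipt means that a peer setting the reserved bits
   does not affect the decoded counters.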

   The connectivity-change TCP option contains a counter (CNTR) that
   tracks the connectivity-change indications that the sending host has
   received from its local stack.  At the beginning of a connection,
   both endpoints use this option in the SYN and SYN-ACK segments, with
   an initial counter value of 7, to advertise support for the option.
   A host MUST NOT place this option in a SYN-ACK
   unless it was present on the received SYN.  After the SYN exchange,
   hosts SHOULD NOT send this option until there is a connectivity-
   change indication.  After connection setup, the option is only
   generated when a connection receives a connectivity-change indication
   from its local stack, or in response to a received connectivity-
   change TCP option from the peer.  A host MUST NOT send the option
   during a connection unless it was advertised by both sides during the
   SYN handshake.

   When a host receives a connectivity-change TCP option, it SHOULD
   respond to it as described in Section 5.2 and Section 5.3 only if
   CNTR != REMOTE_CCI_COUNT, i.e., the peer signals a new instance of a
   connectivity change that it has not previously signaled.  The host
   SHOULD NOT respond to the reception of a connectivity-change TCP
   option if CNTR = REMOTE_CCI_COUNT, because the option duplicates a
   previous connectivity-change indication.

   At the beginning of a connection, CCI_STATE MUST be set to CCI_IDLE.
   The option SHOULD be included in all outgoing ACKs or segments if
   CCI_STATE != CCI_IDLE and SHOULD NOT be included in any outgoing ACK
   or segment if CCI_STATE = CCI_IDLE.

   When sending the connectivity-change TCP option, CNTR MUST be set to
   current MY_CCI_COUNT and ECNT MUST be set to current
   REMOTE_CCI_COUNT.

   When a connection receives a connectivity-change indication from its
   local stack and decides to signal the local indication to the remote
   peer, it decrements its MY_CCI_COUNT, sets CCI_STATE to CCI_INITIATOR,
   and subsequently sends a connectivity-change TCP option in every ACK
   or data segment until CCI_STATE = CCI_IDLE.  It resets CCI_STATE from
   CCI_INITIATOR to CCI_IDLE when it sees its current MY_CCI_COUNT value
   echoed back as ECNT in a connectivity-change TCP option received from
   its peer.

   NOTE: As discussed before, a host may under certain circumstances
   decide not to signal a local connectivity-change indication to the
   remote peer.  In this case, MY_CCI_COUNT and CCI_STATE MUST NOT be
   altered.

   When a host receives a connectivity-change TCP option from its peer,
   it compares the received CNTR and the local REMOTE_CCI_COUNT.  If
   they match, no further action is required.  Otherwise, it MUST update
   REMOTE_CCI_COUNT to CNTR.  It MUST also set CCI_STATE to
   CCI_RESPONDER, unless both of the following conditions hold:

   o  CCI_STATE is CCI_INITIATOR, and

   o  the host has the higher initial sequence number of the two
      communicating hosts.

   CCI_STATE is reset from CCI_RESPONDER to CCI_IDLE when a host
   receives an ACK or segment from its peer that does not contain the
   connectivity-change TCP option.

   NOTE: The transition from CCI_INITIATOR to CCI_RESPONDER is only
   allowed if the host has the lower initial sequence number.  This
   prevents an infinite signaling loop in which both hosts remain in the
   CCI_RESPONDER state.  Without this rule, if the two peers
   simultaneously received connectivity-change indications from their
   local stacks and sent out connectivity-change TCP options, both peers
   would set CCI_STATE to CCI_RESPONDER and include the option in all
   subsequent ACKs and segments.  Neither peer would then reset
   CCI_STATE from CCI_RESPONDER to CCI_IDLE, because this transition is
   only performed when a host receives an ACK or segment that does not
   contain the connectivity-change TCP option.
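
   The signaling rules above can be sketched as a small state machine.
   This is an illustration under stated assumptions, not a reference
   implementation: the class and method names are hypothetical, the
   3-bit counter is assumed to wrap modulo 8 after reaching zero, and
   segment loss and reordering are not modeled.

```python
from enum import Enum


class CciState(Enum):
    IDLE = "CCI_IDLE"
    INITIATOR = "CCI_INITIATOR"
    RESPONDER = "CCI_RESPONDER"


class CciEndpoint:
    def __init__(self, has_higher_isn: bool):
        self.state = CciState.IDLE       # CCI_STATE starts as CCI_IDLE
        self.my_count = 7                # MY_CCI_COUNT, initial value 7
        self.remote_count = 7            # REMOTE_CCI_COUNT
        self.has_higher_isn = has_higher_isn

    def on_local_indication(self, signal_to_peer: bool = True) -> None:
        """Handle a connectivity-change indication from the local stack."""
        if not signal_to_peer:
            return  # counter and state are left unaltered
        self.my_count = (self.my_count - 1) % 8  # assumed modulo-8 wrap
        self.state = CciState.INITIATOR

    def option_to_send(self):
        """Return (CNTR, ECNT) for an outgoing segment, or None when idle."""
        if self.state is CciState.IDLE:
            return None
        return (self.my_count, self.remote_count)

    def on_segment(self, option) -> bool:
        """Process a received segment; option is (CNTR, ECNT) or None.

        Returns True if the peer signaled a new connectivity change.
        """
        if option is None:
            if self.state is CciState.RESPONDER:
                self.state = CciState.IDLE
            return False
        cntr, ecnt = option
        is_new = cntr != self.remote_count
        if is_new:
            self.remote_count = cntr
            # An initiator with the higher ISN stays an initiator.
            if not (self.state is CciState.INITIATOR and self.has_higher_isn):
                self.state = CciState.RESPONDER
        if self.state is CciState.INITIATOR and ecnt == self.my_count:
            self.state = CciState.IDLE   # our counter was echoed back
        return is_new
```

   In this sketch, when both endpoints receive local indications at the
   same time, only the endpoint with the lower initial sequence number
   moves to CCI_RESPONDER.  The other endpoint returns to CCI_IDLE once
   its counter is echoed and then sends a segment without the option,
   which lets the responder return to CCI_IDLE as well.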

5.2.  Re-Probing Path Characteristics

   When a TCP connection receives a connectivity-change indication and
   is not currently stalled, it MUST re-probe the path characteristics
   to avoid causing congestion along the potentially new path and to
   quickly probe the path's available capacity.  In principle, this is
   similar to the initial slow-start: The sender MUST NOT transmit more
   than the default initial window of data along the new path, in order
   to avoid congesting it, and the slow-start threshold (SS_THRESH)
   SHOULD be set to the initial value as with a new connection to allow
   for rapid probing of available capacity.  In addition, it MUST reset
   round-trip time measurement (RTTM) and the RTO timer.  If Path MTU
   Discovery (PMTUD) is activated, PMTUD state SHOULD also be reset
   [RFC1191][RFC1981].

   One difference from slow-start is that after a connectivity-change
   indication, the connection may have segments in flight towards the
   destination along a previous path.  Therefore, after a connectivity-
   change indication, congestion control MUST ignore any stale ACKs and
   MUST update the congestion window solely based on ACKs for data sent
   on the new path.

   In detail, when a connectivity-change indication is received, the
   sender MAY send INIT_WINDOW worth of data along the changed path and
   MUST reset the congestion control state, RTTM state, and RTO timer as
   if this were a new connection [RFC2581][RFC2988].  Each ACK that is
   received while CCI_STATE is not CCI_IDLE SHOULD be treated as a stale
   ACK.

   For each stale ACK received, a host MUST NOT adjust the congestion
   window and MUST NOT send any new data into the network.  This
   behavior SHOULD continue until CCI_STATE is CCI_IDLE again or there
   is a timeout.  Once CCI_STATE is set to CCI_IDLE, the sender should
   consider any un-ACK'ed segments below the highest received ACK as
   lost and discount them from the segments in flight.  The sender MUST
   use slow-start based loss recovery for these segments.
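
   The reset and stale-ACK rules of this section can be sketched as
   follows.  The constants are assumptions chosen for illustration: this
   document only refers to the "default initial window" and the "initial
   value" of the slow-start threshold, and the 3-second initial RTO
   follows [RFC2988].  Congestion avoidance, loss recovery and PMTUD
   state are omitted, and the class name is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

SMSS = 1460              # assumed sender maximum segment size, in bytes
INIT_WINDOW = 2 * SMSS   # assumed initial window
INIT_SSTHRESH = 65535    # assumed "arbitrarily high" initial SS_THRESH
INIT_RTO = 3.0           # seconds; initial RTO from RFC 2988


@dataclass
class PathState:
    cwnd: int = INIT_WINDOW
    ssthresh: int = INIT_SSTHRESH
    srtt: Optional[float] = None     # None means "no RTT sample yet"
    rttvar: Optional[float] = None
    rto: float = INIT_RTO

    def on_connectivity_change(self) -> None:
        """Reset congestion control, RTTM and RTO as for a new connection."""
        self.cwnd = INIT_WINDOW
        self.ssthresh = INIT_SSTHRESH
        self.srtt = None
        self.rttvar = None
        self.rto = INIT_RTO

    def on_ack(self, acked_bytes: int, cci_idle: bool) -> bool:
        """Return True if this ACK may trigger transmission of new data."""
        if not cci_idle:
            # Stale ACK: do not adjust cwnd and do not send new data.
            return False
        if self.cwnd < self.ssthresh:
            # Slow start: grow cwnd by at most one SMSS per ACK.
            self.cwnd += min(acked_bytes, SMSS)
        return True
```

   The cci_idle argument stands for the condition CCI_STATE = CCI_IDLE;
   how a stack exposes that state to its congestion control code is
   left open here.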

5.3.  Speculative Retransmission

   The basic idea behind the speculative retransmission is to allow TCP
   to resume stalled connections as soon as it receives an indication
   that connectivity to previously unreachable peers may have returned.

   When a TCP connection receives a connectivity-change indication -
   either from the local stack or in a connectivity-change TCP option
   from the peer - and is currently stalled, it MUST immediately
   initiate the standard retransmission procedure, just as if the RTO
   for the connection had expired.

   In addition, conforming TCP implementations SHOULD send at least one
   segment to the peer.  This segment MUST contain the connectivity-
   change TCP option to notify the peer and may either be a queued data
   retransmission or a pure ACK, if the connection has no data awaiting
   retransmission.
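
   A minimal sketch of the choice of segment for a stalled connection is
   shown below.  The Segment type, the function name and the
   representation of unacknowledged data as a queue of byte strings are
   hypothetical; a real stack would run its normal RTO-expiry procedure
   rather than build segments this way.

```python
from collections import deque
from typing import Deque, NamedTuple, Optional


class Segment(NamedTuple):
    payload: bytes             # empty payload represents a pure ACK
    carries_cci_option: bool   # whether the connectivity-change option is set


def speculative_retransmit(stalled: bool,
                           unacked: Deque[bytes]) -> Optional[Segment]:
    """Pick the segment to send when an indication arrives.

    Returns None for connections that are not stalled; those re-probe
    the path as described in Section 5.2 instead.
    """
    if not stalled:
        return None
    if unacked:
        # Act as if the RTO had expired: resend the oldest unacked data.
        return Segment(unacked[0], True)
    # No data awaiting retransmission: send a pure ACK with the option.
    return Segment(b"", True)
```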


6.  Security Considerations


   The only foreseen security considerations with the techniques
   presented in this document result from either an attacker's ability
   to spoof valid TCP segments with options that seemingly indicate
   connectivity changes, or an attacker's ability to generate bogus
   connectivity-change indications locally.  An attacker might produce a
   stream of such false indications that could keep a connection in
   slow-start at the initial window.  One possible defense against this
   type of attack is to rate-limit the response to connectivity-change
   indications (whether local or remote).  This attack is also probably
   less serious than other attacks that such an empowered adversary
   could perform, like resetting the connection or injecting data.  A
   similar effect could be achieved without the new option by forging
   duplicate ACKs that would keep a sender in loss recovery.  If the IP
   addresses, port numbers, and sequence numbers of both endpoints are
   guessable for a connection, then the connection should use an
   approved means (such as IPsec) [I-D.ietf-tcpm-tcp-antispoof] for
   protection against spoofed segments.
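
   The rate-limiting defense mentioned above could take many forms.  One
   possible sketch, assuming a simple minimum interval between accepted
   indications, is shown below.  The class name and the interval
   parameter are hypothetical; this document does not specify a
   particular limit or algorithm.

```python
import time


class IndicationRateLimiter:
    """Accept at most one indication per min_interval seconds."""

    def __init__(self, min_interval: float, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock       # injectable clock, useful for testing
        self._last = None        # time of the last accepted indication

    def allow(self) -> bool:
        """Return True if the connection should react to this indication."""
        now = self.clock()
        if self._last is not None and now - self._last < self.min_interval:
            return False         # too soon after the previous reaction
        self._last = now
        return True
```

   A connection would call allow() for each local or remote indication
   and skip the responses of Section 5.2 and Section 5.3 when it returns
   False.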


7.  Conclusion

   When connectivity characteristics between two hosts change abruptly,
   TCP can experience significant delays before resuming efficient
   transmission, or it can behave unfairly toward competing traffic.

   This document describes TCP extensions that improve transmission
   behavior in response to advisory, lower-layer connectivity-change
   indications.  The proposed TCP extensions modify the local behavior
   of TCP and introduce a new TCP option to signal local connectivity-
   change indications to remote peers.


8.  IANA Considerations

   This section is to be interpreted according to [RFC2434].

   This document does not define any new namespaces.  It uses an 8-bit
   TCP option number maintained by IANA at
   http://www.iana.org/assignments/tcp-parameters.


9.  Acknowledgments

   This draft combines and obsoletes [I-D.swami-tcp-lmdr] and
   [I-D.eggert-tcpm-tcp-retransmit-now].  The authors would like to
   thank Mark Allman, Marcus Brunner, Shashikant Maheshwari, Kacheong
   Poon, Juergen Quittek, Stefan Schmid and Joe Touch for their comments
   and suggestions on the two previous drafts.


   Lars Eggert and Simon Schuetz are partly funded by Ambient Networks,
   a research project supported by the European Commission under its
   Sixth Framework Program.  The views and conclusions contained herein
   are those of the authors and should not be interpreted as necessarily
   representing the official policies or endorsements, either expressed
   or implied, of the Ambient Networks project or the European
   Commission.

   Wesley Eddy's work on this document was performed at NASA's Glenn
   Research Center, while in support of the NASA Space Communications
   Architecture Working Group (SCAWG), and the FAA/Eurocontrol Future
   Communications Study (FCS).


10.  References

10.1.  Normative References

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
              RFC 793, September 1981.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2434]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 2434,
              October 1998.

   [RFC2581]  Allman, M., Paxson, V., and W. Stevens, "TCP Congestion
              Control", RFC 2581, April 1999.

   [RFC2988]  Paxson, V. and M. Allman, "Computing TCP's Retransmission
              Timer", RFC 2988, November 2000.

10.2.  Informative References

   [DRIVE-THRU]
              Ott, J. and D. Kutscher, "Drive-Thru Internet: IEEE
              802.11b for Automobile Users", Proc. Infocom 2004,
              March 2004.

   [DUKEHEND]
              Duke, M., Henderson, T., and J. Meegan, "Experience with
              'Link-UP Notification' Over a Mobile Satellite Link",
              ACM Computer Communication Review, Vol. 34, No. 3,
              July 2004.

   [ES05]     Eddy, W. and Y. Swami, "Adapting End-host Congestion
              Control for Mobility", NASA Glenn Research Center
              Technical Report, CR-2005-213838, July 2005.

   [I-D.dawkins-trigtran-linkup]
              Dawkins, S., "End-to-end, Implicit 'Link-Up'
              Notification", draft-dawkins-trigtran-linkup-01 (work in
              progress), October 2003.

   [I-D.eggert-tcpm-tcp-retransmit-now]
              Eggert, L., "TCP Extensions for Immediate
              Retransmissions", draft-eggert-tcpm-tcp-retransmit-now-02
              (work in progress), June 2005.

   [I-D.iab-link-indications]
              Aboba, B., "Architectural Implications of Link
              Indications", draft-iab-link-indications-04 (work in
              progress), December 2005.

   [I-D.ietf-dna-link-information]
              Yegin, A., "Link-layer Event Notifications for Detecting
              Network Attachments", draft-ietf-dna-link-information-03
              (work in progress), October 2005.

   [I-D.ietf-hip-mm]
              Nikander, P., "End-Host Mobility and Multihoming with the
              Host Identity Protocol", draft-ietf-hip-mm-03 (work in
              progress), March 2006.

   [I-D.ietf-tcpimpl-restart]
              Hughes, A., Touch, J., and J. Heidemann, "Issues in TCP
              Slow-Start Restart After Idle",
              draft-ietf-tcpimpl-restart-00 (work in progress),
              March 1998.

   [I-D.ietf-tcpm-tcp-antispoof]
              Touch, J., "Defending TCP Against Spoofing Attacks",
              draft-ietf-tcpm-tcp-antispoof-03 (work in progress),
              February 2006.

   [I-D.ietf-tcpm-tcp-uto]
              Eggert, L. and F. Gont, "TCP User Timeout Option",
              draft-ietf-tcpm-tcp-uto-02 (work in progress),
              October 2005.

   [I-D.ietf-tsvwg-quickstart]
              Floyd, S., "Quick-Start for TCP and IP",
              draft-ietf-tsvwg-quickstart-02 (work in progress),
              March 2006.


   [I-D.swami-tcp-lmdr]
              Swami, Y., "Lightweight Mobility Detection and Response
              (LMDR) Algorithm for TCP", draft-swami-tcp-lmdr-07 (work
              in progress), March 2006.

   [KOODLI]   Koodli, R. and C. Perkins, "Fast Handovers and Context
              Transfers in Mobile Networks", ACM Computer Communication
              Review, Vol. 31, No. 5, October 2001.

   [RFC1122]  Braden, R., "Requirements for Internet Hosts -
              Communication Layers", STD 3, RFC 1122, October 1989.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              November 1990.

   [RFC1981]  McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
              for IP version 6", RFC 1981, August 1996.

   [RFC2131]  Droms, R., "Dynamic Host Configuration Protocol",
              RFC 2131, March 1997.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, September 2001.

   [RFC3344]  Perkins, C., "IP Mobility Support for IPv4", RFC 3344,
              August 2002.

   [RFC3775]  Johnson, D., Perkins, C., and J. Arkko, "Mobility Support
              in IPv6", RFC 3775, June 2004.

   [RFC3819]  Karn, P., Bormann, C., Fairhurst, G., Grossman, D.,
              Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
              Wood, "Advice for Internet Subnetwork Designers", BCP 89,
              RFC 3819, July 2004.

   [RFC4306]  Kaufman, C., "Internet Key Exchange (IKEv2) Protocol",
              RFC 4306, December 2005.

   [SCHUETZ-CCR]
              Schuetz, S., Eggert, L., Schmid, S., and M. Brunner,
              "Protocol Enhancements for Intermittently Connected
              Hosts", ACM Computer Communication Review, Vol. 35, No. 3,
              July 2005.


   [SCOTT]    Scott, J. and G. Mapp, "Link layer-based TCP optimisation
              for disconnecting networks", ACM Computer Communication
              Review, Vol. 33, No. 5, October 2003.

Editorial Comments

   [anchor3]  LE: The authors have seen the idea of triggering
              retransmits based on connectivity events of directly-
              connected links attributed to Phil Karn ("kick" operation
              in the KA9Q TCP stack). Pointers to a citable reference
              would be highly appreciated!


Appendix A.  Document Revision History

   +----------+--------------------------------------------------------+
   | Revision | Comments                                               |
   +----------+--------------------------------------------------------+
   | 00       | Initial version. This document is a merge of and       |
   |          | obsoletes [I-D.eggert-tcpm-tcp-retransmit-now] and     |
   |          | [I-D.swami-tcp-lmdr].                                  |
   +----------+--------------------------------------------------------+


Authors' Addresses

   Simon Schuetz
   NEC Network Laboratories
   Kurfuerstenanlage 36
   Heidelberg  69115
   Germany

   Phone: +49 6221 4342 165
   Fax:   +49 6221 4342 155
   Email: simon.schuetz@netlab.nec.de
   URI:   http://www.netlab.nec.de/


   Lars Eggert
   NEC Network Laboratories
   Kurfuerstenanlage 36
   Heidelberg  69115
   Germany

   Phone: +49 6221 4342 143
   Fax:   +49 6221 4342 155
   Email: lars.eggert@netlab.nec.de
   URI:   http://www.netlab.nec.de/


   Wesley M. Eddy
   Verizon Federal Network Systems
   NASA Glenn Research Center
   21000 Brookpark Road, MS 54-5
   Cleveland, OH  44135
   USA

   Email: weddy@grc.nasa.gov


   Yogesh Prem Swami
   Nokia Research Center, Dallas
   6000 Connection Drive
   Irving, TX  75603
   USA

   Phone: +1 972 374 0669
   Email: yogesh.swami@nokia.com


   Khiem Le
   Nokia Research Center, Dallas
   6000 Connection Drive
   Irving, TX  75603
   USA

   Phone: +1 972 894 4882
   Email: khiem.le@nokia.com


Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Copyright Statement

   Copyright (C) The Internet Society (2006).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.

