
Internet Draft                                          Ian Heavens 
Expires December 15, 1996                               Fore Systems
                                                        June 1996


                        RSTs Considered Harmful
                   draft-heavens-problems-rsts-02.txt


Status of this Memo

   This memo is being distributed to members of the Internet community
   in order to solicit their reactions to the proposals contained in it.

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups.  Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as ``work in progress.''

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
   Directories on ds.internic.net (US East Coast), nic.nordu.net
   (Europe), ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific
   Rim).

Abstract

   This memo argues that the danger of accepting segments from old TCP
   connections arises for connections terminated by RST segments, as
   well as for those terminated by an exchange of FIN segments.  In
   addition, TIME-WAIT state alone does not provide complete protection.
   The likelihood of data corruption is significant, in that it exceeds
   the probability of corruption after FIN exchange, for which TIME-WAIT
   state was designed.













Heavens                                                         [Page 1]

Internet Draft          RSTs Considered Harmful                June 1996



           Table of Contents

   1.  Introduction

           1.1 Overview
           1.2 Background
           1.3 RST-Terminated Connections

   2.  Old Segment Acceptance from RST-Terminated Connections

           2.1 RST-Terminated Connections from Established State
           2.2 RST-Terminated Connections during Closedown
           2.3 Proof by Demonstration
           2.4 Other Hazards
           2.5 Relative Probabilities

   3.  TIME-WAIT after RST Transmission

           3.1 User Abort with TIME-WAIT
           3.2 RST Loss and Data Retransmission
           3.3 RST Loss and Idle Connections

   Appendix A: A Different Interpretation of RFC-1122
   Appendix B: Relative Probabilities of Hazards
   Appendix C: Traffic Statistics for TCP Connections

   Glossary

      o  FIN-Terminated Connection

         A synchronised TCP connection which terminates by the 3-way
         handshake, involving the exchange and reliable acknowledgement
         of FIN segments.

      o  RST-Terminated Connection

         A synchronised TCP connection which terminates by transmission
         or reception of a RST.

      o  MSL

         Maximum Segment Lifetime

1. Introduction

1.1 Overview

   Chapter 1 describes mechanisms for closing TCP connections, and the
   significance of the TIME-WAIT state.

   Chapter 2 identifies a series of connection terminations involving
   RSTs that may lead to data corruption.

   Chapter 3 shows how the use of TIME-WAIT state alone can provide some
   protection against this and identifies scenarios where this solution
   is insufficient.

1.2 Background

   FINs, RSTs, Timers and ICMP Messages

   There are four mechanisms available in [RFC-793] to close a TCP
   connection: FINs, RSTs, Timeouts and ICMP messages.

   FINs may be used to close down a connection in an orderly fashion,
   guaranteeing reliable delivery of all data segments transmitted
   before the FIN in each direction.  The requirement to reliably
   acknowledge FINs in both directions leads to a number of half-closed
   states:  FIN-WAIT-1, FIN-WAIT-2, CLOSING, CLOSE-WAIT, LAST-ACK and
   TIME-WAIT.

   A RST closes a connection abruptly, immediately removing connection
   state on transmission or reception.  There are no interim states;
   transition is to CLOSED on transmission or reception of a RST.

   Timeouts also close a connection abruptly;  a connection that times
   out optionally transmits a RST, or it may assume that the peer has
   disappeared. Timeouts also cause an immediate transition to CLOSED.

   ICMP messages do not usually terminate a synchronised connection, but
   it is possible. In the same way as connections terminated by RST or
   timeout, there is an immediate state transition to CLOSED.

   This memo restricts its attention to connections closed by FINs and
   RSTs.

   TIME-WAIT

   The TIME-WAIT state has two functions in the TCP protocol.  The first
   is asymmetric: to ensure the reliable acknowledgement of FINs
   transmitted in CLOSE-WAIT state and so the completion of the 3-way
   closing handshake.  The second is symmetric: to ensure that all TCP
   segments, generated in either direction during the lifetime of the
   connection, have drained from the network before initiation of a new
   incarnation of the connection.  The clock-based ISN protects slow
   connections against this threat [RFC-793].  For fast connections,
   which consume sequence space faster than the ISN clock advances, this
   is no longer true.  In this case, TIME-WAIT prevents the acceptance
   of old duplicate segments by a new incarnation utilising identical
   port numbers.  The relative threats are explained in the Appendix of
   [RFC-1185], and in section 1.2 of [RFC-1323].  The problem is
   summarised in relation to the danger of premature termination of
   TIME-WAIT state by RST reception (TIME-WAIT assassination) in
   [RFC-1337].

   No equivalent mechanism to TIME-WAIT exists for connections
   terminated by transmission of a RST segment.  Although RST
   transmission is omitted from the TCP Connection State Diagram, the
   text of [RFC-793] clearly states that where the transmission of a RST
   results in a state change, it is to CLOSED state.  Similarly,
   reception of a RST causes a state change to CLOSED.
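
   As a rough illustration of the distinction between slow and fast
   connections drawn above, the following Python sketch compares the
   sequence space consumed by an old connection with the advance of the
   ISN clock over the same period.  The tick of roughly 4 microseconds
   is taken from [RFC-793]; the function name and the example figures
   are illustrative assumptions, and 32-bit wraparound is ignored.

```python
# Sketch only: names and example figures are assumptions, not from the memo.
ISN_TICKS_PER_SECOND = 250_000  # RFC-793 ISN clock: one tick roughly every 4 microseconds


def isn_overrun(bytes_sent: int, lifetime_seconds: float) -> bool:
    """Return True if the old incarnation consumed more sequence space than
    the ISN clock advanced during its lifetime, so that an ISN chosen
    immediately afterwards may fall inside the old sequence range."""
    isn_advance = ISN_TICKS_PER_SECOND * lifetime_seconds
    return bytes_sent > isn_advance


# A 10-second transfer of 1 MB stays behind the ISN clock (slow connection);
# a 10-second transfer of 10 MB overruns it (fast connection).
slow = isn_overrun(1_000_000, 10.0)
fast = isn_overrun(10_000_000, 10.0)
print(slow, fast)
```

   Only in the second case does protection depend on some mechanism
   other than the ISN clock, such as TIME-WAIT.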

1.3 RST-Terminated Connections

   There are several ways in which previously synchronised connections
   are terminated by RST transmission.  These include User Abort
   [RFC-793] and reception of data after half-duplex close [RFC-1122].
   However, not all RSTs result in connection termination.  Reception of
   a SYN segment addressed to a port for which there is no listening
   socket results in transmission of a RST.  This is associated with no
   connection and is equivalent to an ICMP Port Unreachable.  The
   originator of the SYN changes state from SYN-SENT to CLOSED on
   reception of the RST, and the connection is never synchronised.
   Other connections in non-synchronised states respond to an
   unacceptable ACK, security or precedence mismatch by transmitting a
   RST.  In all these cases, no connection has been synchronised nor
   data sent, so that there is no danger of old data segments being
   accepted by subsequent incarnations of the connection.

   This memo distinguishes those synchronised connections which
   terminate by transmission or reception of a RST by referring to them
   as "RST-terminated connections".

2. Old Segment Acceptance from RST-Terminated Connections

   Several scenarios result in the spurious acceptance of old segments
   from RST-terminated connections.  Two types of examples are given
   here: connections aborted in Established state, and connections
   aborted during the 3-way closing handshake.

2.1 RST-Terminated Connections from Established State

   There are two instances of RST-terminated connections from
   Established state which involve the hazard of old data acceptance by
   a subsequent incarnation of the connection.

   The first is a User Abort issued in Established state; the second a
   half-duplex close with unread data [RFC-1122, p.88].  The sequence of
   events in both cases is identical: a RST is sent by the socket from
   Established state, as a result of an abort, or a close with pending
   unread data.

   In the worst failure mode, the socket issuing the abort is acting as
   a data sink.  In this case a window of data segments may be in
   transit when the RST is received at the data source.  Any of these
   segments - which are not duplicates - may corrupt a subsequent
   incarnation of the connection.

              TCP A                                                 TCP B

     1. ESTABL.  --> <SEQ=400><ACK=101><DATA=80><CTL=ACK> -->     ESTABL.

     2.              ...  <SEQ=101><ACK=480><CTL=ACK>       <--   ESTABL.

                                                                 (User Abort)
     3.                       ...  <SEQ=101><CTL=RST>        <--   CLOSED

     4. ESTABL.  --> <SEQ=480><ACK=101><DATA=80><CTL=ACK> ...

     5. ESTABL.  <-- <SEQ=101><ACK=480><CTL=ACK> ...

     6. ESTABL.  --> <SEQ=560><ACK=101><DATA=80><CTL=ACK> ...

     7. ESTABL.  --> <SEQ=640><ACK=101><DATA=80><CTL=ACK> ...

     8. CLOSED   <--  <SEQ=101><CTL=RST> ...

                  Figure 1.  Connection closed by User Abort

   This is shown in Figure 1.  TCP A is the data source and TCP B is the
   data sink.  Line 1 shows a normal data segment from TCP A.  An ACK
   segment is transmitted by TCP B on line 2.  The user of TCP B issues
   an abort, and TCP B transmits a RST and enters CLOSED state on line
   3, as specified in [RFC-793].  Normal data continues to be
   transmitted by TCP A on line 4.  Line 5 shows the arrival at TCP A of
   the ACK generated on line 2.  This may open the window and elicit
   further segments from TCP A on lines 6 and 7, until the arrival of
   the RST at TCP A on line 8.  At this point TCP A enters CLOSED state,
   and three data segments from TCP A are in transit to TCP B.

   The connection is reopened by the 3-way SYN handshake.  Assume that
   the clock-based ISN chosen by TCP A for the new connection has been
   overrun by the sequence number consumption in the previous
   incarnation of the connection.  The sequence numbers occupied by the
   last three segments transmitted by TCP A during the previous
   incarnation may overlap the window offered by TCP B in the current
   incarnation of the connection.
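
   The overlap can be checked with the segment acceptability test of
   [RFC-793].  The Python sketch below applies that test to the three
   segments left in transit in Figure 1.  The receiver state used here
   (next expected sequence number 500, window of 4096 octets) is an
   assumption chosen for illustration, as are the function names.

```python
MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def in_window(seq: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """True if seq lies in [rcv_nxt, rcv_nxt + rcv_wnd), modulo 2**32."""
    return (seq - rcv_nxt) % MOD < rcv_wnd


def segment_acceptable(seg_seq: int, seg_len: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """RFC-793 acceptability test: a segment is acceptable if its first
    or last octet falls inside the receive window."""
    if seg_len == 0:
        if rcv_wnd == 0:
            return seg_seq == rcv_nxt
        return in_window(seg_seq, rcv_nxt, rcv_wnd)
    if rcv_wnd == 0:
        return False
    last = (seg_seq + seg_len - 1) % MOD
    return in_window(seg_seq, rcv_nxt, rcv_wnd) or in_window(last, rcv_nxt, rcv_wnd)


# Assumed state of the new incarnation at TCP B (illustrative values).
RCV_NXT, RCV_WND = 500, 4096

# (sequence number, length) of the old segments in transit in Figure 1.
old_segments = [(480, 80), (560, 80), (640, 80)]
accepted = [segment_acceptable(seq, length, RCV_NXT, RCV_WND)
            for seq, length in old_segments]
print(accepted)
```

   Under these assumptions all three old segments pass the test, since
   nothing in the test itself distinguishes one incarnation of a
   connection from another.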

              TCP A                                                 TCP B

     1. ESTABL.  --> <SEQ=400><ACK=101><DATA=100><CTL=ACK>    -->  ESTABL.

     2. ESTABL.  <--     <SEQ=101><ACK=500><CTL=ACK>       <--     ESTABL.

     3.  (old segment)...<SEQ=560><ACK=101><DATA=80><CTL=ACK> -->  ESTABL.

     4. ESTABL.  <--      <SEQ=101><ACK=500><CTL=ACK>      <--     ESTABL.

     5. ESTABL.  --> <SEQ=500><ACK=101><DATA=100><CTL=ACK>    -->  ESTABL.

     6.             ...  <SEQ=101><ACK=640><CTL=ACK>       <--     ESTABL.

    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

     7a. ESTABL. --> <SEQ=600><ACK=101><DATA=100><CTL=ACK> -->     ESTABL.

     8a. ESTABL.  <--    <SEQ=101><ACK=640><CTL=ACK> ...

     9a. ESTABL. --> <SEQ=700><ACK=101><DATA=100><CTL=ACK> -->     ESTABL.

     10a ESTABL.  <--     <SEQ=101><ACK=800><CTL=ACK>      <--     ESTABL.

             Figure 2: Accepting One Old Segment

   Figure 2 shows the spurious acceptance of part of a segment from the
   previous incarnation of the connection.  Line 1 shows a normal data
   segment from TCP A after the SYN handshake has been completed.  Line
   2 shows the ACK of this segment, and line 3 shows the arrival of an
   old segment from the previous connection.  It falls within TCP B's
   current window and is queued in the TCP reassembly queue, as its
   sequence number exceeds the next expected sequence number.  Since
   there is a missing segment, the next ACK in line 4 acknowledges the
   previous bona fide segment, and TCP A does not detect acknowledgement
   of unsent data.  The next segment from the current connection arrives
   at TCP B in line 5.  At this point, part or all of the old segment is
   delivered to the user of TCP B, depending upon the implementation of
   the reassembly algorithm.  This behaviour is described in [RFC-1337].
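
   The sequence of events on lines 1 to 6 of Figure 2 can be replayed
   with a toy receiver.  The Python sketch below is a simplified model
   and not any particular implementation: it tracks data octet by octet,
   ignores the window limit and sequence wraparound, and assumes a
   "first arrival wins" reassembly policy, under which queued octets are
   kept when a later segment overlaps them.  The class and label names
   are hypothetical.

```python
class Receiver:
    """Toy TCP receiver with a per-octet reassembly queue."""

    def __init__(self, rcv_nxt: int) -> None:
        self.rcv_nxt = rcv_nxt
        self.queue = {}       # sequence number -> label of the octet queued there
        self.delivered = []   # (sequence number, label) in delivery order

    def receive(self, seq: int, length: int, label: str) -> int:
        """Queue a segment, deliver any in-order data, return the ACK value."""
        for s in range(seq, seq + length):
            if s >= self.rcv_nxt:
                # Assumed policy: an octet already queued is not overwritten.
                self.queue.setdefault(s, label)
        while self.rcv_nxt in self.queue:
            self.delivered.append((self.rcv_nxt, self.queue.pop(self.rcv_nxt)))
            self.rcv_nxt += 1
        return self.rcv_nxt


rx = Receiver(rcv_nxt=400)
acks = [
    rx.receive(400, 100, "new"),  # line 1: bona fide segment
    rx.receive(560, 80, "old"),   # line 3: old segment, queued out of order
    rx.receive(500, 100, "new"),  # line 5: fills the gap, releasing the old data
]
old_octets = sum(1 for _, label in rx.delivered if label == "old")
print(acks, old_octets)
```

   With this policy the ACK values match lines 2, 4 and 6 of Figure 2,
   and all 80 octets of the old segment reach the user.  A policy that
   prefers the most recent arrival would deliver only the octets beyond
   sequence number 600, which is the "part or all" distinction made
   above.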

   TCP B transmits the acknowledgement of the two previous segments in
   line 6.  TCP A transmits another segment on line 7a before the
   arrival of the acknowledgement in line 8a, and assumes that it is a
   partial acknowledgement of this segment.  Segment transmission and
   acknowledgement continue as usual on lines 9a and 10a.  Neither TCP A
   nor TCP B is aware of the spurious acceptance of old data by TCP B.

   To underscore the possibility of the erroneous acceptance of several
   old segments, Figure 3 shows the acceptance of two such segments.
   The exchange is identical to Figure 2 until 7a, when a second old
   segment from TCP A arrives at TCP B.  Since TCP B has already queued
   and delivered the first old segment from TCP A, the second begins at
   the next expected sequence number, and TCP B delivers the entire
   second old segment to the user.  TCP B transmits the acknowledgement
   on line 7b.  Line 8a and subsequent lines show the arrival of the
   acknowledgements of spurious segments and the transmission of further
   segments by TCP A.  The acknowledgements are accepted as valid, since
   TCP A has already transmitted past the sequence number acknowledged
   in the last ACK from TCP B.

              TCP A                                                 TCP B

     1. ESTABL.  --> <SEQ=400><ACK=101><DATA=100><CTL=ACK>    -->  ESTABL.

     2. ESTABL.  <--     <SEQ=101><ACK=500><CTL=ACK>       <--     ESTABL.

     3.  (old segment)...<SEQ=560><ACK=101><DATA=80><CTL=ACK> -->  ESTABL.

     4. ESTABL.  <--      <SEQ=101><ACK=500><CTL=ACK>      <--     ESTABL.

     5. ESTABL.  --> <SEQ=500><ACK=101><DATA=100><CTL=ACK>    -->  ESTABL.

     6.             ...  <SEQ=101><ACK=640><CTL=ACK>       <--     ESTABL.

    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

     7a.  (old segment)...<SEQ=640><ACK=101><DATA=80><CTL=ACK> --> ESTABL.

     7b.             ...  <SEQ=101><ACK=720><CTL=ACK>       <--    ESTABL.

     7c. ESTABL. --> <SEQ=600><ACK=101><DATA=100><CTL=ACK>    -->  ESTABL.

     7d.             ...  <SEQ=101><ACK=720><CTL=ACK>       <--    ESTABL.

     8a. ESTABL.  <--    <SEQ=101><ACK=640><CTL=ACK> ...

     9a. ESTABL. --> <SEQ=700><ACK=101><DATA=100><CTL=ACK>    -->  ESTABL.

     9b. ESTABL.  <--    <SEQ=101><ACK=720><CTL=ACK> ...

     9c. ESTABL.  <--    <SEQ=101><ACK=720><CTL=ACK> ...

     10a ESTABL.  <--     <SEQ=101><ACK=800><CTL=ACK>      <--     ESTABL.

             Figure 3: Accepting Two Old Segments

   These examples may be generalised to illustrate the arrival and
   acceptance of a window of old segments at TCP B.

   It is also possible for old segments to persist in the case where a
   user abort is issued on the socket acting as a data source.  This
   happens when the ensuing RST arrives before one or more of the data
   segments previously transmitted.  This is shown in Figure 4.

              TCP A                                                 TCP B

     1. ESTABL.  --> <SEQ=400><ACK=101><DATA=80><CTL=ACK> -->     ESTABL.

     2. ESTABL.  <--     <SEQ=101><ACK=480><CTL=ACK>       <--    ESTABL.

     3. ESTABL.  --> <SEQ=480><ACK=101><DATA=80><CTL=ACK> ...

     4. ESTABL.  --> <SEQ=560><ACK=101><DATA=80><CTL=ACK> ...

     5. ESTABL.  --> <SEQ=640><ACK=101><DATA=80><CTL=ACK> ...

     (User Abort)
     6. CLOSED   --> <SEQ=101><CTL=RST>

     7.                       ...  <SEQ=101><CTL=RST>        -->   CLOSED

     8.                <SEQ=480><ACK=101><DATA=80><CTL=ACK>  -->

     9.                <SEQ=560><ACK=101><DATA=80><CTL=ACK>  -->

     10.               <SEQ=640><ACK=101><DATA=80><CTL=ACK>  -->

                  Figure 4.  User Abort and RST Reordering

   The acceptance of old segments in transit on lines 8, 9 and 10 occurs
   in an identical fashion to the previous example, as shown in Figures
   2 and 3.

2.2 RST-Terminated Connections during Closedown

   RST-terminated connections also occur from states other than
   Established, during the 3-way closing handshake.  Two examples are
   User Abort [RFC-793] and Half Duplex Close [RFC-1122].

   User Abort during Closedown

   A user abort issued in FIN-WAIT-1, FIN-WAIT-2, CLOSING or CLOSE-WAIT
   states results in the transmission of a RST, and the socket enters
   CLOSED state [RFC-793].  The consequences of user abort in
   FIN-WAIT-1, FIN-WAIT-2 and CLOSE-WAIT are similar to the previous
   section; an entire window may be in transit when the RST is
   transmitted, if there is data in transfer in the opposite direction
   to that followed by the FIN.  In CLOSING state, the FIN, and all data
   segments, have been received by the peer before it transmits the RST,
   and no non-duplicate data segments are in the network.  In this case
   the danger reduces to that of old duplicate segments, as in a
   conventionally closed TCP connection.

   Data received after Half Duplex Close

   A host may implement a half-duplex TCP close, where an application
   that has called CLOSE cannot continue to read data from the
   connection [RFC-1122].  Subsequent arrival of data elicits a RST.
   RFC-1122 does not explicitly state whether the connection enters
   CLOSED state.  In this section the assumption is made that it does.
   Appendix A shows the results if this assumption is invalid.  The
   danger of acceptance of old segments still exists in the latter case.

   It is straightforward to demonstrate this scenario.  Berkeley UNIX
   implementations of FTP [RFC-959] abort transfers in this fashion when
   the receiver cannot write out the file to disk, because the disk is
   full or because the file is too large.  Figure 5 shows this scenario.
   TCP A is an 80386 running Interactive UNIX with SpiderTCP, and TCP B
   is a Sparcstation running SunOS 4.1.3.  An FTP client is started from
   TCP A and the 'get' command is used to download a file from TCP B.
   TCP A aborts the connection because the file limit is reached.  The
   FTP control connection is closed first and then the data connection.
   Further data arrives from TCP B.  Since this arrives in FIN-WAIT-2,
   and BSD TCP/IP implements half duplex close, it elicits a RST from
   TCP A [RFC-1122], and TIME-WAIT state is bypassed.  Note that Figure
   5 shows only the FTP data connection, not the control connection.

          TCP A                                                TCP B

      1. ESTABL.  <-- <SEQ=220><ACK=100><DATA=80><CTL=ACK> <--  ESTABL.

      2. ESTABL.  -->     <SEQ=100><ACK=300><CTL=ACK>  -->      ESTABL.

          (File Too Large: Close)
      3.  FIN-WAIT-1  --> <SEQ=100><ACK=300><CTL=FIN,ACK>  --> CLOSE-WAIT

      4.  FIN-WAIT-2  <-- <SEQ=300><ACK=101><CTL=ACK>      <-- CLOSE-WAIT

      5.  FIN-WAIT-2  <-- <SEQ=300><ACK=101><DATA=80>      <-- CLOSE-WAIT

      6.  CLOSED      --> <SEQ=101><CTL=RST>               --> CLOSED

                  Figure 5.  Data Received after Half Duplex Close

   If the ACK in line 4 is delayed or lost, TCP A is still in FIN-WAIT-1
   in line 5, when the data arrives.  A RST is transmitted and there is
   a state transition to CLOSED, as above.  For both these scenarios,
   the danger of acceptance by a subsequent incarnation of the
   connection occurs in identical fashion to Figure 2.

2.3 Proof by Demonstration

   The hazards described in this memo could be shown with the testbed
   used to demonstrate the hazards of TIME-WAIT assassination in
   [RFC-1337].  This might involve a client application acting as a data
   source, and a server which, on receipt of the first data segment,
   transmits a RST and closes the connection.  Repetition of this over
   a long period should cause the server to accept an old segment from a
   previous incarnation as described in Figure 2 above.  No duplication
   of segments is required within the testbed, unlike demonstration of
   TIME-WAIT Assassination.

2.4 Other Hazards

   Two other hazards exist as a result of RST-terminated connections: a
   de-synchronised connection as a result of an old ACK that is
   acceptable but acknowledges something not yet sent, and connection
   failure, also as a result of receiving an old ACK.  The ACKs, like
   data, need not be duplicate segments.  [RFC-1337] shows how these two
   hazards, referred to as H2 and H3, occur; this memo concentrates on
   examples of the hazard, referred to as H1 in [RFC-1337], of erroneous
   acceptance of old segments containing data.

2.5 Relative Probabilities

   Although RSTs are less common than FINs as a means of closing
   connections, the likelihood of data arriving after closedown is
   higher.  Appendix B derives a ratio of probability based on observed
   traffic statistics.  Though an informal analysis, it implies that
   there is a significant risk in using RSTs to close connections.

3. TIME-WAIT after RST Transmission

   One solution to the dangers presented in the previous section
   involves the extension of the TIME-WAIT state to RST-terminated
   connections.  This turns out to offer only partial protection against
   data corruption.

   TIME-WAIT state must be entered by the TCP endpoint that sends the
   RST; if instead the receiver of the RST were to enter TIME-WAIT, loss
   of the RST would mean that there is no TIME-WAIT state and the risk
   of data corruption still exists.

   Under this scheme, a connection in any of SYN-RECVD, ESTABLISHED,
   FIN-WAIT-1, FIN-WAIT-2, CLOSING and CLOSE-WAIT states enters
   TIME-WAIT state on transmission of a RST, rather than CLOSED.
   Reception of a RST causes a transition to CLOSED as in [RFC-793].
   Minor modifications to the semantics of TIME-WAIT are required: if
   entered after RST transmission, reception of all further valid
   non-RST segments elicits a RST, rather than an ACK, and the TIME-WAIT
   timer is restarted.  Received RSTs are ignored in TIME-WAIT, as
   proposed by fix F1 in [RFC-1337].
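
   The modified TIME-WAIT semantics proposed in this chapter can be
   summarised as a small state model.  The Python sketch below is an
   illustration under stated assumptions rather than an implementation:
   it models only a TIME-WAIT state entered after RST transmission,
   omits the check that an arriving segment is valid, and takes 2 MSL to
   be 240 seconds (an MSL of 2 minutes, as in [RFC-793]).  The class and
   method names are hypothetical.

```python
# States from which RST transmission leads to TIME-WAIT under the proposal.
RST_TO_TIME_WAIT = {
    "SYN-RECVD", "ESTABLISHED", "FIN-WAIT-1",
    "FIN-WAIT-2", "CLOSING", "CLOSE-WAIT",
}


class Endpoint:
    def __init__(self, state: str, two_msl: float = 240.0) -> None:
        self.state = state
        self.two_msl = two_msl      # assumed: MSL of 2 minutes
        self.timer_expiry = None

    def send_rst(self, now: float) -> None:
        """Transmit a RST: enter TIME-WAIT rather than CLOSED."""
        if self.state in RST_TO_TIME_WAIT:
            self.state = "TIME-WAIT"
            self.timer_expiry = now + self.two_msl

    def on_segment(self, now: float, is_rst: bool):
        """Handle an arriving segment; return the reply sent, if any."""
        if self.state != "TIME-WAIT":
            if is_rst:
                self.state = "CLOSED"   # RST reception, as in RFC-793
            return None                 # other processing is out of scope
        if is_rst:
            return None                 # RSTs are ignored in TIME-WAIT (fix F1)
        self.timer_expiry = now + self.two_msl  # restart the TIME-WAIT timer
        return "RST"                    # reply with a RST, not an ACK

    def tick(self, now: float) -> None:
        """Expire TIME-WAIT once the timer has run out."""
        if self.state == "TIME-WAIT" and now >= self.timer_expiry:
            self.state = "CLOSED"


tcp_b = Endpoint("ESTABLISHED")
tcp_b.send_rst(now=0.0)                         # user abort
reply = tcp_b.on_segment(now=10.0, is_rst=False)  # late data segment arrives
tcp_b.tick(now=249.0)
state_before_expiry = tcp_b.state
tcp_b.tick(now=250.0)
state_after_expiry = tcp_b.state
print(reply, state_before_expiry, state_after_expiry)
```

   Restarting the timer on each late segment is what keeps the old
   incarnation quarantined for as long as its segments keep arriving.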


3.1 User Abort with TIME-WAIT

   This solution is shown in Figure 6 for the case of User Abort in
   ESTABLISHED state.  The hazards outlined in Figures 2 and 3 are less
   likely to occur.

              TCP A                                                 TCP B

     1. ESTABL.  --> <SEQ=400><ACK=101><DATA=80><CTL=ACK> -->     ESTABL.

     2.              ...  <SEQ=101><ACK=480><CTL=ACK>       <--   ESTABL.

                                                                 (User Abort)
     3.                       ...  <SEQ=101><CTL=RST>        <--  TIME-WAIT

     4. ESTABL.  --> <SEQ=480><ACK=101><DATA=80><CTL=ACK> ...

     5. ESTABL.  <-- <SEQ=101><ACK=480><CTL=ACK> ...

     6. ESTABL.  --> <SEQ=560><ACK=101><DATA=80><CTL=ACK> ...

     7. ESTABL.  --> <SEQ=640><ACK=101><DATA=80><CTL=ACK> ...

     8. CLOSED   <--  <SEQ=101><CTL=RST> ...

     9.          ... <SEQ=480><ACK=101><DATA=80><CTL=ACK>    -->  TIME-WAIT

     10. CLOSED          <--  <SEQ=101><CTL=RST>             <--  TIME-WAIT

     11.          ... <SEQ=560><ACK=101><DATA=80><CTL=ACK>    --> TIME-WAIT

     12. CLOSED          <--  <SEQ=101><CTL=RST>             <--  TIME-WAIT

     13.          ... <SEQ=640><ACK=101><DATA=80><CTL=ACK>    --> TIME-WAIT

     14. CLOSED          <--  <SEQ=101><CTL=RST>             <--  TIME-WAIT

     15.                                                           (2 MSL)
                                                                   CLOSED
                  Figure 6.  Connection Closed by User Abort

   The solution outlined above offers partial protection against data
   corruption hazards arising from RST-terminated connections.  However,
   delay or loss of a RST gives rise to a potential hazard.

   For TIME-WAIT state to provide full protection, it must commence
   after both ends of a connection have stopped transmitting data.  This
   is guaranteed for the peer that enters TIME-WAIT, since it has
   transmitted a RST and no data can follow this.  The transition to
   TIME-WAIT must also take place after the other peer has ceased data
   transmission.  The 3-way closing handshake enforces this for
   conventionally closed connections;  TIME-WAIT state is always entered
   after the CLOSE-WAIT to LAST-ACK transition at the last peer to
   transmit data.

   The lack of an equivalent mechanism for RST-terminated connections
   leads to situations where the effective TIME-WAIT state is truncated
   or vanishes completely.

3.2 RST Loss and Data Retransmission

   Figure 7 shows a scenario where TCP A is retransmitting data
   segments, lost because of network congestion.  Owing to exponential
   backoff, as described in [RFC-1122], the interval between successive
   retransmissions is now the 60 second limit common to many TCP
   implementations.  TCP B gives up and aborts the connection, entering
   TIME-WAIT state as mandated by the partial solution in chapter 3.
   The ensuing RST is lost, as the network is still congested.  TCP A
   continues to retransmit.  At some point network congestion eases, and
   a retransmitted data segment reaches TCP B.  A new incarnation of the
   connection may be in existence, and the data segment may be
   erroneously accepted.


              TCP A                                                 TCP B

     1. ESTABL.  --> <SEQ=400><ACK=101><DATA=80><CTL=ACK>         ESTABL.
                   (lost)
                                                                 (User Abort)
     2.                       ...  <SEQ=101><CTL=RST>        <--  TIME-WAIT
                           (lost)

        (RTX after 60 seconds)
     3. ESTABL.  --> <SEQ=400><ACK=101><DATA=80><CTL=ACK>         TIME-WAIT
                   (lost)

        (RTX after 60 seconds)
     4. ESTABL.  --> <SEQ=400><ACK=101><DATA=80><CTL=ACK>         TIME-WAIT
                   (lost)

        (RTX after 60 seconds)
     5. ESTABL.  --> <SEQ=400><ACK=101><DATA=80><CTL=ACK>         TIME-WAIT
                   (lost)

        (RTX after 60 seconds)
     6. ESTABL.  --> <SEQ=400><ACK=101><DATA=80><CTL=ACK>         TIME-WAIT
                   (lost)
                                                                  (2 MSL)
     7.                                                           CLOSED

        (RTX after 60 seconds)
     8. ESTABL.  --> <SEQ=400><ACK=101><DATA=80><CTL=ACK> ...

                   Figure 7.  RST Loss and Data Retransmission
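
   The timing in Figure 7 can be worked through numerically.  The
   Python sketch below lists the retransmissions that leave TCP A after
   TIME-WAIT has expired at TCP B.  It assumes that 2 MSL is 240 seconds
   (an MSL of 2 minutes, as in [RFC-793]), that TCP B aborts one second
   after the first transmission, that every segment before expiry is
   lost so the TIME-WAIT timer is never restarted, and that network
   transit time is negligible.  The function name is hypothetical.

```python
def late_retransmissions(abort_time: float, two_msl: float,
                         rtx_interval: float, count: int) -> list:
    """Return the send times (seconds after the first transmission) of
    retransmissions sent after the aborting peer has left TIME-WAIT."""
    time_wait_end = abort_time + two_msl
    rtx_times = [rtx_interval * n for n in range(1, count + 1)]
    return [t for t in rtx_times if t > time_wait_end]


# Five retransmissions at the 60 second limit, as on lines 3 to 8 of Figure 7.
late = late_retransmissions(abort_time=1.0, two_msl=240.0,
                            rtx_interval=60.0, count=5)
print(late)
```

   Under these assumptions only the fifth retransmission, sent at 300
   seconds, falls outside TIME-WAIT, which corresponds to line 8 of
   Figure 7.  A shorter MSL, as used by some implementations, exposes
   more of the retransmissions.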

3.3 RST Loss and Idle Connections

   It is not necessary for data transmission to be in progress for the
   above hazard to occur.  Consider the case where the user aborts an
   idle connection, as shown in Figure 8.  TCP B issues the abort, and
   enters TIME-WAIT.  The RST is lost, so that TCP A remains in
   ESTABLISHED state.  No activity occurs until TCP A tries to transmit
   data, an interval that is unbounded, and so may exceed twice the MSL.
   The data segment may be erroneously accepted at TCP B by a subsequent
   incarnation of the connection.


              TCP A                                                 TCP B

     1. ESTABL.                                                   ESTABL.

                                                                  (User Abort)
     2.                       ...  <SEQ=101><CTL=RST>        <--  TIME-WAIT
                           (lost)
                                                                  (2 MSL)
     3.                                                           CLOSED

      (Interval > 2MSL)
     4. ESTABL.  --> <SEQ=400><ACK=101><DATA=80><CTL=ACK> ...

                   Figure 8.  RST Loss and Idle Connections

   Security Considerations

   Security issues are not discussed in this memo.

   References

   [Congestion]
      Jacobson, V., "Congestion Avoidance and Control", ACM SIGCOMM-88,
      August 1988.

   [RFC-792]
      Postel, J., "Internet Control Message Protocol", RFC-792,
      USC/Information Sciences Institute, September 1981.

   [RFC-793]
      Postel, J., "Transmission Control Protocol", RFC-793,
      USC/Information Sciences Institute, September 1981.

   [RFC-959]
      Postel, J. and J. Reynolds, "File Transfer Protocol", RFC-959,
      USC/Information Sciences Institute, October 1985.

   [RFC-1122]
      Braden, R., "Requirements for Internet Hosts -- Communication
      Layers", RFC-1122, October 1989.

   [RFC-1185]
      Jacobson, V., Braden, R. and L. Zhang, "TCP Extension for
      High-Speed Paths", RFC-1185, Lawrence Berkeley Labs,
      USC/Information Sciences Institute, and Xerox Palo Alto Research
      Center, October 1990.

   [RFC-1191]
      Mogul, J. and S. Deering, "Path MTU Discovery", RFC-1191,
      November 1990.

   [RFC-1323]
      Jacobson, V., Braden, R. and D. Borman, "TCP Extensions for High
      Performance", RFC-1323, Lawrence Berkeley Labs, USC/Information
      Sciences Institute, and Cray Research, May 1992.

   [RFC-1337]
      Braden, R., "TIME-WAIT Assassination Hazards in TCP", RFC-1337,
      USC/Information Sciences Institute, May 1992.

   [TCP/IP-Illustrated]
      Wright, G. and R. Stevens, "TCP/IP Illustrated, Volume 2",
      Addison-Wesley, 1995.





   Acknowledgements

   Thanks to Alan Cox and Jon Crowcroft for their comments on previous
   expanded versions of this memo, and to Bob Braden for [RFC-1337],
   which stimulated ideas leading to it.

   Author's Address

   Ian Heavens
   Fore Systems Inc.
   2475 The Crescent,
   Solihull Parkway
   Birmingham Business Park
   B37 7YE
   United Kingdom

   Phone: +44 (0)121 717 4444
   Fax:   +44 (0)121 717 4455
   Email: iheavens@fore.co.uk


4. Appendix A: A Different Interpretation of RFC-1122

   Problems arise if [RFC-1122] is interpreted as requiring TCP to
   respond to data arriving after a half duplex close with a RST but
   no state change.  If data arrives at TCP A in FIN-WAIT-2, TCP B
   closes on receiving the RST while TCP A remains in FIN-WAIT-2, and
   the connection hangs, as Figure 9 shows.

          TCP A                                                TCP B

      1.  ESTABLISHED                                          ESTABLISHED

          (Close)
      2.  FIN-WAIT-1  --> <SEQ=100><ACK=300><CTL=FIN,ACK>  --> CLOSE-WAIT

      3.  FIN-WAIT-2  <-- <SEQ=300><ACK=101><CTL=ACK>      <-- CLOSE-WAIT

      4.  FIN-WAIT-2  <-- <SEQ=300><ACK=101><DATA=30>      <-- CLOSE-WAIT
                   (user data after half duplex close)

      5.  FIN-WAIT-2  --> <SEQ=101><CTL=RST>               --> CLOSED

            Figure 9.  Data Received in FIN-WAIT-2 after Half Duplex Close

   If the ACK of the FIN is lost or delayed, and data arrives in
   FIN-WAIT-1, the connection terminates without entering TIME-WAIT
   state.  This is shown in Figure 10.

          TCP A                                                TCP B

      1.  ESTABLISHED                                          ESTABLISHED

          (Close)
      2.  FIN-WAIT-1  --> <SEQ=100><ACK=300><CTL=FIN,ACK>  --> CLOSE-WAIT

      3.     (lost)   ... <SEQ=300><ACK=101><CTL=ACK>      <-- CLOSE-WAIT

      4.  FIN-WAIT-1  <-- <SEQ=300><ACK=101><DATA=30>      <-- CLOSE-WAIT
                           (user data after half duplex close)

      5.  FIN-WAIT-1      --> <SEQ=101><CTL=RST>           --> CLOSED

      6.  FIN-WAIT-1  --> <SEQ=100><ACK=300><CTL=FIN,ACK>  --> CLOSED

      7.  CLOSED      <-- <SEQ=300><CTL=RST>               <-- CLOSED

             Figure 10.  Data Received in FIN-WAIT-1 after Half Duplex Close
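
   The two outcomes in Figures 9 and 10 can be traced with a toy model.
   The following is a minimal sketch, not a TCP implementation: the
   function name and the trace format are illustrative assumptions, and
   only the state changes shown in the figures are modelled.

```python
# Sketch only: a toy model of the interpretation discussed above.
# All names are illustrative; no real segments are built or sent.

def trace_after_data(a_state: str) -> list[tuple[str, str]]:
    """Trace (TCP A state, TCP B state) once B's data reaches A.

    a_state is A's state when the data arrives: "FIN-WAIT-2" if the ACK
    of A's FIN was received (Figure 9), "FIN-WAIT-1" if it was lost
    (Figure 10).
    """
    if a_state not in ("FIN-WAIT-1", "FIN-WAIT-2"):
        raise ValueError("A must be in FIN-WAIT-1 or FIN-WAIT-2")

    b_state = "CLOSE-WAIT"
    trace = [(a_state, b_state)]

    # A answers the data with a RST and, under this interpretation, does
    # not change state.  B receives the RST and closes.
    b_state = "CLOSED"
    trace.append((a_state, b_state))

    if a_state == "FIN-WAIT-1":
        # A's FIN is still unacknowledged, so A retransmits it.  B holds
        # no connection state and replies with a RST, which closes A
        # without passing through TIME-WAIT.
        a_state = "CLOSED"
        trace.append((a_state, b_state))

    # In FIN-WAIT-2, A has nothing to retransmit and B will never send a
    # FIN, so A stays where it is.
    return trace


for start in ("FIN-WAIT-2", "FIN-WAIT-1"):
    print(start, "->", trace_after_data(start))
```

   In neither trace does TCP A pass through TIME-WAIT; in the
   FIN-WAIT-2 case it never leaves FIN-WAIT-2 at all.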







5. Appendix B: Relative Probabilities of Hazards

5.1 Introduction

   This section contains a less than rigorous analysis of the relative
   probabilities of the various data corruption hazards.  Note that
   these probabilities are zero for TCP connections operating below 250
   kbytes/second; the initial sequence number selection protects against
   data corruption hazards, regardless of the mechanism for closing the
   connection.
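
   The 250 kbytes/second threshold matches the rate of the initial
   sequence number clock in [RFC-793], which advances roughly once
   every 4 microseconds.  The following is a minimal sketch of that
   arithmetic, assuming a connection that sends at a constant rate and
   ignoring sequence number wraparound; the names are illustrative.

```python
# Sketch only: names and the constant-rate model are illustrative assumptions.
ISN_TICK_MICROSECONDS = 4  # [RFC-793]: ISN clock advances roughly every 4 us


def isn_clock_rate() -> int:
    """Sequence numbers generated per second by the ISN clock."""
    return 1_000_000 // ISN_TICK_MICROSECONDS


def new_isn_is_ahead(send_rate: float, lifetime: float) -> bool:
    """True if the ISN of a new incarnation, opened `lifetime` seconds
    after the old one, lies beyond every sequence number the old
    incarnation used while sending `send_rate` bytes per second."""
    old_highest = send_rate * lifetime       # offset from the old ISN
    new_isn = isn_clock_rate() * lifetime    # offset from the old ISN
    return new_isn > old_highest


print(isn_clock_rate())                    # 250000
print(new_isn_is_ahead(100_000, 60.0))     # True
print(new_isn_is_ahead(1_000_000, 60.0))   # False
```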

5.2 FIN, RST, Timer and ICMP Related Hazards

   It is useful to compare the relative probabilities of hazards arising
   from FIN-, RST-, Timer- and ICMP-terminated TCP connections.

   The probability of each hazard is proportional to the amount of data
   received after transition to CLOSED.  Complete protection requires
   that this be guaranteed to be zero.  Data received after connection
   closure does not cause data corruption, unless it falls within the
   current window of a new incarnation of the connection.

   It is assumed that the connection peer displaying the hazard is act-
   ing as a data sink, maximising the data received and the probability
   of failure.  If the proportion of TCP connections acting as data
   sinks or data sources is the same regardless of how the connection
   terminates, the relative probabilities remain the same.

   To simplify the arithmetic, higher order effects are ignored; for
   instance, those arising from the loss of more than one TCP segment in
   the period considered.

   The three hazards considered are data corruption arising from the
   following:

   o  Hazard 1: A FIN-terminated TCP connection with TIME-WAIT state
      omitted.

   o  Hazard 2: A TCP connection aborted from Established state, with
      neither TIME-WAIT nor LAST-ACK states.

   o  Hazard 3: A TCP connection aborted from Established state, with
      TIME-WAIT but without LAST-ACK state.

   Other hazards, such as connections aborted during closedown, by
   timeouts, or by ICMP messages, are ignored.  These are much less
   likely than Hazard 2.  The duration of closedown is typically much
   shorter than that of Established state.  Timeouts require the loss
   of multiple segments in the network and represent higher order
   effects, with correspondingly lower probabilities.  ICMP
   termination of synchronised connections is very rare.

   Nomenclature

           P1 - Relative probability of Hazard 1
           P2 - Relative probability of Hazard 2
           PL - Probability of loss of a TCP segment in the network
           PR - Probability that a TCP connection terminates by RST
           PT - Probability that a TCP connection terminates by timeout
           PI - Probability that a TCP connection terminates by ICMP message
           MSS - Maximum Segment Size
           W -  Maximum offered TCP window


   o  Hazard 1

      Duplicate segments received after FIN-terminated connections
      usually arise because of the loss of an ACK, triggering an
      unnecessary retransmission.  Slow start [Congestion] implies that
      only one segment will be retransmitted without acknowledgement.
      The relative probability of Hazard 1 is the segment size
      multiplied by the probability of segment loss and the probability
      of termination by FIN handshake:

              P1 = MSS * PL * (1 - PR - PT - PI) ~= MSS * PL

      ignoring higher order effects.

   o  Hazard 2

      For a data sink, transmission of a RST in Established state and
      transition to CLOSED state is followed by reception of up to a
      window of data, all of which may be received during a subsequent
      incarnation of the connection.

      The relative probability of Hazard 2 is the window size
      multiplied by the probability of termination by RST:

              P2 = W * PR


   o  Hazard 3

      In this case, a RST is lost.  Any data received in TIME-WAIT
      causes the TIME-WAIT timer to restart, so the hazard only occurs
      if the gap between reception of segments exceeds the duration of
      TIME-WAIT state.  This occurs if several retransmitted segments
      are lost, which is a higher order effect with low probability, or
      if an application spontaneously transmits data after this time,
      which is also unlikely.  This hazard can be ignored.

5.3 Relative Probabilities of FIN- and RST-related Hazards

   The ratio of the probabilities of Hazard 2 and Hazard 1 is

           P2/P1 =  W/MSS * PR/PL

   Example Calculation

   If Path MTU Discovery [RFC-1191] is supported, the segment size is
   the Maximum Segment Size indicated by the smallest physical packet
   size on the connection path, unless negotiated to be lower during
   connection establishment.  Implementation of [RFC-1191] is not yet
   widespread, so the default figure is assumed [RFC-1122, 3.3].

           TCP segment size = 576 - size of TCP and IP headers = 536

   Assume a window size of 32K.  Appendix C summarises statistics about
   TCP connections, measured on six machines.  Taking the average
   percentage values of PR=1.1 and PL=1.2 derived from Appendix C:

           P2/P1 =  W/MSS * PR/PL = 32768/536 * 1.1/1.2 = 56.

   For TCP connections on the same physical network, or where Path MTU
   Discovery is supported, the segment size is larger and the relative
   probability correspondingly smaller.

   The lowest ratio consistent with the data in Appendix C can be
   calculated from the highest value of PL (2.9) and the lowest value
   of PR (0.8):

           P2/P1 =  W/MSS * PR/PL = 32768/536 * 0.8/2.9 = 17.

   It can be concluded that erroneous acceptance of data from expired
   connections is significantly more likely to occur as a result of
   RST-terminated connections than the equivalent hazard after FIN-
   terminated connections.
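
   The two ratios quoted in this section can be reproduced with a few
   lines of arithmetic.  The following is a minimal sketch; the function
   and parameter names are illustrative, and the 40 bytes of headers
   assume a TCP header and an IP header without options.

```python
# Sketch only: function and parameter names are illustrative.

def hazard_ratio(window: int, mss: int, pr: float, pl: float) -> float:
    """P2/P1 = (W / MSS) * (PR / PL), ignoring higher order effects."""
    return (window / mss) * (pr / pl)


MSS = 576 - 40       # default datagram size less TCP and IP headers
WINDOW = 32 * 1024   # the 32K window assumed in the example

# PR and PL are percentages; only their ratio matters here.
average = hazard_ratio(WINDOW, MSS, pr=1.1, pl=1.2)  # average PR and PL
lowest = hazard_ratio(WINDOW, MSS, pr=0.8, pl=2.9)   # lowest PR, highest PL

print(f"average: {average:.0f}, lowest: {lowest:.0f}")
# prints "average: 56, lowest: 17"
```

   Because the ratio scales linearly with W/MSS, a larger segment size
   or a smaller offered window reduces it in direct proportion.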











6. Appendix C: Traffic Statistics for TCP Connections

   Statistics were measured using the netstat program on six machines:

   [1] A home workstation (VMS) used for telecommuting via a 56Kb Frame
      Relay link to the Internet.

   [2] A DNS and mail gateway (VMS) at the University of Tucson,
      Arizona.

   [3] A personal workstation (SunOS 4.1.3) on Spider Systems' (now
      Shiva Corporation) corporate LAN.

   [4] The BSD development system (BSD4.4-Lite) at the Computer Science
      department, Berkeley, California (taken from [TCP/IP-Illustrated],
      p.799).

   [5] A file server (SunOS 4.1.3) on Spider Systems' corporate LAN.

   [6] An application gateway (SunOS 4.1.3) between Spider Systems' cor-
      porate LAN and the Internet.

   The columns show statistics collected by the BSD netstat utility or
   its VMS equivalent, with the exception of machine uptime.  The
   derivation of the statistics from the BSD TCP/IP "tcpstat" structure
   is shown in parentheses.

   o machine (M)

   o time in days that the machine has been up (U)

   o number of TCP connections established (tcpstat.tcps_connects).

   o number of TCP connections aborted by RST transmission, expressed as
     a sum of the total aborted excluding those aborted by reception of
     data after half duplex close, and those aborted after half duplex
     close ((tcpstat.tcps_drops - tcpstat.tcps_rcvafterclose) +
     tcpstat.tcps_rcvafterclose).

   o number of TCP connections timed out expressed as a sum of the
     number timed out by retransmissions and keepalives
     (tcpstat.tcps_timeoutdrop + tcpstat.tcps_keeptimeo).

   o total number of TCP data segments transmitted, excluding
     retransmissions (tcpstat.tcps_sndpack -
     tcpstat.tcps_sndrexmitpack).

   o total number of TCP data segments retransmitted
     (tcpstat.tcps_sndrexmitpack).


           M  U    Establ. Dropped     Timed Out   TXed Segs   RTXed Segs.
           1  2    408     4+1         263+1       135168      250
           2  5    46632   456+102     7338+551    317523      4756
           3  ?    138682  13349+3686  79+2345     22761633    104440
           4  30   126820  44+1017     86+3219     8920528     257295
           5  20   13557   198+205     43+28       1559505     1675
           6  14   48226   3943+1396   11+190      11505576    67401


   Percentage values for aborted and timed out connections, and for seg-
   ment loss, are as follows.


           Machine Dropped (%)     Timed Out (%)   Retransmissions (%)
           1       1.2             64.7            0.18
           2       1.2             16.9            1.50
           3       12.3            1.75            0.46
           4       0.8             2.60            2.88
           5       3.0             0.52            0.11
           6       11.1            0.42            0.59
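
   The percentages can be re-derived from the raw counters in the first
   table.  The following is a minimal sketch; the layout of RAW and the
   function name are illustrative assumptions.  The results agree with
   the percentage table to within 0.05 percentage points.

```python
# Sketch only: the layout of RAW and the function name are illustrative.
# Counters are copied from the first table, one row per machine:
# (established, dropped, dropped after close, retransmission timeouts,
#  keepalive timeouts, data segments transmitted, data segments retransmitted)
RAW = {
    1: (408, 4, 1, 263, 1, 135168, 250),
    2: (46632, 456, 102, 7338, 551, 317523, 4756),
    3: (138682, 13349, 3686, 79, 2345, 22761633, 104440),
    4: (126820, 44, 1017, 86, 3219, 8920528, 257295),
    5: (13557, 198, 205, 43, 28, 1559505, 1675),
    6: (48226, 3943, 1396, 11, 190, 11505576, 67401),
}


def percentages(row):
    """Return (dropped %, timed out %, retransmissions %) for one machine."""
    (established, drops, drops_after_close,
     rexmt_timeouts, keep_timeouts, txed, rtxed) = row
    dropped = 100.0 * (drops + drops_after_close) / established
    timed_out = 100.0 * (rexmt_timeouts + keep_timeouts) / established
    # Transmitted segments exclude retransmissions, as in the first table.
    retransmissions = 100.0 * rtxed / txed
    return dropped, timed_out, retransmissions


for machine, row in RAW.items():
    dropped, timed_out, retransmissions = percentages(row)
    print(f"{machine}  {dropped:5.1f}  {timed_out:5.2f}  {retransmissions:5.2f}")
```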


   Machines 3 and 5 are internal to a LAN and mostly handle NFS
   traffic, so may be expected to have different patterns of connection
   establishment and segment loss.  The proportion of dropped
   connections for machine 6 is so high that some pathological system
   or application problem can be suspected.  These three machines are
   excluded from the calculations.

   Aborted connections yield more consistent percentages than timeouts
   and segment loss rates; this may be because the latter are more sus-
   ceptible to the characteristics of nearby networks, whereas aborts
   are a function of application or system behaviour.  For instance, an
   excessive proportion of machine 1's TCP connections expire because of
   retransmission timeouts; this may be due to an unreliable link.

   For machines 1, 2 and 4, the average percentage drop rate is 1.1%.
   The average retransmission rate is 1.2%.  The lowest percentage
   drop rate is 0.8%, and the highest retransmission rate is 2.9%.








