Network Working Group                                      Reiner Ludwig
INTERNET-DRAFT                                             Michael Meyer
Expires: May 2002                                      Ericsson Research
                                                          November, 2001







                     The TCP Retransmit (RXT) Flag
                <draft-ludwig-tsvwg-tcp-rxt-flag-03.txt>


Status of this memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html


Abstract

   This document proposes a solution to TCP's retransmission ambiguity
   problem. It is based on using a single bit, named the Retransmit
   (RXT) flag, taken from the Reserved field of the TCP header. The TCP
   sender sets the RXT flag in segments containing retransmitted data.
   In response to such a segment, the TCP receiver sends an immediate
   pure ACK with the RXT flag set. By inspecting the RXT flag of the
   ACKs that arrive after a retransmit, the TCP sender can then resolve
   the retransmission ambiguity. This protocol feature provides a basis
   for future TCP enhancements such as schemes to detect and respond to
   spurious timeouts and packet re-orderings.





Ludwig & Meyer                                                  [Page 1]








INTERNET-DRAFT       The TCP Retransmit (RXT) Flag       November, 2001


Terminology

   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
   SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
   document, are to be interpreted as described in [RFC2119].

   We use the term 'pure ACK' to refer to an ACK that is not piggybacked
   onto a data segment. We use the term 'immediate ACK' to refer to an
   ACK that the TCP receiver sends immediately in response to the
   arrival of a data segment, i.e., without waiting for the delayed-ACK
   timer [RFC1122] to expire. We use the term 'new ACK' to refer to
   an ACK that acknowledges outstanding data. Furthermore, we refer to
   the first-time transmission of a data segment as the 'original
   transmit'.


1. Introduction

   The retransmission ambiguity problem [KP87] is the TCP sender's
   inability to distinguish whether the first new ACK that arrives after
   a retransmit was sent in response to the original transmit or the
   retransmit. This leads to additional problems such as spurious
   retransmits and unnecessary load reductions that occur after spurious
   timeouts or packet re-orderings [LK00]. Another consequence is that
   the RTT estimators cannot be updated when the segment that is timed
   has been retransmitted [KP87].

   One way of eliminating the retransmission ambiguity in TCP is the use
   of the Timestamps option field [RFC1323]. A segment's timestamp can
   be viewed as its serial number that is echoed in the corresponding
   ACK. Some BSD-derived TCP implementations make use of this in that
   the RTT estimators are updated even for retransmitted segments.

   However, the Timestamps option field adds considerable overhead: 12
   bytes that are added to every data segment and to every ACK.
   Moreover, the Timestamps option field is not compressed by current
   TCP/IP header compression schemes [RFC2507], and even effectively
   disables other widely deployed TCP/IP header compression schemes
   [RFC1144].

   This document defines a more efficient solution, named the RXT
   scheme, to resolve the retransmission ambiguity in TCP. It is based
   on using a single bit, named the Retransmit (RXT) flag, taken from
   the Reserved field of the TCP header. The TCP sender sets the RXT
   flag in segments containing retransmitted data. In response to such a
   segment, the TCP receiver sends an immediate pure ACK with the RXT
   flag set. By inspecting the RXT flag of the ACKs that arrive after a
   retransmit, the TCP sender can then resolve the retransmission
   ambiguity. This protocol feature provides a basis for future TCP
   enhancements such as schemes to detect and respond to spurious
   timeouts and packet re-orderings (e.g., see [Lu01], [LG01]).

   Also, the DSACK option [RFC2883] may be used to detect spurious
   retransmits. However, the DSACK option cannot be used to eliminate
   the retransmission ambiguity. The reason is that the first new ACK
   for a retransmit will commonly not carry a DSACK option. That is, the
   DSACK option commonly arrives one or more ACKs later than the first
   new ACK for a retransmit. This is independent of whether the
   retransmit was genuine or spurious.


2. RXT-Flag-Permitted Option and Initial Handshake

   Definition of the two-byte TCP RXT-Flag-Permitted option:

          +---------+---------+
          | Kind=6  | Length=2|
          +---------+---------+

   When a TCP sends a SYN segment, it MAY include the RXT-Flag-Permitted
   option in the SYN. This is defined as an indication that the TCP
   sending the SYN segment wishes to participate in the RXT scheme as
   both a sender and a receiver.

   When a TCP receives a SYN segment that includes the RXT-Flag-
   Permitted option, it MAY also include the RXT-Flag-Permitted option
   in the SYN-ACK. This is defined as an indication that the TCP sending
   the SYN-ACK segment agrees to participate in the RXT scheme as both a
   sender and a receiver.

   The RXT-Flag-Permitted option MUST NOT be sent in a segment that is
   not a SYN or SYN-ACK.
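
   The handshake rules above can be illustrated with a minimal sketch.
   It assumes the Kind and Length values shown in the option diagram;
   the function names are hypothetical and not part of this document.
   The scheme is treated as enabled only when both the SYN and the
   SYN-ACK carry the option.

```python
# Illustrative sketch only; all names are hypothetical.
RXT_FLAG_PERMITTED_KIND = 6     # Kind value as shown in the option diagram
RXT_FLAG_PERMITTED_LENGTH = 2   # Kind byte + Length byte, no payload


def encode_rxt_flag_permitted() -> bytes:
    """Return the two-byte RXT-Flag-Permitted option."""
    return bytes([RXT_FLAG_PERMITTED_KIND, RXT_FLAG_PERMITTED_LENGTH])


def has_rxt_flag_permitted(options: bytes) -> bool:
    """Scan a raw TCP options field for the RXT-Flag-Permitted option."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:               # End of Option List
            break
        if kind == 1:               # No-Operation occupies a single byte
            i += 1
            continue
        if i + 1 >= len(options):
            break                   # truncated option, stop parsing
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                   # malformed length, stop parsing
        if (kind == RXT_FLAG_PERMITTED_KIND
                and length == RXT_FLAG_PERMITTED_LENGTH):
            return True
        i += length
    return False


def rxt_scheme_enabled(syn_options: bytes, syn_ack_options: bytes) -> bool:
    """Both the SYN and the SYN-ACK must have carried the option."""
    return (has_rxt_flag_permitted(syn_options)
            and has_rxt_flag_permitted(syn_ack_options))
```

   A TCP that receives a SYN without the option would simply not
   include it in the SYN-ACK, in which case rxt_scheme_enabled() in
   this sketch returns False for the connection.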


3. Definition of the RXT Flag

   We define bit 6 in the Reserved field of the TCP header as the RXT
   flag. The location of the 6-bit Reserved field in the TCP header is
   shown in Figure 3 of [RFC793]. Bits 8 and 9 of the Reserved field
   have been assigned to Explicit Congestion Notification (ECN)
   [RFC3168], while bit 7 is under discussion to be assigned to the
   nonce scheme proposed in [SWE01].
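
   As an illustration, the sketch below derives bit masks from the bit
   numbers used above. It assumes the numbering of Figure 3 of
   [RFC793], in which bit 0 is the most significant bit of the 16-bit
   field that holds the Data Offset, the Reserved field, and the
   control bits (header bytes 12 and 13). The helper names are
   hypothetical. Checksum recomputation after changing the flag is
   deliberately left out.

```python
import struct

# Illustrative sketch only; all names are hypothetical.


def header_bit_mask(bit: int) -> int:
    """Mask for a bit numbered as in Figure 3 of RFC 793 (bit 0 = MSB)."""
    return 1 << (15 - bit)


RXT_MASK = header_bit_mask(6)    # RXT flag as defined in this section
NONCE_MASK = header_bit_mask(7)  # under discussion for the nonce scheme
CWR_MASK = header_bit_mask(8)    # ECN
ECE_MASK = header_bit_mask(9)    # ECN


def rxt_flag_is_set(tcp_header: bytes) -> bool:
    """Test the RXT flag in a raw TCP header (at least 14 bytes)."""
    (word,) = struct.unpack_from("!H", tcp_header, 12)
    return bool(word & RXT_MASK)


def set_rxt_flag(tcp_header: bytes, value: bool) -> bytes:
    """Return a copy of the header with the RXT flag set or reset.

    The TCP checksum is NOT updated by this sketch.
    """
    (word,) = struct.unpack_from("!H", tcp_header, 12)
    if value:
        word |= RXT_MASK
    else:
        word &= ~RXT_MASK & 0xFFFF
    return tcp_header[:12] + struct.pack("!H", word) + tcp_header[14:]
```

   With this numbering, the RXT flag corresponds to mask 0x02 in header
   byte 12, next to the Data Offset nibble.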


4. TCP Sender

   If both TCPs have agreed to participate in the RXT scheme, the TCP
   sender SHOULD set (binary 1) the RXT flag in retransmits. This
   includes retransmits of data segments, SYNs and FINs. The TCP sender
   SHOULD reset (binary 0) the RXT flag in non-retransmits.

   Else, if the TCP sender has not received an RXT-Flag-Permitted option
   in the SYN-ACK, the TCP sender MUST NOT make use of the RXT flag of
   arriving ACKs.
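
   The sender rules of this section can be sketched as follows. The
   types and function names are hypothetical. Leaving the flag reset
   when the scheme has not been negotiated is an assumption of this
   sketch; this section only states that the flag of arriving ACKs
   must not be used in that case.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative sketch only; all names are hypothetical.


@dataclass
class OutgoingSegment:
    seq: int
    is_retransmit: bool       # covers retransmitted data, SYNs, and FINs
    rxt_flag: bool = False


def mark_outgoing(segment: OutgoingSegment,
                  rxt_negotiated: bool) -> OutgoingSegment:
    """Set the RXT flag on retransmits and reset it on everything else."""
    # Assumption: without negotiation the flag is always left reset.
    segment.rxt_flag = rxt_negotiated and segment.is_retransmit
    return segment


def usable_ack_rxt_flag(ack_rxt_flag: bool,
                        rxt_negotiated: bool) -> Optional[bool]:
    """Return the ACK's RXT flag, or None when it must not be used."""
    return ack_rxt_flag if rxt_negotiated else None
```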


5. TCP Receiver

   If both TCPs have agreed to participate in the RXT scheme, the TCP
   receiver MUST send an immediate pure ACK with the RXT flag set
   (binary 1) in response to a segment that arrived with the RXT flag
   set. In particular, the RXT flag is set in a SYN-ACK (FIN-ACK) when
   sent in response to a SYN (FIN) that arrived with the RXT flag set.

   Note that the immediate pure ACK might also return to the TCP sender
   as a duplicate ACK. This will typically occur after a spurious
   retransmit.

   In all other situations where a pure ACK is sent, the TCP receiver
   MUST reset (binary 0) the RXT flag.
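
   A minimal sketch of the receiver rules follows. The names are
   hypothetical. When the scheme does not force an immediate pure ACK,
   the sketch leaves the timing of the ACK to the normal acknowledgment
   rules, e.g., the delayed-ACK timer [RFC1122].

```python
from typing import NamedTuple

# Illustrative sketch only; all names are hypothetical.


class AckDecision(NamedTuple):
    force_immediate_pure_ack: bool  # bypass the delayed-ACK timer
    rxt_flag: bool                  # value of the RXT flag in the ACK


def ack_for_segment(segment_rxt_flag: bool,
                    rxt_negotiated: bool) -> AckDecision:
    """Decide how to acknowledge an arriving segment under the RXT scheme."""
    if rxt_negotiated and segment_rxt_flag:
        # A segment that arrived with the RXT flag set is answered with
        # an immediate pure ACK that has the RXT flag set.
        return AckDecision(force_immediate_pure_ack=True, rxt_flag=True)
    # In all other situations the RXT flag is reset; ACK timing follows
    # the receiver's usual acknowledgment policy.
    return AckDecision(force_immediate_pure_ack=False, rxt_flag=False)
```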


6. Interpreting the RXT Flag

   When using the RXT scheme, it should be carefully considered exactly
   what information the RXT flag conveys. In some situations, whether
   or not the RXT flag is set in a data segment or an ACK allows
   reliable conclusions to be drawn. In other situations, however, it
   only indicates with some certainty that a particular event has
   occurred. This is discussed in the following, assuming that the RXT
   scheme is correctly implemented in both TCPs.

   When a new ACK with the RXT flag not set (binary 0) arrives after a
   retransmit, the TCP sender can reliably conclude that this ACK was
   not triggered by a retransmit. Furthermore, the TCP sender may take
   that ACK as a strong indication that the retransmit was spurious,
   i.e., that the original transmit arrived at the TCP receiver.
   However, this is only a strong indication; it cannot be concluded
   with absolute certainty, as the following counter-example, which has
   been pointed out to the authors, shows:

     In case the original transmit was dropped in the network, a new ACK
     that arrives after a retransmit would also not have the RXT flag
     set (binary 0) if (1) the retransmit arrived at the TCP receiver in
     sequence, i.e., if it had jumped ahead of all data segments that
     were outstanding when the retransmit was sent, and if (2) the ACK
     for the retransmit with the RXT flag set (binary 1) got lost. In
     that case the mentioned new ACK would correspond to one of the data
     segments that were outstanding when the retransmit was sent. Note:
     This example holds independent of whether the loss recovery phase
     was triggered by the arrival of the third duplicate ACK or by a
     timeout.

   However, this counter-example might be regarded as a rather
   pathological case. In addition, it seems to be the only conceivable
   counter-example. Hence, it might be regarded as safe to assume that
   the mentioned new ACK indicates that the retransmit was spurious.
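
   The interpretation discussed in this section can be sketched as a
   small classification of the first new ACK that arrives after a
   retransmit. The names are hypothetical, and the sketch assumes that
   both TCPs implement the RXT scheme correctly. The second outcome is
   labeled "likely" because of the counter-example given above.

```python
from enum import Enum

# Illustrative sketch only; all names are hypothetical.


class AckOrigin(Enum):
    # The ACK was sent in response to a segment with the RXT flag set.
    TRIGGERED_BY_RETRANSMIT = "triggered by retransmit"
    # The ACK was not triggered by a retransmit; this is a strong
    # indication, but not proof, that the retransmit was spurious.
    RETRANSMIT_LIKELY_SPURIOUS = "retransmit likely spurious"


def classify_first_new_ack(ack_rxt_flag: bool) -> AckOrigin:
    """Interpret the RXT flag of the first new ACK after a retransmit."""
    if ack_rxt_flag:
        return AckOrigin.TRIGGERED_BY_RETRANSMIT
    return AckOrigin.RETRANSMIT_LIKELY_SPURIOUS
```

   How a TCP sender responds to either outcome is outside the scope of
   this sketch.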

   The RXT flag is useful to distinguish between an ACK for an original
   transmit and an ACK for a retransmit. However, the single bit does
   not help in deciding to which retransmit an ACK corresponds in case
   multiple retransmits of the same data have been sent. An "RXT
   counter" allocating at least two bits would be required for that. For
   situations where this becomes an issue, e.g., round-trip time
   estimation in environments with high packet loss rates, the use of
   the Timestamps option [RFC1323] is a suitable alternative.
   Nevertheless, the RXT flag would allow the TCP sender to disable
   Karn's algorithm [RFC1122] as long as the same segment has only been
   retransmitted once.
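
   The following sketch illustrates how an RTT sample might be chosen
   when the timed segment has been retransmitted at most once. The
   function and parameter names are hypothetical. The sketch assumes
   that the ACK in question is the first new ACK covering the timed
   segment, and it treats an ACK with the RXT flag not set as
   corresponding to the original transmit, which Section 6 describes as
   a strong indication rather than a certainty.

```python
from typing import List, Optional

# Illustrative sketch only; all names are hypothetical.


def rtt_sample(now: float,
               original_sent_at: float,
               retransmit_sent_at: List[float],
               ack_rxt_flag: bool) -> Optional[float]:
    """Return an RTT sample for the timed segment, or None if ambiguous."""
    if not retransmit_sent_at:
        # Never retransmitted: no ambiguity to resolve.
        return now - original_sent_at
    if len(retransmit_sent_at) > 1:
        # A single bit cannot tell which retransmit the ACK belongs to,
        # so no sample is taken (as with Karn's algorithm).
        return None
    if ack_rxt_flag:
        # The ACK was triggered by the one and only retransmit.
        return now - retransmit_sent_at[0]
    # The ACK was not triggered by the retransmit; time it against the
    # original transmit.
    return now - original_sent_at
```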


7. Security Considerations

   There do not seem to be any security considerations associated with
   the RXT scheme itself. This is because the RXT scheme is only a
   signaling scheme that is not tied to any specific action that might
   alter protocol state at the TCP sender or receiver.

   However, security considerations might exist for schemes that use the
   RXT scheme as a basis. In particular, it needs to be considered that
   the TCP receiver might be lying about the RXT flag.


Acknowledgments

   Many thanks to Keith Sklower, Randy Katz, Stephan Baucke, Sally
   Floyd, Vern Paxson, Mark Allman, Ethan Blanton, and Andrei Gurtov for
   very useful discussions that contributed to this work.

References

   [RFC1122] R. Braden, Requirements for Internet Hosts - Communication
             Layers, RFC 1122, October 1989.

   [RFC2119] S. Bradner, Key words for use in RFCs to Indicate
             Requirement Levels, RFC 2119, March 1997.

   [RFC2507] M. Degermark, B. Nordgren, S. Pink, IP Header Compression,
             RFC 2507, February 1999.

   [RFC2883] S. Floyd, J. Mahdavi, M. Mathis, M. Podolsky, A. Romanow,
             An Extension to the Selective Acknowledgement (SACK) Option
             for TCP, RFC 2883, July 2000.

   [RFC1144] V. Jacobson, Compressing TCP/IP Headers for Low-Speed
             Serial Links, RFC 1144, February 1990.

   [RFC1323] V. Jacobson, R. Braden, D. Borman, TCP Extensions for High
             Performance, RFC 1323, May 1992.

   [KP87]    P. Karn, C. Partridge, Improving Round-Trip Time Estimates
             in Reliable Transport Protocols, In Proceedings of ACM
             SIGCOMM 87.

   [LK00]    R. Ludwig, R. H. Katz, The Eifel Algorithm: Making TCP
             Robust Against Spurious Retransmissions, ACM Computer
             Communication Review, Vol. 30, No. 1, January 2000.

   [Lu01]    R. Ludwig, The Eifel Algorithm for TCP, work in progress,
             November 2001.

   [LG01]    R. Ludwig, A. Gurtov, Responding to Spurious Timeouts in
             TCP, work in progress, November 2001.

   [RFC793]  J. Postel, Transmission Control Protocol, RFC 793,
             September 1981.

   [RFC3168] K. Ramakrishnan, S. Floyd, D. Black, The Addition of
             Explicit Congestion Notification (ECN) to IP, RFC 3168,
             September 2001.

   [SWE01]   N. Spring, D. Wetherall, D. Ely, Robust ECN Signaling with
             Nonces, work in progress, October 2001.

Authors' Addresses

     Reiner Ludwig
     Ericsson Research (EED)
     Ericsson Allee 1
     52134 Herzogenrath, Germany
     Phone: +49 2407 575 719
     Fax:   +49 2407 575 400
     Email: Reiner.Ludwig@Ericsson.com

     Michael Meyer
     Ericsson Research (EED)
     Ericsson Allee 1
     52134 Herzogenrath, Germany
     Phone: +49 2407 575 654
     Fax:   +49 2407 575 400
     Email: Michael.Meyer@Ericsson.com


This Internet-Draft expires in May 2002.