TCPM Working Group                                             J. Heitz
Internet Draft                                                    Cisco
                                                               Chuan He
                                                               Ericsson
Intended status: Informational                         October 19, 2014
Expires: April 2015



               TCP Retransmission Timer for Virtual Machines
                     draft-heitzhe-tcpm-vm-rto-00.txt


Abstract

   A Round Trip Time (RTT) estimate that decays performs badly in a
   bursty environment. This document proposes an RTT estimator that
   does not decay for a period of time.

   It does not require a minimum value to be configured. It works
   equally well when the typical RTT is 100 microseconds as when it is
   10 seconds.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on April 19, 2015.







Heitz, et al.           Expires April 19, 2015                 [Page 1]

Internet-Draft              TCP RTO for VM                 October 2014


Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Table of Contents


   1. Introduction...................................................2
   2. The Algorithm..................................................3
      2.1. Initial Interval..........................................3
   3. Delayed ACK Timer..............................................4
   4. Backing off the timer..........................................4
   5. Security Considerations........................................4
   6. IANA Considerations............................................4
   7. References.....................................................4
      7.1. Normative References......................................4
   8. Acknowledgments................................................4

1. Introduction

   Virtual machines can create bursty environments for TCP, especially
   if the TCP implementation also runs in a thread within a process of
   a busy host machine. Individual round trip time measurements can
   frequently be hundreds of times the average. A retransmission
   timeout (RTO) calculation that decays even a little for each small
   measured RTT will cause a spurious retransmission for every such
   outlier RTT.

   [RFC6298] specifies that a computed RTO of less than 1 second
   SHOULD be rounded up to 1 second, to avoid spurious
   retransmissions. RTTs between VMs on a lightly loaded host are
   regularly less than 1 millisecond. On a heavily loaded host, RTTs do
   not all get higher. The median rises only a little, to a few
   milliseconds, but the number of spikes of hundreds of milliseconds
   increases. In this environment the EWMA algorithm of [RFC6298] is
   hardly exercised: the computed RTO is nearly always below the
   minimum, so the RTO simply defaults to 1 second. When the minimum
   is reduced, a spurious retransmission occurs on nearly every spike.
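
   The effect can be illustrated with a minimal sketch. It assumes a
   synthetic RTT trace (1 ms samples with a 200 ms spike on every 50th
   sample; illustrative values, not measurements), ignores clock
   granularity, and counts a sample as a spurious retransmission when
   its RTT exceeds the RTO in effect before that sample was taken. The
   function names are hypothetical.

```python
def rfc6298_rto_trace(rtts, min_rto=0.0):
    """Yield (rto_in_effect, rtt) pairs using the RFC 6298 estimator.

    min_rto=0.0 removes the 1 second floor. Clock granularity is ignored.
    """
    alpha, beta, k = 1 / 8, 1 / 4, 4
    srtt = rttvar = None
    rto = 1.0  # initial RTO before any measurement
    for r in rtts:
        yield rto, r
        if srtt is None:
            # First measurement.
            srtt, rttvar = r, r / 2
        else:
            # Subsequent measurements: RTTVAR is updated before SRTT.
            rttvar = (1 - beta) * rttvar + beta * abs(srtt - r)
            srtt = (1 - alpha) * srtt + alpha * r
        rto = max(min_rto, srtt + k * rttvar)


def count_spurious(rtts, min_rto=0.0):
    """Count samples whose RTT exceeded the RTO in effect at the time."""
    return sum(1 for rto, r in rfc6298_rto_trace(rtts, min_rto) if r > rto)


# Hypothetical trace: 1 ms RTTs with a 200 ms spike on every 50th sample.
trace = [0.200 if i % 50 == 49 else 0.001 for i in range(500)]
print(count_spurious(trace))               # floor removed
print(count_spurious(trace, min_rto=1.0))  # 1 second floor
```

   With this trace and no floor, the RTO decays back to roughly 1 ms
   between spikes, so each of the 10 spikes is counted. With the
   1 second floor, none is.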

   The proposed algorithm is less aggressive than that of [RFC6298] and
   needs no minimum setting. In fact, it grew out of an attempt to find
   a better minimum.



   The retransmission timer is a compromise. If it is set too low, then
   spurious retransmissions occur, but if it is set too high, then it
   takes too long to retransmit when it is really needed.

   The right balance is achieved when an acceptably small number of
   spurious retransmissions occur.

2. The Algorithm

   The basis of the algorithm is as follows: Time is divided into
   intervals. Within each interval, the highest RTT is determined. This
   RTT forms the RTO to be used for the next interval. The RTO is
   constant for the duration of an interval.

   An interval ends, and the next one begins, when either of the
   following events occurs:

   1. An RTT is measured that is greater than the maximum RTT from the
      previous interval. The maximum RTT from the previous interval is
      the RTT that determines the RTO of the current interval. This
      measured RTT is the greatest RTT measured for the current
      interval. It is considered part of the current interval, not of
      the next one.

   2. A large number of windows of data (20 is suggested) has been
      transmitted.

   The RTO of one interval is the maximum RTT of the previous interval
   plus some headroom. The suggested headroom is 1/4, so RTO = 1.25 *
   (previous max RTT).

   A window of data is the size of the largest window ever advertised
   during the session.
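
   The rules above can be sketched as follows. This is an illustration
   under stated assumptions, not a reference implementation: times are
   in seconds, data and windows are counted in bytes, the estimator is
   seeded with a known previous maximum RTT (the first interval is not
   modelled), and an interval in which no RTT was measured leaves the
   RTO unchanged. The class and method names are hypothetical.

```python
class IntervalMaxRto:
    """Interval-based RTO: 1.25 * max RTT of the previous interval."""

    HEADROOM = 1.25             # suggested headroom of 1/4
    WINDOWS_PER_INTERVAL = 20   # suggested interval length in windows

    def __init__(self, seed_max_rtt):
        self.prev_max = seed_max_rtt   # max RTT of the previous interval
        self.rto = self.HEADROOM * seed_max_rtt
        self.cur_max = 0.0             # max RTT seen in current interval
        self.max_window = 0            # largest window ever advertised
        self.bytes_sent = 0            # data sent in current interval

    def on_window_advertised(self, window):
        self.max_window = max(self.max_window, window)

    def on_data_sent(self, nbytes):
        self.bytes_sent += nbytes
        limit = self.WINDOWS_PER_INTERVAL * self.max_window
        if self.max_window > 0 and self.bytes_sent >= limit:
            self._end_interval()       # event 2: enough windows sent

    def on_rtt_sample(self, rtt):
        # The sample belongs to the current interval, even if it ends it.
        self.cur_max = max(self.cur_max, rtt)
        if rtt > self.prev_max:
            self._end_interval()       # event 1: exceeds previous max

    def _end_interval(self):
        # Assumption: with no RTT measured in the interval, keep the RTO.
        if self.cur_max > 0.0:
            self.prev_max = self.cur_max
            self.rto = self.HEADROOM * self.prev_max
        self.cur_max = 0.0
        self.bytes_sent = 0


est = IntervalMaxRto(seed_max_rtt=0.004)
est.on_window_advertised(1000)
est.on_rtt_sample(0.002)      # below previous max: RTO unchanged
est.on_rtt_sample(0.100)      # spike: interval ends, RTO rises at once
print(est.rto)
est.on_rtt_sample(0.001)
est.on_data_sent(20 * 1000)   # 20 windows sent: interval ends
print(est.rto)
```

   In this sketch the RTO rises as soon as a larger RTT is seen, but
   falls only at an interval boundary, after 20 windows of data have
   been sent without a comparable spike.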

2.1. Initial Interval

   The RTO of the first interval should be 1 second. The first
   interval should be shorter than the others; a length of 3 RTT
   measurements is suggested. The initial RTO may alternatively be
   determined from a history of previous connections.

   An alternative is to run the regular algorithm as in [RFC6298] during
   the first interval. The RTTs would still be individually measured in
   preparation for the second interval.
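
   A minimal sketch of the first suggestion follows. It models only
   the hand-over from the first interval to the second, assumes times
   in seconds, and does not cover the alternatives (connection history
   or running [RFC6298] during the first interval). The names are
   hypothetical.

```python
INITIAL_RTO = 1.0              # seconds; RTO during the first interval
INITIAL_INTERVAL_SAMPLES = 3   # suggested length of the first interval
HEADROOM = 1.25                # RTO = 1.25 * max RTT (Section 2)


def rto_entering_second_interval(rtt_samples):
    """Return the RTO after the RTT samples measured so far.

    While fewer than INITIAL_INTERVAL_SAMPLES measurements exist, the
    first interval is still running and the RTO stays at INITIAL_RTO.
    Samples beyond the first interval are ignored by this sketch.
    """
    if len(rtt_samples) < INITIAL_INTERVAL_SAMPLES:
        return INITIAL_RTO
    first_interval = rtt_samples[:INITIAL_INTERVAL_SAMPLES]
    return HEADROOM * max(first_interval)


print(rto_entering_second_interval([0.001, 0.004]))
print(rto_entering_second_interval([0.001, 0.004, 0.002]))
```

   Keeping the first interval short limits how long a connection with
   sub-millisecond RTTs has to run with the 1 second RTO.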






3. Delayed ACK Timer

   The delayed ACK timer is not handled differently. If a delayed ACK
   timer is in effect on the peer, it may cause high RTT measurements.
   If a delayed ACK occurs at least once in every 20 windows of data,
   then the delay it adds is included in the maximum RTT measurement.
   If delayed ACKs occur less often than once in 20 windows of data,
   then a retransmission that may result is rare enough not to be
   excessive.

4. Backing off the timer

   This document describes an alternative to Section 2 of [RFC6298]
   only. In particular, the back-off mechanism in Section 5 of
   [RFC6298] remains intact.
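
   As an illustration, the doubling of the RTO on each expiry of the
   retransmission timer, as specified in [RFC6298], applies unchanged
   to an RTO produced by the interval algorithm. The sketch below
   assumes times in seconds and an upper bound of 60 seconds
   ([RFC6298] allows a maximum provided it is at least 60 seconds).
   The function name is hypothetical.

```python
MAX_RTO = 60.0   # seconds; assumed upper bound on the backed-off RTO


def backed_off_rto(base_rto, timeouts, max_rto=MAX_RTO):
    """RTO after `timeouts` consecutive retransmission timer expiries.

    base_rto is the RTO of the current interval; each expiry doubles
    the RTO, bounded above by max_rto.
    """
    rto = base_rto
    for _ in range(timeouts):
        rto = min(rto * 2, max_rto)
    return rto


# An interval RTO of 125 ms after 0, 1, 2 and 3 consecutive expiries.
for n in range(4):
    print(n, backed_off_rto(0.125, n))
```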



5. Security Considerations

   No security issues beyond those outlined in [RFC6298] have been
   identified.

6. IANA Considerations

   None

7. References

7.1. Normative References

    [RFC6298] V. Paxson, M. Allman, J. Chu and M. Sargent, "Computing
             TCP's Retransmission Timer", RFC 6298, June 2011.

8. Acknowledgments

   This document was prepared using 2-Word-v2.0.template.dot.

Authors' Addresses

   Jakob Heitz
   Cisco
   510 McCarthy Blvd,
   Milpitas, CA 95035

   Email: jheitz@cisco.com





   Chuan He
   Ericsson
   300 Holger Way,
   San Jose, CA 95134

   Email: chuan.he@ericsson.com










































