MPTCP Working Group O. Bonaventure
Internet-Draft Q. De Coninck
Updates: 6824 (if approved) M. Baerts
Intended status: Experimental F. Duchene
Expires: January 7, 2016 B. Hesmans
UCLouvain
July 06, 2015
Improving Multipath TCP Backup Subflows
draft-bonaventure-mptcp-backup-00
Abstract
This document describes some issues with the current definition of
backup subflows in [RFC6824]. The solution proposed in [RFC6824]
works well when a subflow fails completely. However, if a subflow
suffers from heavy packet loss but remains up, the delay before
switching to the backup subflow may be very long. We propose to
monitor the evolution of the retransmission timer (RTO) to detect
underperforming subflows.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 7, 2016.
Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
Bonaventure, et al. Expires January 7, 2016 [Page 1]
Internet-Draft MPTCP backup July 2015
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. What is a Subflow Failure?
3. Detecting Underperforming Subflows
4. Security considerations
5. IANA considerations
6. Conclusion
7. Acknowledgements
8. References
8.1. Normative References
8.2. Informative References
Authors' Addresses
1. Introduction
Multipath TCP is an extension to TCP [RFC0793] that was specified in
[RFC6824]. A Multipath TCP connection is composed of one or more
subflows. Each subflow is a TCP connection that is established by
using the classical TCP three-way handshake. The subflows that
compose a Multipath TCP connection are not all equal. [RFC6824]
defines two types of subflows:
o the regular subflows
o the backup subflows
The regular subflows can be used to transport any data. The backup
subflows are intended to be used only when all the regular subflows
have failed. Section 2.5 of [RFC6824] defines them by using the
following sentence: "Hosts can indicate at initial subflow setup
whether they wish the subflow to be used as a regular or backup path
- a backup path only being used if there are no regular paths
available."
Intuitively, a user expects that, when the regular subflow fails, the
backup subflow will be used to continue the data transfer and minimize
the impact of the failure on the Multipath TCP connection.
In this document, we first describe in Section 2 how Multipath TCP
operates when backup subflows are used and some of the operational
problems that this causes. Backup subflows work well when subflows
completely fail due to, for example, the reception of a RST segment
or the invalidity of the IP address associated with the subflow
(expired lease time, detachment from the network, etc.). However,
there are many practical situations where the failure of a regular
subflow cannot be quickly detected and the user experience suffers.
We then propose in Section 3 a slight modification to the handling of
the backup subflows in Multipath TCP.
2. What is a Subflow Failure?
Experience with Multipath TCP shows that backup subflows, which are
only used when all the other subflows have failed, work well on fixed
hosts where the loss of connectivity can be quickly detected by the
affected host. However, there are many situations where it can be
difficult to detect the failure of a regular subflow.
           <----- primary subflow ----->
  +----link1----router1-------router2---link2---+
  |                                             |
Client                                        Server
  |                                             |
  +----link3----router3-------router4---link4---+
           <----- backup subflow ----->
Figure 1: Simple network
To understand the situation, let us consider the simple network shown
in Figure 1. In this network, the client has established two
subflows:
o a regular subflow passing through router1 and router2
o a backup subflow passing through router3 and router4
[RFC6824] supports two methods to signal that a subflow is a backup
subflow:
o setting the B bit in the MP_JOIN option that is used to create the
subflow
o sending the MP_PRIO option with the B bit set
Note that in both cases, when a host sets the B bit in the MP_JOIN or
sends an MP_PRIO option, it requests the other host to only use the
subflow if the other regular subflows have failed. Setting the B bit
in the MP_JOIN option or sending the MP_PRIO option does not affect
the data sent by the host that sends this option [RFC6824].
Let us now consider three different failure scenarios. For
simplicity, we assume that all the data flows from the Server to the
Client and that the top subflow is the primary subflow while the
bottom subflow was signaled as a backup subflow.
Our first failure scenario is the simplest one: the failure of link1.
In this case, the Client detects the failure locally. This detection
can be fast with wired link layer technologies and slower with some
wireless technologies. Once the failure has been detected, the
Client can either send a REMOVE_ADDR option to indicate the failure
of its address attached to link1 or send an MP_PRIO option with the B
bit reset over the backup subflow. In both cases, a single segment
sent over the backup subflow is sufficient to inform the Server of
the failure of the primary subflow. Note that the REMOVE_ADDR and
the MP_PRIO options are sent unreliably. This implies that any loss
of these options will further delay the recovery on the Server.
Our second failure scenario is the symmetric scenario: the failure of
link2. In this case, the Server will react by sending a REMOVE_ADDR
option over the backup subflow to indicate the loss of the address
attached to this link. Since the Server knows that the primary
subflow has failed, it can immediately start to use the backup
subflow to send data to the Client. Experiments show that these two
failure scenarios work well [Cellnet12].
The third failure scenario is a failure of the link between router1
and router2. Different types of failures are possible on this link.
We consider two extreme cases. The first case is a pure link failure
that is detected by the two routers. Since there is no alternate
path between router1 and router2 in our example network, the Client
cannot reach the Server anymore over the top path. Once router1 and
router2 have detected the failure, they will return ICMP destination
unreachable messages to the Client and the Server. This error
message could suggest a failure of the primary subflow. According to
[RFC1122], this ICMP message should cause the termination of the top
subflow. However, according to [RFC5461], current TCP
implementations do not follow this recommendation and ignore the
received ICMP messages. This is motivated by the risk of denial of
service attacks that could disrupt existing TCP connections by
sending spoofed ICMP messages. A Multipath TCP implementation could
react differently and for example consider the subflow over which the
ICMP message was received as temporarily unusable to cause the
utilization of other (possibly backup) subflows.
If a Multipath TCP implementation does not react to ICMP messages,
the last resort method to detect the failure of the top path is the
retransmission timer (RTO). TCP implementations apply an exponential
backoff algorithm to the retransmission timeout [RFC6298]. If the
primary path fails, the retransmission timeout associated to this
path will double until it reaches the maximum value configured on the
TCP stack. On many stacks, this limit is on the order of tens of
minutes, which does not match the expectations of a Multipath TCP
user who expects the backup subflow to be used much earlier than
that. A similar situation occurs when the link between the two
routers remains up but is so congested that packets sent on the
regular subflow rarely traverse the link [BD2015]. In this case, the
user also expects to be able to quickly use the backup subflow to
preserve the end-to-end connectivity.
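
To give a sense of the delays involved, the following Python sketch
sums the successive timeouts of an exponential backoff. It is only an
illustration: the function name is hypothetical, the initial RTO of 1
second follows [RFC6298], and the 120-second cap is an assumed
stack-specific value, not one taken from this document.

```python
def time_until_rto_reaches(initial_rto, threshold, max_rto):
    """Return the time (seconds) spent in timeouts before the RTO value
    first reaches `threshold`, assuming no segment is ever acknowledged.

    The RTO doubles after each expiry and is capped at `max_rto`
    (an assumed, stack-specific limit). Returns None if the cap
    prevents the RTO from ever reaching the threshold.
    """
    rto = initial_rto
    elapsed = 0.0
    while rto < threshold:
        if rto >= max_rto:
            return None          # RTO is capped below the threshold
        elapsed += rto           # wait for the current timer to expire
        rto = min(rto * 2, max_rto)
    return elapsed


if __name__ == "__main__":
    # Timers of 1 + 2 + 4 seconds expire before the RTO reaches 8 seconds.
    print(time_until_rto_reaches(1.0, 8.0, 120.0))
    # Reaching the assumed 120-second cap takes 1 + 2 + ... + 64 seconds.
    print(time_until_rto_reaches(1.0, 120.0, 120.0))
```

Under these assumptions, a sender that waits for the RTO to reach its
cap has already spent about two minutes on the failed subflow, and it
continues to retransmit on that subflow afterwards.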
3. Detecting Underperforming Subflows
As explained in the previous section, users cannot accept a long
delay between the failure of a regular subflow and the switch to an
existing backup subflow. [RFC6824] allows a host to specify that a
subflow is a backup subflow, but there is no definition of
underperforming subflows and no mechanism to allow applications to
specify a switchover time to a backup subflow.
Various techniques exist to detect failures. Shim6 [RFC5533]
includes the REAP protocol [RFC5534] to verify the reachability of
addresses. BFD [RFC5880] is used to detect link failures between
routers and also over multihop paths [RFC5883]. Depending on the
chosen parameters, these protocols can achieve fast detection and/or
low overhead. We do not believe that additional protocols are
required to quickly detect the failure of a subflow. With its
retransmission timer that doubles after each unsuccessful
retransmission, Multipath TCP already has the ability to detect
underperforming subflows. If data is transmitted over a broken
subflow, the retransmission timer of this subflow will quickly
increase. These successive retransmissions are an appropriate
mechanism to detect the failure of a subflow and switch to a backup
one provided that the TCP retransmission timer does not become too
high.
[RFC0793] specifies an abstract API that allows user applications to
indicate bounds on the retransmission timer. [RFC5482] goes further
by defining the User Timeout option, a TCP option that can be used to
signal a proposed maximum value for the TCP retransmission timeout.
This option specifies the maximum time
that some data can remain unacknowledged before considering the
connection to have failed. In [RFC5482], the User Timeout is encoded
as a 15-bit field that represents seconds or minutes. This implies
that the User Timeout option cannot be used to signal a bound smaller
than 1 second.
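
The granularity limit can be illustrated with a short Python sketch
of the [RFC5482] encoding (Kind 28, Length 4, one granularity bit
followed by a 15-bit value). The function name and the error handling
are illustrative assumptions, not part of any specification.

```python
import struct

UTO_KIND = 28     # TCP option kind of the User Timeout option [RFC5482]
UTO_LENGTH = 4    # total length of the option in bytes


def encode_uto(timeout_seconds):
    """Encode a timeout as a User Timeout option.

    The 16-bit body is a granularity bit (0 = seconds, 1 = minutes)
    followed by a 15-bit value, so sub-second values cannot be encoded.
    """
    if timeout_seconds < 1 or timeout_seconds != int(timeout_seconds):
        raise ValueError("the User Timeout option has no sub-second unit")
    value = int(timeout_seconds)
    if value <= 0x7FFF:
        granularity = 0                      # value expressed in seconds
    elif value % 60 == 0 and value // 60 <= 0x7FFF:
        granularity, value = 1, value // 60  # value expressed in minutes
    else:
        raise ValueError("timeout does not fit in the 15-bit field")
    return struct.pack("!BBH", UTO_KIND, UTO_LENGTH,
                       (granularity << 15) | value)


if __name__ == "__main__":
    print(encode_uto(1).hex())       # smallest bound that can be signaled
    print(encode_uto(60000).hex())   # 1000 minutes, granularity bit set
```

A request such as encode_uto(0.5) raises an error in this sketch,
which mirrors the limitation described above.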
With the User Timeout option, the TCP connection must be terminated
once its RTO reaches the signaled maximum value.
[RFC5482] defines the following parameters for the RTO:
o U_LIMIT: the upper limit on the USER TIMEOUT
o L_LIMIT: the lower limit on the USER TIMEOUT
In addition, the application can specify, e.g. through a socket
option, the USER TIMEOUT that it wishes to use and advertise to the
peer: ADV_UTO. Similarly, the REMOTE_UTO is the User Timeout option
received from the peer. Then, [RFC5482] defines the USER TIMEOUT
with the following formula:
USER_TIMEOUT = min(U_LIMIT, max(ADV_UTO, REMOTE_UTO, L_LIMIT))
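
As an illustration, the formula can be written as a small Python
function. The numeric values in the example are arbitrary and only
serve to show how the two limits clamp the advertised and received
values.

```python
def user_timeout(adv_uto, remote_uto, l_limit, u_limit):
    """Compute USER_TIMEOUT as defined by the formula above.

    All arguments are expressed in the same unit (seconds here).
    """
    return min(u_limit, max(adv_uto, remote_uto, l_limit))


if __name__ == "__main__":
    # Illustrative values only: L_LIMIT = 100 s, U_LIMIT = 1800 s.
    print(user_timeout(30, 60, 100, 1800))     # raised to L_LIMIT
    print(user_timeout(300, 60, 100, 1800))    # larger of the two UTOs
    print(user_timeout(5000, 60, 100, 1800))   # clamped to U_LIMIT
```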
[RFC6824] does not discuss precisely how the User Timeout option
should be handled if received over a Multipath TCP connection. If
this option is set through the regular socket API that does not
expose any information about the subflows, it must apply on the
overall Multipath TCP connection.
In this document, we envision an API that exposes some parts of
Multipath TCP to applications to enable them to make better
use of the features of the protocol. Such an API would
expose some information about the subflows to the applications.
A first possibility to control the performance of the subflows could
be to specify a USER_TIMEOUT on a per subflow basis and terminate the
subflows whose RTO has reached the USER_TIMEOUT. However,
terminating an underperforming subflow may be too severe in
environments where there are transient losses such as wireless
networks. An alternative approach is to tag the subflow as
underperforming and modify the operation of Multipath TCP.
According to [RFC6824], an established subflow can operate in two
modes:
o primary mode
o backup mode
The initial subflow is always created in primary mode. When a
subflow is created, its mode depends on the B bit of the received
MP_JOIN option. The reception of the MP_PRIO option changes the mode
of the corresponding subflow. When a Multipath TCP implementation
sends data, it always selects one of the available primary subflows
to transmit the data. The backup subflows are only selected if there
is no established subflow in primary mode.
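
The selection rule described above can be sketched as follows in
Python. The Subflow class and the eligible_subflows function are
hypothetical names introduced for this illustration; a real scheduler
would then pick one subflow among the eligible ones (for example,
based on its congestion window or round-trip time).

```python
from dataclasses import dataclass


@dataclass
class Subflow:
    name: str
    backup: bool              # True if the B bit was set (MP_JOIN or MP_PRIO)
    established: bool = True


def eligible_subflows(subflows):
    """Return the subflows that may carry data.

    Established primary subflows are always preferred. Backup subflows
    are only eligible when no established primary subflow exists.
    """
    established = [s for s in subflows if s.established]
    primary = [s for s in established if not s.backup]
    return primary if primary else established


if __name__ == "__main__":
    wifi = Subflow("wifi", backup=False)
    lte = Subflow("lte", backup=True)
    print([s.name for s in eligible_subflows([wifi, lte])])
    wifi.established = False      # e.g. after a RST or a REMOVE_ADDR
    print([s.name for s in eligible_subflows([wifi, lte])])
```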
We propose a new mode of operation: the underperforming mode.
Subflows are still established in the primary or backup mode as
explained above. A subflow enters the underperforming mode as soon
as its retransmission timer (RTO) reaches a configurable limit. At
this point, the subflow is considered to be underperforming. An
underperforming subflow cannot be selected for data transmission if
there exists another subflow in primary or backup mode. Once a
subflow has been tagged as underperforming, it remains in this mode
as long as there are unacknowledged data on this subflow. Once all
data has been acknowledged, it may return to the primary or backup
mode. Further experimentation is required to evaluate how quickly an
underperforming subflow should leave the underperforming mode once
all data has been acknowledged.
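
The following Python sketch illustrates one possible reading of the
underperforming mode. All names are hypothetical. In particular, the
sketch returns a subflow to its configured mode as soon as all its
data has been acknowledged, which is only one of the policies that
experimentation would need to evaluate.

```python
from enum import Enum


class Mode(Enum):
    PRIMARY = "primary"
    BACKUP = "backup"
    UNDERPERFORMING = "underperforming"


class Subflow:
    def __init__(self, name, configured_mode, uperf_timeout):
        self.name = name
        self.configured_mode = configured_mode  # PRIMARY or BACKUP
        self.mode = configured_mode
        self.uperf_timeout = uperf_timeout      # RTO limit, in seconds
        self.unacked = 0                        # bytes not yet acknowledged

    def on_rto_change(self, rto):
        """Tag the subflow once its RTO reaches the configured limit."""
        if rto >= self.uperf_timeout:
            self.mode = Mode.UNDERPERFORMING

    def on_ack(self, acked_bytes):
        """Leave the underperforming mode once all data is acknowledged.

        Assumption: the subflow returns immediately to its configured mode.
        """
        self.unacked = max(0, self.unacked - acked_bytes)
        if self.mode is Mode.UNDERPERFORMING and self.unacked == 0:
            self.mode = self.configured_mode


def candidates(subflows):
    """Prefer primary, then backup, then underperforming subflows."""
    for mode in (Mode.PRIMARY, Mode.BACKUP, Mode.UNDERPERFORMING):
        chosen = [s for s in subflows if s.mode is mode]
        if chosen:
            return chosen
    return []


if __name__ == "__main__":
    wifi = Subflow("wifi", Mode.PRIMARY, uperf_timeout=2.0)
    lte = Subflow("lte", Mode.BACKUP, uperf_timeout=2.0)
    wifi.unacked = 1000
    wifi.on_rto_change(2.0)       # RTO reached the limit on the primary
    print([s.name for s in candidates([wifi, lte])])
    wifi.on_ack(1000)             # all outstanding data acknowledged
    print([s.name for s in candidates([wifi, lte])])
```

With this policy, the backup subflow is selected as soon as the RTO of
the primary subflow reaches the limit, without terminating the primary
subflow.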
System administrators and/or application developers (e.g. through a
socket option) should be able to specify the maximum RTO that causes
a Multipath TCP subflow to be tagged as underperforming. For this,
we propose two new parameters:
o UPERF_ADV_TO: the upper threshold on the RTO that forces the
subflow to be considered as underperforming
o UPERF_REMOTE_TO: the upper threshold on the RTO received from the
remote peer
The UPERF_ADV_TO is configured locally on the host. It could be
configured globally or on a per connection basis. The configuration
applies to all subflows of a Multipath TCP connection.
The UPERF_REMOTE_TO is received in a Multipath TCP option. This
value applies only on the subflow over which it has been received.
The UPERF_TIMEOUT that is used to detect underperforming subflows is
then computed by using the following formula:
UPERF_TIMEOUT = min(U_LIMIT, max(UPERF_ADV_TO, UPERF_REMOTE_TO,
L_LIMIT))
If a USER_TIMEOUT is defined for the Multipath TCP connection, its
value MUST be larger than the UPERF_TIMEOUT.
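
A minimal Python sketch of this computation and of the constraint on
the USER_TIMEOUT is given below. The function names and the numeric
values are assumptions made for the example; all values are expressed
in milliseconds.

```python
def uperf_timeout(uperf_adv_to, uperf_remote_to, l_limit, u_limit):
    """Compute UPERF_TIMEOUT following the formula above."""
    return min(u_limit, max(uperf_adv_to, uperf_remote_to, l_limit))


def check_user_timeout(user_timeout, uperf):
    """Verify that a configured USER_TIMEOUT is larger than UPERF_TIMEOUT.

    `user_timeout` is None when no USER_TIMEOUT is defined for the
    connection, in which case there is nothing to check.
    """
    if user_timeout is not None and user_timeout <= uperf:
        raise ValueError("USER_TIMEOUT must be larger than UPERF_TIMEOUT")


if __name__ == "__main__":
    # Illustrative values in milliseconds, not recommendations.
    limit = uperf_timeout(uperf_adv_to=500, uperf_remote_to=800,
                          l_limit=200, u_limit=5000)
    print(limit)
    check_user_timeout(100000, limit)   # accepted: larger than the limit
```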
The UPERF_REMOTE_TO can be signaled by using a Multipath TCP option
to the remote peer. This document proposes the following
experimental option to encode this information (Figure 2):
                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+---------------+---------------+-------+-----------------------+
|     Kind      |    Length     |Subtype| Flags |  Experiment   |
+---------------+---------------+-------+-------+---------------+
|         Id. (16 bits)         |  Maximum RTO (milliseconds)   |
+---------------------------------------------------------------+
Figure 2: The UPERF Maximum RTO experimental Multipath TCP option
We do not use the same encoding as [RFC5482] because the encoding of
the User Timeout option cannot represent maximum RTOs that are smaller
than one second. There are already use cases where users are not
willing to wait that long before switching to a backup subflow.
The Experiment Identifier should be TBD and the flags must be used as
defined in [I-D.bonaventure-mptcp-exp-option].
If experiments conducted with this option show positive results, it
could be possible to update the MP_PRIO option to encode the maximum
RTO information as shown in Figure 3.
                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+---------------+---------------+-------+-----+-+--------------+
|     Kind      |    Length     |Subtype|     |B| AddrID (opt) |
+---------------+---------------+-------+-----+-+--------------+
|          Maximum RTO (milliseconds)           |
+-----------------------------------------------+
Figure 3: The UPERF Maximum RTO Multipath TCP option
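
As an illustration of this possible format, the Python sketch below
serializes and parses such an option. The Kind (30) and the MP_PRIO
subtype (0x5) come from [RFC6824]. The rest is an assumption made for
the example: the AddrID is always present, the Maximum RTO field is 24
bits wide as the figure appears to draw it, and the resulting Length
is 7. The function names are hypothetical.

```python
import struct

MPTCP_KIND = 30          # TCP option kind of Multipath TCP [RFC6824]
MP_PRIO_SUBTYPE = 0x5    # MP_PRIO subtype [RFC6824]
ASSUMED_LENGTH = 7       # assumption: 4 bytes + 24-bit Maximum RTO


def encode_mp_prio_with_rto(backup, addr_id, max_rto_ms):
    """Build an MP_PRIO option extended with a Maximum RTO field."""
    if not 0 <= addr_id <= 0xFF:
        raise ValueError("AddrID must fit in 8 bits")
    if not 0 <= max_rto_ms <= 0xFFFFFF:
        raise ValueError("Maximum RTO must fit in the assumed 24 bits")
    # Subtype in the upper 4 bits, 3 reserved bits, then the B bit.
    third = (MP_PRIO_SUBTYPE << 4) | (1 if backup else 0)
    header = struct.pack("!BBBB", MPTCP_KIND, ASSUMED_LENGTH, third, addr_id)
    return header + max_rto_ms.to_bytes(3, "big")


def decode_mp_prio_with_rto(option):
    """Return (backup, addr_id, max_rto_ms) from an encoded option."""
    kind, length, third, addr_id = struct.unpack("!BBBB", option[:4])
    if (kind != MPTCP_KIND or length != ASSUMED_LENGTH
            or third >> 4 != MP_PRIO_SUBTYPE):
        raise ValueError("not an extended MP_PRIO option")
    return bool(third & 0x01), addr_id, int.from_bytes(option[4:7], "big")


if __name__ == "__main__":
    opt = encode_mp_prio_with_rto(backup=True, addr_id=2, max_rto_ms=500)
    print(opt.hex())
    print(decode_mp_prio_with_rto(opt))
```

Expressing the Maximum RTO in milliseconds allows a bound such as 500
ms, which the encoding of [RFC5482] cannot represent.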
4. Security considerations
This document does not modify the security considerations for
Multipath TCP.
5. IANA considerations
This document proposes the UPERF experimental Multipath TCP option
whose experiment identifier is TBD.
If experiments are successful, an update to this document will
propose a new format for the MP_PRIO option defined in [RFC6824].
6. Conclusion
In this document, we have first explained some issues with the
handling of backup subflows by Multipath TCP. Multipath TCP meets
the expectations of its users when subflows fail completely. In this
case, Multipath TCP moves the traffic over the backup subflows.
However, if the primary subflows underperform, Multipath TCP
implementations may try to retransmit data over such subflows for a
long period of time instead of switching quickly to the backup
subflow. We have then proposed to set an upper bound on the
retransmission timer (RTO) to detect underperforming subflows. This
bound can be set locally or exchanged through the proposed UPERF
Multipath TCP option.
7. Acknowledgements
This work was partially supported by the FP7-Trilogy2 project. We
would like to thank Mohamed Boucadair for his useful suggestions and
comments on this document.
8. References
8.1. Normative References
[RFC6824] Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
"TCP Extensions for Multipath Operation with Multiple
Addresses", RFC 6824, January 2013.
8.2. Informative References
[BD2015] Baerts, M. and Q. De Coninck, "Multipath TCP with Real
Smartphone Applications", Master Thesis, UCL, June 2015.
[Cellnet12]
Paasch, C., Detal, G., Duchene, F., Raiciu, C., and O.
Bonaventure, "Exploring Mobile/WiFi Handover with
Multipath TCP", ACM SIGCOMM workshop on Cellular Networks
(Cellnet12), 2012,
<http://inl.info.ucl.ac.be/publications/
exploring-mobilewifi-handover-multipath-tcp>.
[I-D.bonaventure-mptcp-exp-option]
Bonaventure, O., Hesmans, B., and M.
Boucadair, "Experimental Multipath TCP option", draft-
bonaventure-mptcp-exp-option-00 (work in progress), June
2015.
[RFC0793] Postel, J., "Transmission Control Protocol", STD 7, RFC
793, September 1981.
[RFC1122] Braden, R., "Requirements for Internet Hosts -
Communication Layers", STD 3, RFC 1122, October 1989.
[RFC5461] Gont, F., "TCP's Reaction to Soft Errors", RFC 5461,
February 2009.
[RFC5482] Eggert, L. and F. Gont, "TCP User Timeout Option", RFC
5482, March 2009.
[RFC5533] Nordmark, E. and M. Bagnulo, "Shim6: Level 3 Multihoming
Shim Protocol for IPv6", RFC 5533, June 2009.
[RFC5534] Arkko, J. and I. van Beijnum, "Failure Detection and
Locator Pair Exploration Protocol for IPv6 Multihoming",
RFC 5534, June 2009.
[RFC5880] Katz, D. and D. Ward, "Bidirectional Forwarding Detection
(BFD)", RFC 5880, June 2010.
[RFC5883] Katz, D. and D. Ward, "Bidirectional Forwarding Detection
(BFD) for Multihop Paths", RFC 5883, June 2010.
[RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent,
"Computing TCP's Retransmission Timer", RFC 6298, June
2011.
Authors' Addresses
Olivier Bonaventure
UCLouvain
Email: Olivier.Bonaventure@uclouvain.be
Quentin De Coninck
UCLouvain
Email: Quentin.Deconinck@student.uclouvain.be
Matthieu Baerts
UCLouvain
Email: Matthieu.Baerts@student.uclouvain.be
Fabien Duchene
UCLouvain
Email: Fabien.Duchene@uclouvain.be
Benjamin Hesmans
UCLouvain
Email: Benjamin.Hesmans@uclouvain.be