IPv6 Maintenance                                               J. Massar
Internet-Draft                                         Massar Networking
Updates: 6437 (if approved)                            November 13, 2014
Intended status: Standards Track
Expires: May 17, 2015


                           The IPv6 MTU Label
                    draft-massar-v6man-mtu-label-02

Abstract

   This document redefines the use of the IPv6 Flow Label field to allow
   specification of the lowest MTU on the path that a packet travels,
   thus in most cases avoiding the need for Path MTU Discovery and the
   round-trip penalty incurred by processing the ICMPv6 Packet Too Big
   (PTB) message and retransmitting the packets involved.

   This specification allows graceful decrease of MTU so that large and
   non-standard MTU sizes can safely be used on the Internet.

   [[A1: Obsoleting the IPv6 Flow Label and replacing it completely with
   this field might be a better option than keeping it defined as an
   IPv6 Flow Label field and allowing existing flows to exist.  --JM]]
   [[A2: A better name than "IPv6 MTU Label" is requested as it is not
   really a "Label".  --JM]]

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 17, 2015.








Massar                    Expires May 17, 2015                  [Page 1]

Internet-Draft             The IPv6 MTU Label              November 2014


Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   3
   3.  IPv6 MTU Label Format . . . . . . . . . . . . . . . . . . . .   4
   4.  Node Requirements . . . . . . . . . . . . . . . . . . . . . .   5
   5.  Updating the MTU Label  . . . . . . . . . . . . . . . . . . .   6
   6.  Maximum size of packets . . . . . . . . . . . . . . . . . . .   9
   7.  Handling network changes  . . . . . . . . . . . . . . . . . .   9
   8.  MTU Rediscovery . . . . . . . . . . . . . . . . . . . . . . .  10
   9.  Security Considerations . . . . . . . . . . . . . . . . . . .  10
     9.1.  Spoofing the MTU Label  . . . . . . . . . . . . . . . . .  10
     9.2.  Firewall treatment of the MTU Label . . . . . . . . . . .  10
   10. Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  11
   11. References  . . . . . . . . . . . . . . . . . . . . . . . . .  11

1.  Introduction

   When a packet does not fit the MTU of the next link, the IPv6
   protocol specifies that an ICMPv6 Packet Too Big (PTB) [RFC4443]
   error must be sent back to the originator of the packet.  The
   originating host receives this error and, based on the MTU it
   reports, resends the data in packets that fit that MTU.  This
   process is called Path MTU Discovery [RFC1191].

   Unfortunately there are broken networks that filter ICMPv6
   altogether, even though this is against the IPv6 specification.  This
   especially breaks TCP [RFC0793]: the oversized packets are dropped
   silently, a retransmission timer has to fire before the packet is
   sent again, and as the retransmitted packet is also too large for
   the link it does not arrive at the destination either.  Such a
   connection becomes stuck and eventually times out.

   In addition, for various load-balancing implementations it is
   apparently a heavy task to correlate incoming ICMPv6 PTBs to the
   original packets (which are for the most part included in the ICMPv6
   PTB) and then deliver each PTB to the backend node that originally
   transmitted the packet, so that that node can retransmit the data in
   smaller chunks.

   The extra round-trip of the ICMPv6 PTB and the need for the
   load-balancer to figure out where to forward it are a problem for
   large hosting providers who want to minimize latency and maximize
   throughput [I-D.v6ops-pmtud-ecmp-problem].

   This specification mitigates these problems, in part, by allowing
   routers to include the lowest MTU on the path in the IPv6 packet's
   Flow Label field.

   An alternative technique to solve this problem is Packetization Layer
   Path MTU Discovery [RFC4821].  That method requires extra packets,
   so-called "probes", to be sent, or space to be left in the packet
   for the extra information that transfers the MTU details.  By using
   the former IPv6 Flow Label field we avoid these requirements.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   Fields and numbers specified in this document are in network byte
   order (Big Endian).

   This section lists a few terms explicitly, as they might easily be
   confused with each other.  [[T1: Clean these up and add more terms
   that might be confusing --JM]]

   IPv6 Payload Length  The length of the payload (data) carried in an
      IPv6 packet (excluding the IPv6 header).

   Maximum Transmission Unit (MTU)  The maximum length of a full packet
      (including IPv6 header size).

   Ingress Interface  Network Interface where a packet is received.

   Egress Interface  Network Interface where a packet is sent out.

   Ingress MTU  The MTU of the ingress interface.

   Egress MTU  The MTU of the egress interface.

   Node  A device that implements IP.

   Path  The set of links traversed by a packet between a source node
      and a destination node.

   Path MTU, or PMTU  The minimum link MTU of all the links in a path
      between a source node and a destination node.

   PTB (Packet Too Big) message  An ICMPv6 message reporting that an
      IPv6 packet is too large to forward.

   MSS  The TCP Maximum Segment Size [RFC6691], the maximum payload size
      available to the TCP layer.  This is typically the Path MTU minus
      the size of the IP and TCP headers.
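
   As a worked example of the MSS definition above, the following
   sketch subtracts the fixed IPv6 header (40 octets) and the TCP header
   without options (20 octets) from a Path MTU.  The function name is
   illustrative only and not part of this specification; extension
   headers and TCP options are ignored.

```python
IPV6_HEADER_LEN = 40  # fixed IPv6 header, in octets
TCP_HEADER_LEN = 20   # TCP header without options, in octets


def mss_for_path_mtu(path_mtu: int) -> int:
    """Return the TCP MSS implied by an IPv6 Path MTU (illustrative helper)."""
    return path_mtu - IPV6_HEADER_LEN - TCP_HEADER_LEN


for mtu in (1280, 1500, 9000):
    print(mtu, mss_for_path_mtu(mtu))
```

   For the IPv6 minimum MTU of 1280 this gives an MSS of 1220.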

3.  IPv6 MTU Label Format

   The IPv6 Flow Label field consists of 20 bits [RFC6437].

   The IPv6 Flow Label Field in the IPv6 Packet Header.

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Hosts not supporting this specification will treat the IPv6 Flow
   Label as an opaque number as defined in [RFC6437].

   When the first four bits of the IPv6 Flow Label field are all set to
   1, the Flow Label field is considered to be an "IPv6 MTU Label", or
   "MTU Label" for short.

   Format of the Flow Label field in "IPv6 MTU Label" mode.

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1 1 1 1|  Maximum Transmission Unit    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The Flow Label ID is 0xf0000.

   This specification allows an MTU in the range of 1280 - 65534 in
   this header field.

   When the MTU is less than 1280, it is considered invalid due to the
   IPv6 minimum MTU requirement.

   An MTU of 65535 indicates that IPv6 Jumbograms are in use.  Automatic
   MTU discovery does not work for these.  The MTU has to be configured
   properly on all nodes by the operator.  A future document might
   specify an Extension Header Option that contains the Jumbogram MTU
   size when the MTU is set to 65535.

   Following are a few examples of common MTU Labels.

   MTU Label with an MTU of 1280 (0x0500)

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1 1 1 1|0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The Flow Label ID is 0xf0500.

   MTU Label with an MTU of 1500 (0x05dc)

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1 1 1 1|0 0 0 0 0 1 0 1 1 1 0 1 1 1 0 0|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The Flow Label ID is 0xf05dc.

   MTU Label with an MTU of 9000 (0x2328)

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1 1 1 1|0 0 1 0 0 0 1 1 0 0 1 0 1 0 0 0|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The Flow Label ID is 0xf2328.
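
   The encoding shown in the examples above can be sketched in a few
   lines of Python.  This is a minimal illustration under the
   assumptions of this section, not a reference implementation: the
   function and constant names are hypothetical, and the header parser
   only looks at the first four octets of an IPv6 header (Version,
   Traffic Class, Flow Label).

```python
import struct

MTU_LABEL_PREFIX = 0xF0000  # first four bits of the 20-bit field set to 1
MTU_MASK = 0x0FFFF          # remaining 16 bits carry the MTU
IPV6_MIN_MTU = 1280
JUMBOGRAM_MTU = 65535       # signals that IPv6 Jumbograms are in use


def encode_mtu_label(mtu: int) -> int:
    """Return the 20-bit Flow Label value that carries `mtu`."""
    if not IPV6_MIN_MTU <= mtu <= JUMBOGRAM_MTU:
        raise ValueError(f"MTU {mtu} outside 1280..65535")
    return MTU_LABEL_PREFIX | mtu


def is_mtu_label(flow_label: int) -> bool:
    """True when the first four bits of the Flow Label are all 1."""
    return (flow_label & 0xFFFFF) >> 16 == 0xF


def decode_mtu_label(flow_label: int):
    """Return the MTU, or None if this is not a valid MTU Label."""
    if not is_mtu_label(flow_label):
        return None
    mtu = flow_label & MTU_MASK
    return mtu if mtu >= IPV6_MIN_MTU else None


def flow_label_from_header(header: bytes) -> int:
    """Extract the Flow Label from the first 4 octets of an IPv6 header."""
    (word,) = struct.unpack("!I", header[:4])  # network byte order
    return word & 0xFFFFF


for mtu in (1280, 1500, 9000):
    print(mtu, hex(encode_mtu_label(mtu)))
```

   The loop prints the three Flow Label IDs given above (0xf0500,
   0xf05dc and 0xf2328).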

4.  Node Requirements

   Every node on the path, including the source, intermediate routers
   and the destination, performs the following.

   If no MTU Label is present (the first four bits are not all '1') and
   the Flow Label is 0, the node MUST replace it with an MTU Label whose
   MTU value is the lower of the ingress and egress MTU of the
   interfaces involved in forwarding the packet.  A non-zero Flow Label
   is left undisturbed, which keeps backwards compatibility with nodes
   that do fill in the Flow Label field.  [[N1: See A1 - when obsoleting
   the IPv6 Flow Label we force replacing the field always --JM]]

   Each node MUST verify that the given MTU Label is valid (>= 1280).
   When an MTU Label with an MTU in the range of 0 - 1279 is
   encountered, it MUST be considered invalid and overwritten by the
   router with the proper MTU as far as the router knows it.  [[N2: Or
   we could send an ICMPv6 Parameter problem, but this allows a form of
   backward compatibility when the first four bits are all set --JM]]

   Each hop, including the source, that is transmitting or forwarding
   the packet MUST update the MTU Label to the lowest MTU for that path
   that it has knowledge of.  This includes the MTU of the ingress and
   egress interfaces, an MTU learnt from a different path between the
   same two points, and details provided by another protocol (e.g.,
   TCP).  MSS clamping [RFC6691] thus can affect the MTU Label if an
   implementation has this knowledge.

   The first packet being sent in each direction of a path MUST have a
   maximum size of 1280, while setting the MTU Label to the MTU of the
   egress interface.

   A destination node MUST use the MTU found in the MTU Label for
   packets sent subsequently to the source, including that MTU in the
   MTU Label of those packets.  This allows the source host to learn the
   MTU of the full path.

   Network equipment SHOULD have a configuration option that forces
   overwriting of Flow Labels that are not MTU Labels, so that the
   equipment always handles the MTU Label.  As this might cause issues
   where Flow Labels are actually used, it is not the default.
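
   The per-node rules of this section can be summarized in a short
   Python sketch.  It is one possible reading of the requirements, not a
   normative algorithm.  The function name, the "known_path_mtu"
   parameter (standing in for knowledge learnt from another path or
   protocol) and the "force_overwrite" flag (standing in for the
   configuration option above) are assumptions made for illustration.
   Link MTUs are assumed to lie in the range 1280 - 65534.

```python
IPV6_MIN_MTU = 1280
MTU_LABEL_PREFIX = 0xF0000


def update_label(flow_label, ingress_mtu, egress_mtu,
                 known_path_mtu=None, force_overwrite=False):
    """Return the Flow Label value a node would place in the packet.

    ingress_mtu is None at the source, which has no ingress interface.
    """
    candidates = [m for m in (ingress_mtu, egress_mtu, known_path_mtu)
                  if m is not None]
    local_mtu = min(candidates)  # lowest MTU this node knows about

    if (flow_label >> 16) != 0xF:
        # Not an MTU Label.  A non-zero value is a regular Flow Label
        # and is left alone unless the operator forces overwriting.
        if flow_label != 0 and not force_overwrite:
            return flow_label
        return MTU_LABEL_PREFIX | local_mtu

    carried = flow_label & 0xFFFF
    if carried < IPV6_MIN_MTU:
        # Invalid MTU Label: overwrite with what this node knows.
        return MTU_LABEL_PREFIX | local_mtu

    # Valid MTU Label: only ever lower it.
    return MTU_LABEL_PREFIX | min(carried, local_mtu)


print(hex(update_label(0, None, 1500)))        # source sets the label
print(hex(update_label(0xF05DC, 1500, 1480)))  # router lowers it to 1480
```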

5.  Updating the MTU Label

   Given below is an example network along with a table of actions per
   node that illustrates how a label is updated and how the knowledge
   learnt is used.

   A typical asymmetric network as found on the Internet

               +------------+
               |   Host A   |
               +------------+
                      |
                      | MTU=1500
                      |
               +-------------+
               |  Router 1   |
               +-------------+
                 |         ^
       MTU=1500  |         | MTU=1500
                 v         |
    +--------------+     +--------------+
    |   Router 2   |     |   Router 7   |
    +--------------+     +--------------+
           |                    ^
           |                    |
           | MTU=1480           | MTU=1280
           |                    |
           |             +--------------+
           |             |   Router 6   |
           |             +--------------+
           |                    ^
           |                    | MTU=1500
           v                    |
    +--------------+     +--------------+
    |   Router 3   |     |   Router 5   |
    +--------------+     +--------------+
                 |         ^
        MTU=1500 |         | MTU=1500
                 v         |
               +-------------+
               |  Router 4   |
               +-------------+
                      |
                      | MTU=9000
                      |
               +-------------+
               |    Host B   |
               +-------------+

   In this example network the routing protocols involved cause an
   asymmetric routing of packets.

   When Host A sends a packet to Host B, the path is: HA, R1, R2, R3,
   R4, HB.  The return path for a packet from Host B to Host A is: HB,
   R4, R5, R6, R7, R1, HA.

   Given that network the following decisions are made.

   +------+------------------------+-----------+-----------+-----------+
   | Node | Decision               | Incoming  | Outgoing  | MTU Label |
   |      |                        | Link MTU  | Link MTU  | Change    |
   +------+------------------------+-----------+-----------+-----------+
   | HA   | No knowledge, thus use | -         | 1500      | 1500      |
   |      | outgoing link MTU      |           |           |           |
   | R1   | Outbound not below     | 1500      | 1500      | "         |
   |      | label, don't update    |           |           |           |
   | R2   | Outbound below label,  | 1500      | 1480      | 1480      |
   |      | update (possible PTB)  |           |           |           |
   | R3   | Outbound not below     | 1480      | 1500      | "         |
   |      | label, don't update    |           |           |           |
   | R4   | Outbound not below     | 1500      | 9000      | "         |
   |      | label, don't update    |           |           |           |
   | HB   | Remember A to B = 1480 | 9000      | -         | "         |
   +------+------------------------+-----------+-----------+-----------+
   |      | Reply packet:          |           |           |           |
   | HB   | Learnt A-B = 1480,     | -         | 9000      | 1480      |
   |      | lower than link MTU,   |           |           |           |
   |      | use it                 |           |           |           |
   | R4   | Outbound not below     | 9000      | 1500      | "         |
   |      | label, don't update    |           |           |           |
   | R5   | Outbound not below     | 1500      | 1500      | "         |
   |      | label, don't update    |           |           |           |
   | R6   | Outbound below label,  | 1500      | 1280      | 1280      |
   |      | update (possible PTB)  |           |           |           |
   | R7   | Outbound not below     | 1280      | 1500      | "         |
   |      | label, don't update    |           |           |           |
   | R1   | Outbound not below     | 1500      | 1500      | "         |
   |      | label, don't update    |           |           |           |
   | HA   | Learnt B-A is 1280     | 1500      | -         | "         |
   +------+------------------------+-----------+-----------+-----------+

                       Table 1: How MTU gets updated
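
   The walk through Table 1 can be reproduced with a small simulation.
   The sketch below is illustrative only; the function name and the
   list representation of a path are assumptions.  Because the ingress
   MTU of each router in the example equals the egress MTU of the
   previous hop, taking the running minimum of the egress MTUs is
   sufficient here.

```python
def walk(egress_mtus, learnt=None):
    """Return the MTU carried in the MTU Label after each hop.

    egress_mtus: egress link MTU of the source and of each router on
                 the path, in forwarding order.
    learnt:      MTU the source already learnt from the peer, if any.
    """
    trace = []
    label = None
    for hop, mtu in enumerate(egress_mtus):
        if hop == 0:
            # The source starts from its egress MTU, lowered by
            # anything it has learnt from the other direction.
            label = mtu if learnt is None else min(mtu, learnt)
        else:
            # Routers only ever lower the label.
            label = min(label, mtu)
        trace.append(label)
    return trace


# Host A to Host B: HA, R1, R2, R3, R4
a_to_b = walk([1500, 1500, 1480, 1500, 9000])
# Host B to Host A: HB, R4, R5, R6, R7, R1
b_to_a = walk([9000, 1500, 1500, 1280, 1500, 1500], learnt=a_to_b[-1])

print(a_to_b)  # [1500, 1500, 1480, 1480, 1480]
print(b_to_a)  # [1480, 1480, 1480, 1280, 1280, 1280]
```

   The last element of each list is the value the receiving host sees:
   1480 at Host B and 1280 at Host A, matching the table.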

   In this situation, there are two locations where a PTB could have
   been sent (R2->R3 and R6->R7).  But as the first packet sent in a
   direction MUST be sized at a maximum of 1280, no PTB is possible.
   After this first packet has passed, the real MTU is learned and
   this possibly higher MTU can be used.

   This does demonstrate that even with this extra information, PMTU
   blackholes might not always be avoided.  Nor does the MTU Label
   remove the need to handle PTBs or to retransmit data in smaller
   packets.  Hence, sending, receiving, forwarding and handling ICMPv6
   remains important, and ICMPv6 MUST NOT be filtered.

   Note that the above table does not mention any of the standard
   functions and checks, like updating the Hop Limit, that a router is
   supposed to perform as per the IPv6 protocol.

6.  Maximum size of packets

   To facilitate learning the MTU of the complete path, a minimum of 3
   packets needs to be sent between the same source and destination
   host.  The 3rd and subsequent packets will carry the correct
   information.

   +------+--------------+----------------+----------------------------+
   | Step | Maximum      | Description    | Destination Learns         |
   |      | Packet Size  |                |                            |
   +------+--------------+----------------+----------------------------+
   | 1    | 1280         | Packet sent    | B learns MTU on path A-B   |
   |      |              | from A to B    |                            |
   | 2    | 1280         | Packet sent    | A learns MTU on path A-B-A |
   |      |              | from B to A    |                            |
   | 3+   | MTU max      | Packet sent    | B learns MTU for full      |
   |      |              | from A to B    | round-trip path A-B-A-B    |
   +------+--------------+----------------+----------------------------+

                       Table 2: Maximum Packet Size

   The 3rd and further packets have knowledge of the MTU of the full
   round-trip path and thus can use this information to send larger
   packets.

   This example assumes that single packets are sent alternately in each
   direction.  When a host sends multiple packets in a row, it should
   stay at the same step as for the previous packet until it receives a
   return packet.

   Note that a TCP handshake (3 packets) is thus enough to learn the
   correct MTU based on the MTU Label.  In that case a host might decide
   to also use the TCP MSS as additional information.
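
   The stepping described in this section can be modelled as per-peer
   state on each host.  The following sketch is an interpretation, not
   part of the specification: the class, its attribute names and the
   "deliver" helper are hypothetical, and it simplifies by treating any
   packet received after the host has sent one as a return packet.  The
   path MTUs used in the demonstration (4000 and 1500) are made-up
   values chosen so that the effect is visible.

```python
IPV6_MIN_MTU = 1280


class PeerMtuState:
    """State a host might keep per destination (hypothetical)."""

    def __init__(self, egress_mtu):
        self.egress_mtu = egress_mtu
        self.learnt_mtu = None   # MTU from the last MTU Label received
        self.sent_any = False    # have we sent to this peer yet?
        self.confirmed = False   # has the peer answered one of our packets?

    def max_packet_size(self):
        # Stay at 1280 until a return packet has been received.
        if not self.confirmed:
            return IPV6_MIN_MTU
        return min(self.egress_mtu, self.learnt_mtu)

    def label_for_next_packet(self):
        if self.learnt_mtu is None:
            return self.egress_mtu
        return min(self.egress_mtu, self.learnt_mtu)

    def on_send(self):
        self.sent_any = True

    def on_receive(self, label_mtu):
        self.learnt_mtu = label_mtu
        if self.sent_any:
            self.confirmed = True


def deliver(sender, receiver, path_min_mtu):
    """Send one packet; routers lower the label to the path minimum."""
    size = sender.max_packet_size()
    label = min(sender.label_for_next_packet(), path_min_mtu)
    sender.on_send()
    receiver.on_receive(label)
    return size, label


a, b = PeerMtuState(9000), PeerMtuState(9000)
steps = [
    deliver(a, b, 4000),  # step 1: A to B
    deliver(b, a, 1500),  # step 2: B to A
    deliver(a, b, 4000),  # step 3: A to B
    deliver(b, a, 1500),  # step 4: B to A
]
for size, label in steps:
    print(size, label)
```

   In this run the first two packets are limited to 1280, and from the
   third packet on both hosts send at the round-trip MTU of 1500.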

7.  Handling network changes

   As the process of MTU Labeling happens per packet, new information
   will become available to the host continuously.  When a sending host
   learns that the MTU is lower than the size of a packet it recently
   sent, it could decide to resend that packet's data directly, to
   prevent it from being blackholed upstream.
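
   As a small illustration of that decision, the sketch below selects
   which recently sent packets exceed a newly learnt MTU.  How a host
   tracks recently sent packets is not specified by this document; the
   list of sizes and the function name are assumptions.

```python
def packets_to_resend(recent_packet_sizes, learnt_mtu):
    """Return the indices of recent packets larger than learnt_mtu."""
    return [index for index, size in enumerate(recent_packet_sizes)
            if size > learnt_mtu]


# Four packets were sent recently; the label now reports an MTU of 1400.
print(packets_to_resend([1280, 1500, 1400, 1480], 1400))  # [1, 3]
```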

8.  MTU Rediscovery

   A host can decide that with a long-standing connection a re-probe of
   the MTU is needed.  It can do so by ignoring the cached information
   and sending a packet with a maximum size of 1280 while setting the
   MTU Label to the largest MTU its egress link supports.  This is
   similar to the situation a first packet would be in and thus restarts
   the process at the highest possible MTU.
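
   One way a host could schedule such a re-probe is with a timer on the
   cached value, as sketched below.  This document does not define when
   to re-probe; the "reprobe_after" interval, the class name and the
   injectable clock are assumptions for illustration, and the sketch
   leaves out the return-packet stepping of the "Maximum size of
   packets" section.

```python
import time

IPV6_MIN_MTU = 1280


class CachedPathMtu:
    """Cached MTU for one destination with a re-probe timer (sketch)."""

    def __init__(self, egress_mtu, reprobe_after=600.0,
                 clock=time.monotonic):
        self.egress_mtu = egress_mtu
        self.reprobe_after = reprobe_after  # seconds; assumed value
        self.clock = clock
        self.learnt_mtu = None
        self.learnt_at = None

    def on_receive(self, label_mtu):
        self.learnt_mtu = label_mtu
        self.learnt_at = self.clock()

    def next_packet(self):
        """Return (maximum packet size, MTU to put in the MTU Label)."""
        expired = (self.learnt_mtu is not None and
                   self.clock() - self.learnt_at >= self.reprobe_after)
        if self.learnt_mtu is None or expired:
            # Behave like a first packet: small packet, largest label.
            self.learnt_mtu = None
            return IPV6_MIN_MTU, self.egress_mtu
        mtu = min(self.egress_mtu, self.learnt_mtu)
        return mtu, mtu


now = [0.0]  # fake clock so the example is deterministic
cache = CachedPathMtu(9000, reprobe_after=10.0, clock=lambda: now[0])
first = cache.next_packet()
cache.on_receive(1500)
steady = cache.next_packet()
now[0] = 11.0
reprobe = cache.next_packet()
print(first, steady, reprobe)
```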

9.  Security Considerations

9.1.  Spoofing the MTU Label

   An adversary might spoof IP packets from a source to a destination
   with a deliberately incorrect MTU Label.  An adversary might also
   act as a man-in-the-middle and alter the MTU Label in transit.

   The effect of doing so will be minimal though, as any intermediary
   router will correct the MTU to the value it knows to be correct based
   on the interfaces the packet flows over.

   The only negative outcome can be that the packet size is reduced to
   the minimum of 1280.  The result is a minor performance impact on
   that path until MTU Rediscovery happens and the MTU is scaled upward
   again.

   Of course networks should follow network best practices and employ
   anti-spoofing techniques to prevent this kind of attack.

9.2.  Firewall treatment of the MTU Label

   A firewall might consider the MTU Label untrusted.

   As the firewall knows that the correct value for the MTU Label is
   between 1280, the IPv6 minimum MTU, and its own link MTU, it can act
   like every node that supports the MTU Label and limit the MTU to that
   range.  See Section 4 (Node Requirements) for further details.

   A strict firewall, where the operator does not want to take any risk,
   could even force an MTU of 1280, at the cost of performance.
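
   Both firewall behaviours described in this section amount to a
   clamp on the MTU carried in the label.  The sketch below is
   illustrative; the function name and the "strict" flag are
   assumptions and do not correspond to any existing firewall
   configuration.

```python
IPV6_MIN_MTU = 1280


def firewall_clamp(label_mtu, link_mtu, strict=False):
    """Return the MTU a firewall would write into the MTU Label."""
    if strict:
        # Take no risk: always fall back to the IPv6 minimum MTU.
        return IPV6_MIN_MTU
    # Limit the value to the range 1280 .. own link MTU.
    return max(IPV6_MIN_MTU, min(label_mtu, link_mtu))


print(firewall_clamp(9000, 1500))               # 1500
print(firewall_clamp(600, 1500))                # 1280
print(firewall_clamp(9000, 1500, strict=True))  # 1280
```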









10.  Acknowledgements

   Thanks must go to Lorenzo Colitti for bringing
   [I-D.v6ops-pmtud-ecmp-problem] to the attention of the author.

   Many thanks to Brian Carpenter for many insightful comments that
   clarified this specification a lot.

   Matyas Koszik mentioned that the first packet on each link should be
   1280 for the MTU discovery to work in both directions which resulted
   in the "Maximum size of packets" section.

11.  References

   [I-D.v6ops-pmtud-ecmp-problem]
              Byerly, M., Hite, M., and J. Jaeggli, "Close encounters of
              the ICMP type 2 kind (near misses with ICMPv6 PTB)",
              draft-v6ops-pmtud-ecmp-problem-00 (work in progress),
              August 2014.

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7, RFC
              793, September 1981.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              November 1990.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
              Message Protocol (ICMPv6) for the Internet Protocol
              Version 6 (IPv6) Specification", RFC 4443, March 2006.

   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
              Discovery", RFC 4821, March 2007.

   [RFC6437]  Amante, S., Carpenter, B., Jiang, S., and J. Rajahalme,
              "IPv6 Flow Label Specification", RFC 6437, November 2011.

   [RFC6691]  Borman, D., "TCP Options and Maximum Segment Size (MSS)",
              RFC 6691, July 2012.

Author's Address

   Jeroen Massar
   Massar Networking
   Swiss Post Box 101811
   Zuercherstrasse 161
   Zuerich  CH-8010
   CH

   EMail: jeroen@massar.ch
   URI:   http://jeroen.massar.ch