Network Working Group                                    F. Templin, Ed.
Internet-Draft                                      Boeing Phantom Works
Intended status: Informational                        September 29, 2007
Expires: April 1, 2008


   Trivial Packetization Layer Path MTU Discovery for IP/*/IP Tunnels
                  draft-templin-inetmtu-trivial-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 1, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   The nominal Maximum Transmission Unit (MTU) of the Internet has
   become 1500 bytes, but existing IP/*/IP tunneling mechanisms impose
   an encapsulation overhead that can reduce the effective path MTU to
   smaller values.  Additionally, existing tunneling mechanisms are
   limited in their ability to discover and utilize larger MTUs.  This
   document specifies a trivial mechanism for determining maximum packet
   sizes over tunnels that address these issues.




Templin                   Expires April 1, 2008                 [Page 1]

Internet-Draft           T/PLPMTUD  for Tunnels           September 2007


1.  Introduction

   The nominal Maximum Transmission Unit (MTU) of today's Internet has
   become 1500 bytes due to the preponderance of networking gear that
   configures an MTU of that size.  Since not all links in the Internet
   configure a 1500 byte MTU, however [RFC3819], packets can be dropped
   due to an MTU restriction on the path.

   Upper layers see IP/*/IP tunnels as ordinary links, but even for
   packets no larger than 1500 bytes these links are susceptible to
   silent loss (e.g., due to path MTU restrictions, lost error messages,
   layered encapsulations, reassembly buffer limitations, etc.)
   resulting in poor performance and/or communications failures
   [RFC2923][RFC4459][RFC4821][RFC4963].

   This document specifies a trivial mechanism for determining maximum
   packet sizes over IP/*/IP tunnels.  It updates the functional
   specifications for Tunnel Endpoints (TEs) found in existing tunneling
   mechanisms (see: Section 9).


2.  Terminology

   The following abbreviations and terms are used in this document:

      DF - the IPv4 header "Don't Fragment" flag ([RFC0791], Section
      3.1).

      ENCAPS - the size of the encapsulating */IP headers plus trailers.

      IPv4 - Internet Protocol, Version 4 [RFC0791]

      IPv6 - Internet Protocol, Version 6 [RFC2460]

      IP - Internet Protocol, either Version 4 or Version 6

      MaxPktSize - Maximum Outer Packet size, in bytes

      MTU - Maximum Transmission Unit

      PTB - Packet Too Big error (either ICMPv4 [RFC1191] or ICMPv6
      [RFC1981])

      TE - Tunnel Endpoint

      TFE - Tunnel Far End





Templin                   Expires April 1, 2008                 [Page 2]

Internet-Draft           T/PLPMTUD  for Tunnels           September 2007


      TNE - Tunnel Near End

      IP/*/IP - an IP packet encapsulated in */IP headers (e.g. for "*"
      = NULL, UDP, TCP, AH, ESP, etc.)

      inner packet/header/payload - an IP packet/header/payload before
      IP/*/IP encapsulation.

      outer packet/header/payload - a */IP packet/header/payload after
      IP/*/IP encapsulation.

   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
   SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
   document, are to be interpreted as described in [RFC2119].


3.  Concept of Operation

   TEs that implement this scheme engage in a continuous probing process
   while data is flowing through the tunnel to determine the maximum
   outer packet size ('MaxPktSize') that can be conveyed.  When the flow
   of data through the tunnel is suspended, the probing process is
   discontinued.  Only the TNE need implement the scheme, therefore
   deployment can occur independently of deployment on the TFE.


4.  MTU

   TEs configure an indefinite MTU on the tunnel interface, i.e., there
   is no logical limit on the size of inner packets that upper layers
   can present to the tunnel interface.


5.  Tunnel Soft State

   TEs maintain the following per-TFE and/or per-flow conceptual
   variable(s) as soft state:

   MaxPktSize
      the current maximum length outer packet/fragment that can be
      accommodated by the path MTU without further fragmentation.  (See:
      [RFC3819], Section 2 for subnetwork MTU recommendations that
      influence 'MaxPktSize'.)

      Recommended default value: 128 bytes for */IPv4 tunnels, 1280
      bytes for */IPv6 tunnels.





Templin                   Expires April 1, 2008                 [Page 3]

Internet-Draft           T/PLPMTUD  for Tunnels           September 2007


6.  Sending Packets

   TEs send packets into the tunnel according to the following
   specifications.

6.1.  Conceptual Sending Algorithm

   With reference to Sections 6.2 - 6.4, TEs use the following
   conceptual sending algorithm:

           if inner packet is larger than ('MaxPktSize' - ENCAPS)
             and inner packet is not a probe
               if inner packet is not fragmentable (see: Section 6.2)
                   Send PTB appropriate to the inner protocol
                   with MTU = ('MaxPktSize' - ENCAPS).
                   Drop packet.
                   Return.
               else
                   Fragment inner packet into fragments no larger
                   than ('MaxPktSize' - ENCAPS) (see: Section 6.2).
               endif
           endif
           foreach inner packet/fragment
               Encapsulate as an outer packet (see: Section 6.3)
               and (for */IPv4 tunnels) set DF=1 in the outer IPv4
               header (see: Section 6.4).
               Send packet/fragment.
           endforeach

                  Figure 1: Conceptual Sending Algorithm

6.2.  Inner Packet Fragmentation

   An inner packet is "fragmentable" IFF the TE is permitted to break it
   into inner fragments before encapsulation, e.g., an IPv4 packet with
   DF=0, an IPv6 packet with a fragment header, etc.  Per the conceptual
   sending algorithm, the TE uses the IP fragmentation mechanism of the
   inner protocol to fragment inner packets into fragments no larger
   than ('MaxPktSize' - ENCAPS).

   Note that for IPv6/*/IPv4 tunnels, ('MaxPktSize' - ENCAPS) may not be
   large enough to accommodate the minimum IPv6 MTU such that the TE may
   be required to drop an inner IPv6 packet of 1280 bytes or smaller and
   return an ICMPv6 PTB with an MTU value less than 1280 bytes.  The
   original IPv6 source will then include a fragment header in
   subsequent IPv6 packets and the TE can then perform IPv6
   fragmentation on these inner packets using the fragment header
   included by the source per [RFC2460], Section 5.



Templin                   Expires April 1, 2008                 [Page 4]

Internet-Draft           T/PLPMTUD  for Tunnels           September 2007


6.3.  Encapsulation

   TEs encapsulate inner IP packets according to the specific IP/*/IP
   tunneling specification.  During encapsulation the TE also appends a
   trailer consisting of zero or more padding bytes, e.g., to probe the
   tunnel path MTU.  The TE increments the innermost '*' header length
   field by the number of trailer bytes added, for example it increments
   the UDP length field for IPv6/UDP/IPv4 tunnels, the IPv4 length field
   for IPv6/IPv4 tunnels, the outer IPv6 length field for IPv6/IPv6
   tunnels, etc.  The encapsulation is shown in Figure 2:

                          +---------------------------------+
                          |          Outer IP Header        |
                          |                                 |
                          +---------------------------------+
                          |            * Headers            |
                          |                                 |
   +-------------+        +---------------------------------+
   |   Inner IP  |        |             Inner IP            |
   ~   packet    ~  ===>  ~             packet              ~
   |             |        |                                 |
   +-------------+        +---------------------------------+
    Inner Packet          |                                 |
                          ~     Trailer (0 or more bytes)   ~
                          |                                 |
                          +---------------------------------+
                          |   Any */IP protocol trailers ...
                          +------------------------------
                                      Outer Packet

                Figure 2: Encapsulation Format with Trailer

6.4.  Setting DF in the Outer IPv4 Header

   TEs set DF=1 in the outer header of all IPv4 packets they send into a
   tunnel.


7.  Receiving Packet Too Big (PTB) Errors

   TEs may receive PTB errors in response to any outer packets that were
   admitted into the tunnel.  When the TE receives a PTB with an MTU
   value smaller than 'MaxPktSize', it SHOULD reduce 'MaxPktSize' and/or
   actively probe to discover and confirm a new 'MaxPktSize'.  It SHOULD
   also send a translated PTB back to the inner source if possible.






Templin                   Expires April 1, 2008                 [Page 5]

Internet-Draft           T/PLPMTUD  for Tunnels           September 2007


8.  Path MTU Probing

   TEs engage in a probing process while data is actively flowing
   through the tunnel to determine 'MaxPktSize'.  The TE SHOULD send
   progressively larger probes to determine the largest possible
   'MaxPktSize' value early in the probing process and SHOULD continue
   probing periodically thereafter.  When the TE receives sufficient
   evidence through probing that the forward path supports the probed
   size, it advances 'MaxPktSize' to the probed size.  The TE MAY send a
   series of probes in parallel to discover the minimum 'MaxPktSize' in
   the case of multipath routes with diverse path MTUs.

   TEs generate probe requests by creating a minimum-sized and
   unfragmentable IP echo request inner packet according to the inner IP
   protocol (e.g., an ICMPv6 [RFC4443] echo request when the inner
   protocol is IPv6).  The echo request MUST include a source address
   that corresponds to the TNE, and SHOULD include identifying
   information (e.g., sequence/identification numbers, nonce values,
   etc.) that will be echoed in the reply.  The destination address of
   the echo request may be either that of the TFE or a node that can
   only be reached through the TFE, e.g., the final destination.  The TE
   then encapsulates the echo request with trailing padding added to
   create an outer probe request of the desired probe size and sends it
   into the tunnel as specified in Section 6.

   The TE SHOULD probe frequently enough to refresh 'MaxPktSize' but
   SHOULD limit the rate at which it sends probe requests.  The TE
   SHOULD retain a cache of recently-sent probe requests to verify
   subsequent probe replies.

   When a TE receives a potential probe reply, it first determines that
   the inner IP echo reply matches one of its echo requests by examining
   the inner IP source and destination addresses as well as any other
   identifying information in the inner packet.  If the echo reply
   matches one of the TE's echo requests, it uses the probe reply for
   'MaxPktSize' determination.

   TEs discontinue the probing process and garbage-collect stale soft
   state for dormant tunnels and/or flows.

   Note that the TE may use probing to determine either an aggregate
   'MaxPktSize' for all flows that use the tunnel or an individual
   'MaxPktSize' for each flow (or some combination thereof).


9.  Updated Specifications

   This document updates the following specifications:



Templin                   Expires April 1, 2008                 [Page 6]

Internet-Draft           T/PLPMTUD  for Tunnels           September 2007


   o  RFC2003 (IP-in-IP)

   o  RFC2473 (Generic packet tunneling in IPv6)

   o  RFC2529 (6over4)

   o  RFC2661 (L2TP)

   o  RFC2784 (GRE)

   o  RFC3056 (6to4)

   o  RFC3378 (ETHERIP)

   o  RFC3884 (IPSec Transport Mode for Dynamic Routing)

   o  RFC4023 (MPLS-in-IP)

   o  RFC4213 (Basic IPv6 Transition Mechanisms)

   o  RFC4214 (ISATAP)

   o  RFC4301 (IPSec)

   o  RFC4302 (AH)

   o  RFC4303 (ESP)

   o  RFC4380 (TEREDO)

   o  LISP

   o  others....


10.  IANA Considerations

   There are no IANA considerations for this document.


11.  Security Considerations

   A possible attack vector involves an off-path attacker sending probe
   requests with spoofed source addresses.  Legitimate probe requests
   and replies contain identifying information that is useful for
   defending against off-path attacks.

   Security considerations for specific IP/*/IP tunneling mechanisms are



Templin                   Expires April 1, 2008                 [Page 7]

Internet-Draft           T/PLPMTUD  for Tunnels           September 2007


   given in the respective documents.


12.  Acknowledgments

   This work has benefited from discussions with Fred Baker, Iljitsch
   van Beijnum, Brian Carpenter, Steve Casner, Remi Denis-Courmont,
   Aurnaud Ebalard, Gorry Fairhurst, John Heffner, Christian Huitema,
   Joe Macker, Matt Mathis, Joe Touch and James Woodyatt.


13.  References

13.1.  Normative References

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              September 1981.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              November 1990.

   [RFC1981]  McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
              for IP version 6", RFC 1981, August 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

13.2.  Informative References

   [RFC2923]  Lahey, K., "TCP Problems with Path MTU Discovery",
              RFC 2923, September 2000.

   [RFC3819]  Karn, P., Bormann, C., Fairhurst, G., Grossman, D.,
              Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
              Wood, "Advice for Internet Subnetwork Designers", BCP 89,
              RFC 3819, July 2004.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
              Message Protocol (ICMPv6) for the Internet Protocol
              Version 6 (IPv6) Specification", RFC 4443, March 2006.

   [RFC4459]  Savola, P., "MTU and Fragmentation Issues with In-the-
              Network Tunneling", RFC 4459, April 2006.

   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU



Templin                   Expires April 1, 2008                 [Page 8]

Internet-Draft           T/PLPMTUD  for Tunnels           September 2007


              Discovery", RFC 4821, March 2007.

   [RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
              Errors at High Data Rates", RFC 4963, July 2007.


Appendix A.  Discussion

   Probing strategies for packetization layer protocols are specified in
   ([RFC4821], Section 7) and apply also to the TE's 'MaxPktSize'
   probing process.

   Further strategies for handling PTB errors are specified in
   ([RFC4821], Section 7) and apply also to the TE's 'MaxPktSize'
   probing process.

   Note that decapsulation automatically erases any padding that may
   have been inserted by the TE along with the trailing checksum.


Author's Address

   Fred L. Templin (editor)
   Boeing Phantom Works
   P.O. Box 3707
   Seattle, WA  98124
   USA

   Email: fred.l.templin@boeing.com






















Templin                   Expires April 1, 2008                 [Page 9]

Internet-Draft           T/PLPMTUD  for Tunnels           September 2007


Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).





Templin                   Expires April 1, 2008                [Page 10]