Network Working Group                                    F. Templin, Ed.
Internet-Draft                                      Boeing Phantom Works
Intended status: Informational                        September 29, 2007
Expires: April 1, 2008


   Trivial Packetization Layer Path MTU Discovery for IP/*/IP Tunnels
                  draft-templin-inetmtu-trivial-01.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 1, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   The nominal Maximum Transmission Unit (MTU) of the Internet has
   become 1500 bytes, but existing IP/*/IP tunneling mechanisms impose
   an encapsulation overhead that can reduce the effective path MTU to
   smaller values.  Additionally, existing tunneling mechanisms are
   limited in their ability to discover and utilize larger MTUs.  This
   document specifies a trivial mechanism for tunnel MTU determination
   that addresses these issues.

Templin                   Expires April 1, 2008                 [Page 1]

Internet-Draft            T/PLPMTUD for Tunnels           September 2007

1.  Introduction

   The nominal Maximum Transmission Unit (MTU) of today's Internet has
   become 1500 bytes due to the preponderance of networking gear that
   configures an MTU of that size.  Since not all links in the Internet
   configure a 1500 byte MTU, however, packets can be dropped due to an
   MTU restriction on the path.

   Upper layers see IP/*/IP tunnels as ordinary links, but even for
   packets no larger than 1500 bytes these links are susceptible to
   silent loss (e.g., due to path MTU restrictions, lost error messages,
   layered encapsulations, or reassembly buffer limitations), resulting
   in poor performance and/or communications failures
   [RFC2923][RFC4459][RFC4821][RFC4963].

   This document specifies a trivial mechanism for IP/*/IP tunnel MTU
   determination.  It updates the functional specifications for Tunnel
   Endpoints (TEs) found in existing tunneling mechanisms (see:
   Section 9).

2.  Terminology

   The following abbreviations and terms are used in this document:

   IPv4 - Internet Protocol, Version 4 [RFC0791]

   IPv6 - Internet Protocol, Version 6 [RFC2460]

   IP - Internet Protocol, either Version 4 or Version 6

   DF - the IPv4 header "Don't Fragment" flag

   ENCAPS - the size of the encapsulating */IP headers plus trailers

   MaxPktSize - Maximum Outer Packet size, in bytes

   MTU - Maximum Transmission Unit

   PTB - Packet Too Big error (either ICMPv4 [RFC1191] or ICMPv6
   [RFC1981])

   TE - Tunnel Endpoint

   TFE - Tunnel Far End

   TNE - Tunnel Near End

   IP/*/IP - an IP packet encapsulated in */IP headers (e.g., for "*" =
   NULL, UDP, TCP, AH, ESP)

   inner packet/header/payload - an IP packet/header/payload before
   IP/*/IP encapsulation.

   outer packet/header/payload - a */IP packet/header/payload after
   IP/*/IP encapsulation.

   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
   SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
   document, are to be interpreted as described in [RFC2119].

3.  Concept of Operation

   TEs that implement this scheme engage in a continuous probing process
   while data is flowing through the tunnel to determine the maximum
   outer packet size ('MaxPktSize').  When the flow of data through the
   tunnel is suspended, the probing process is discontinued.

   Only the TNE need implement the scheme; deployment on the TNE can
   therefore occur independently of deployment on the TFE.

4.  MTU

   TEs configure an indefinite MTU on the tunnel interface, i.e., there
   is no logical limit on the size of inner packets that upper layers
   can present to the tunnel interface.

5.  Tunnel Soft State

   TEs maintain the following conceptual variable as soft state:

   MaxPktSize
      the current maximum length outer packet/fragment that is known to
      traverse the tunnel.  (See: [RFC3819], Section 2 for subnetwork
      MTU recommendations that influence 'MaxPktSize'.)  Recommended
      default value: 128 bytes for */IPv4 tunnels, 1280 bytes for
      */IPv6 tunnels.

6.  Sending Packets

   TEs send packets into the tunnel according to the following
   specification.  Note that this specification expects that original
   sources that send unfragmentable packets larger than 1500 bytes will
   use end-to-end MTU determination per [RFC4821].

6.1.  Conceptual Sending Algorithm

   With reference to Sections 6.2 - 6.4, TEs use the following
   conceptual sending algorithm:

   if inner packet is larger than ('MaxPktSize' - ENCAPS)
     and inner packet is not a probe
       if inner packet is not fragmentable (see: Section 6.2)
           if inner packet is larger than 1500 bytes
               Encapsulate as an outer packet (see: Section 6.3) and
               set DF=1 in the outer IPv4 header (for */IPv4 tunnels -
               see: Section 6.4).
               Send packet and return.
           else
               Send PTB appropriate to the inner protocol with
               MTU = ('MaxPktSize' - ENCAPS).
               Drop packet and return.
           endif
       else
           Fragment inner packet into fragments no larger than
           ('MaxPktSize' - ENCAPS) (see: Section 6.2).
       endif
   endif

   foreach inner packet/fragment
       Encapsulate as an outer packet (see: Section 6.3) and set DF=1
       in the outer IPv4 header (for */IPv4 tunnels - see:
       Section 6.4).
       Send packet/fragment.
   endforeach

                Figure 1: Conceptual Sending Algorithm

6.2.  Inner Packet Fragmentation

   An inner packet is "fragmentable" if and only if the TE is permitted
   to break it into inner fragments before encapsulation, e.g., an IPv4
   packet with DF=0 or an IPv6 packet with a fragment header.  Per the
   conceptual sending algorithm, the TE uses the IP fragmentation
   mechanism of the inner protocol to fragment inner packets into
   fragments no larger than ('MaxPktSize' - ENCAPS).

   Note that for IPv6/*/IPv4 tunnels, ('MaxPktSize' - ENCAPS) may not be
   large enough to accommodate the minimum IPv6 MTU, so the TE may be
   required to drop an inner IPv6 packet of 1280 bytes or smaller and
   return an ICMPv6 PTB with an MTU value less than 1280 bytes.  The
   original IPv6 source will then include a fragment header in
   subsequent IPv6 packets, and the TE can then perform IPv6
   fragmentation on these inner packets using the fragment header
   included by the source per [RFC2460], Section 5.

6.3.  Encapsulation

   TEs encapsulate inner IP packets according to the specific IP/*/IP
   tunneling specification.  During encapsulation the TE also appends a
   trailer consisting of zero or more padding bytes, e.g., to probe the
   tunnel path MTU.  The TE increments the innermost '*' header length
   field by the number of trailer bytes added; for example, it
   increments the UDP length field for IPv6/UDP/IPv4 tunnels, the IPv4
   length field for IPv6/IPv4 tunnels, and the outer IPv6 length field
   for IPv6/IPv6 tunnels.
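
   As a minimal sketch of the trailer arithmetic described above, the
   following Python fragment assumes an IPv6/UDP/IPv4 tunnel.  The
   function names 'trailer_len' and 'encapsulate_udp' are hypothetical,
   and the 'encaps' parameter is assumed to count only the outer header
   overhead, not the padding.  The UDP checksum is left at zero (no
   checksum, which UDP over IPv4 permits) to keep the sketch short; a
   real TE would follow its own tunneling specification.

```python
import struct

UDP_HEADER_LEN = 8  # source port, destination port, length, checksum


def trailer_len(inner_len: int, encaps: int, probe_size: int) -> int:
    """Padding bytes needed so the outer packet reaches probe_size.

    encaps is assumed to be the outer header overhead only (hypothetical
    convention for this sketch). Never returns a negative value.
    """
    return max(0, probe_size - encaps - inner_len)


def encapsulate_udp(inner: bytes, src_port: int, dst_port: int,
                    pad: int) -> bytes:
    """Build a UDP datagram carrying inner followed by pad zero bytes.

    The UDP length field covers header + inner packet + trailer, i.e.
    the unpadded length incremented by the number of trailer bytes.
    The checksum field is set to 0 (assumption: checksum disabled).
    """
    length = UDP_HEADER_LEN + len(inner) + pad
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    return header + inner + b"\x00" * pad
```

   For example, a 64-byte inner probe with 28 bytes of UDP/IPv4 headers
   and a desired probe size of 1500 bytes needs 1408 trailer bytes,
   which yields a UDP length field of 1480.
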
   The encapsulation is shown in Figure 2:

                              +---------------------------------+
                              |         Outer IP Header         |
                              |                                 |
                              +---------------------------------+
                              |            * Headers            |
                              |                                 |
   +-------------+            +---------------------------------+
   |   Inner IP  |            |            Inner IP             |
   ~    packet   ~    ===>    ~             packet              ~
   |             |            |                                 |
   +-------------+            +---------------------------------+
    Inner Packet              |                                 |
                              ~    Trailer (0 or more bytes)    ~
                              |                                 |
                              +---------------------------------+
                              |   Any */IP protocol trailers  ...
                              +------------------------------
                                         Outer Packet

              Figure 2: Encapsulation Format with Trailer

   Note that decapsulation at the TFE automatically erases the trailer.

6.4.  Setting DF in the Outer IPv4 Header

   For IP/*/IPv4 tunnels, TEs set DF=1 in the outer IPv4 header of all
   tunneled packets.

7.  Receiving Packet Too Big (PTB) Errors

   TEs may receive PTB errors in response to any outer packets that were
   admitted into the tunnel.  When the TE receives a PTB with an MTU
   value smaller than 'MaxPktSize', it SHOULD reduce 'MaxPktSize' and/or
   actively probe to discover and confirm a new 'MaxPktSize'.  It SHOULD
   also send a translated PTB back to the inner source if possible.
   Further strategies for handling PTB errors are specified in
   [RFC4821], Section 7.

8.  Path MTU Probing

   TEs engage in a probing process while data is actively flowing
   through the tunnel to determine 'MaxPktSize'.  The TE SHOULD send
   progressively larger probes to determine the largest possible
   'MaxPktSize' value early in the probing process and SHOULD continue
   probing periodically thereafter.  When the TE receives sufficient
   evidence through probing that the forward path supports the probed
   size, it advances 'MaxPktSize' to the probed size.  Additional
   probing considerations are specified in [RFC4821], Section 7.
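
   The 'MaxPktSize' updates of Sections 7 and 8 can be sketched as
   follows, under stated assumptions.  The class 'TunnelState', its
   method names, and the ladder of candidate probe sizes are all
   hypothetical; this document does not prescribe a probe size
   schedule, nor what counts as sufficient evidence, and validation of
   received PTB messages is outside the scope of the sketch.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical ladder of probe sizes; not prescribed by this document.
CANDIDATE_SIZES = (1280, 1400, 1500, 4352, 9000)


@dataclass
class TunnelState:
    """Soft state for one tunnel (illustrative only)."""
    max_pkt_size: int  # the 'MaxPktSize' variable of Section 5

    def next_probe_size(self) -> Optional[int]:
        """Smallest candidate above the current MaxPktSize, or None."""
        for size in CANDIDATE_SIZES:
            if size > self.max_pkt_size:
                return size
        return None

    def on_probe_confirmed(self, probed_size: int) -> None:
        """Advance MaxPktSize once a probe of probed_size is confirmed."""
        if probed_size > self.max_pkt_size:
            self.max_pkt_size = probed_size

    def on_ptb(self, reported_mtu: int) -> None:
        """Reduce MaxPktSize when a PTB reports a smaller MTU."""
        if reported_mtu < self.max_pkt_size:
            self.max_pkt_size = reported_mtu
```

   A TE using such state would call 'next_probe_size' while data is
   flowing, send a padded probe of that size, and call
   'on_probe_confirmed' only after a matching probe reply arrives.
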

   TEs generate probe requests by creating a minimum-sized and
   unfragmentable echo request packet according to the inner IP protocol
   (e.g., an ICMPv6 [RFC4443] echo request when the inner protocol is
   IPv6).  The echo request MUST include a source address that
   corresponds to the TNE and a destination address that corresponds to
   the TFE.  The echo request SHOULD also include identifying
   information (e.g., sequence/identification numbers or nonce values)
   that will be echoed in the reply.  The TE then encapsulates the echo
   request with trailing padding added to create an outer probe request
   of the desired probe size and sends it into the tunnel as specified
   in Section 6.

   The TE SHOULD probe frequently enough to refresh 'MaxPktSize' but
   SHOULD limit the rate at which it sends probe requests.  The TE
   SHOULD retain a cache of recently-sent probe requests to verify
   subsequent probe replies.

   When a TE receives a potential probe reply, it first determines
   whether the inner IP echo reply matches one of its echo requests by
   examining the inner IP source and destination addresses as well as
   any other identifying information in the inner packet.  If the echo
   reply matches one of the TE's echo requests, the TE uses the probe
   reply for 'MaxPktSize' determination.

   TEs discontinue the probing process and garbage-collect stale soft
   state for dormant tunnels.

9.  Updated Specifications

   This document updates the following specifications:

   o  RFC2003 (IP-in-IP)

   o  RFC2473 (Generic packet tunneling in IPv6)

   o  RFC2529 (6over4)

   o  RFC2661 (L2TP)

   o  RFC2784 (GRE)

   o  RFC3056 (6to4)

   o  RFC3378 (ETHERIP)

   o  RFC3884 (IPsec Transport Mode for Dynamic Routing)

   o  RFC4023 (MPLS-in-IP)

   o  RFC4213 (Basic IPv6 Transition Mechanisms)

   o  RFC4214 (ISATAP)

   o  RFC4301 (IPsec)

   o  RFC4302 (AH)

   o  RFC4303 (ESP)

   o  RFC4380 (TEREDO)

   o  LISP

   o  others....

10.  IANA Considerations

   There are no IANA considerations for this document.

11.  Security Considerations

   A possible attack vector involves an off-path attacker sending probe
   requests with spoofed source addresses.  Legitimate probe requests
   and replies contain identifying information that is useful for
   defending against off-path attacks.

   Security considerations for specific IP/*/IP tunneling mechanisms are
   given in the respective documents.

12.  Acknowledgments

   This work has benefited from discussions with Fred Baker, Iljitsch
   van Beijnum, Brian Carpenter, Steve Casner, Remi Denis-Courmont,
   Aurnaud Ebalard, Gorry Fairhurst, John Heffner, Christian Huitema,
   Joe Macker, Matt Mathis, Joe Touch and James Woodyatt.

13.  References

13.1.  Normative References

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              September 1981.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              November 1990.

   [RFC1981]  McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
              for IP version 6", RFC 1981, August 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

13.2.  Informative References

   [RFC2923]  Lahey, K., "TCP Problems with Path MTU Discovery",
              RFC 2923, September 2000.

   [RFC3819]  Karn, P., Bormann, C., Fairhurst, G., Grossman, D.,
              Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
              Wood, "Advice for Internet Subnetwork Designers", BCP 89,
              RFC 3819, July 2004.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
              Message Protocol (ICMPv6) for the Internet Protocol
              Version 6 (IPv6) Specification", RFC 4443, March 2006.

   [RFC4459]  Savola, P., "MTU and Fragmentation Issues with
              In-the-Network Tunneling", RFC 4459, April 2006.

   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
              Discovery", RFC 4821, March 2007.

   [RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
              Errors at High Data Rates", RFC 4963, July 2007.

Author's Address

   Fred L. Templin (editor)
   Boeing Phantom Works
   P.O. Box 3707
   Seattle, WA  98124
   USA

   Email: fred.l.templin@boeing.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).