Behavior Engineering for Hindrance                        I. van Beijnum
Avoidance                                                 IMDEA Networks
Internet-Draft                                              May 21, 2009
Expires: November 22, 2009


              IPv6-to-IPv4 translation fragmentation issue
                   draft-van-beijnum-behave-frag64-00

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on November 22, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This draft outlines a way to handle IPv4 MTUs smaller than 1280 bytes
   and a way to handle the identification field in fragmentation that
   are different from the ones specified in [RFC2460] for the purposes


van Beijnum             Expires November 22, 2009               [Page 1]

Internet-Draft             NAT64 fragmentation                  May 2009


   of discussion in the BEHAVE working group.


1.  Introduction

   [RFC2460] specifies that links carrying IPv6 packets must support an
   MTU of at least 1280 bytes and that IPv6 hosts may omit path MTU
   discovery if they limit themselves to sending packets no larger than
   1280 bytes.  It also contains the following text:

      In response to an IPv6 packet that is sent to an IPv4 destination
      (i.e., a packet that undergoes translation from IPv6 to IPv4), the
      originating IPv6 node may receive an ICMP Packet Too Big message
      reporting a Next-Hop MTU less than 1280.  In that case, the IPv6
      node is not required to reduce the size of subsequent packets to
      less than 1280, but must include a Fragment header in those
      packets so that the IPv6-to-IPv4 translating router can obtain a
      suitable Identification value to use in resulting IPv4 fragments.
      Note that this means the payload may have to be reduced to 1232
      octets (1280 minus 40 for the IPv6 header and 8 for the Fragment
      header), and smaller still if additional extension headers are
      used.

   This suggests that an IPv6-IPv4 translator can simply translate each
   ICMP "too big" message it receives from the IPv4 side into a
   corresponding ICMPv6 "too big" message directed at the IPv6 host that
   originated the packet that triggered it.

   Should the MTU contained in the "too big" message be smaller than
   1280 bytes, the IPv6 host will start including a fragment header,
   which tells the translator to set the DF bit to 0 after translation
   and provides the translator with an identification value that it can
   include in the IPv4 packet (in truncated form).

   [RFC2765] specifies setting DF to 1 in all other translated packets
   to allow for path MTU discovery.
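
   The header-driven behavior described above can be sketched as
   follows.  This is a minimal illustration only, not code taken from
   [RFC2460] or [RFC2765]; the function name and its argument are
   hypothetical.

```python
from typing import Optional, Tuple


def df_and_ident_from_fragment_header(
    fragment_ident: Optional[int],
) -> Tuple[int, Optional[int]]:
    """Return (DF bit, IPv4 identification) for one translated packet.

    fragment_ident is the 32-bit identification from the IPv6 fragment
    header, or None when the packet carries no fragment header.
    """
    if fragment_ident is None:
        # No fragment header: DF=1 so path MTU discovery keeps working.
        # Choosing an identification is left to the caller (None here).
        return 1, None
    # Fragment header present: DF=0, and the identification is truncated
    # to the 16 bits that fit in the IPv4 header.
    return 0, fragment_ident & 0xFFFF
```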


2.  Issues

   With stateful IPv6-to-IPv4 translation, using 16 bits from the
   identification in the IPv6 fragment header is suboptimal because the
   field must be unique within the time period that packets may be
   present in the network for a given protocol, source address,
   destination address tuple.  Because multiple IPv6 hosts will generate
   packets that share the same IPv4 source address, using the
   identification from the IPv6 host doesn't provide uniqueness.

   Should the translator use the presence of a fragment header as the
   factor that determines whether to set DF to 0 or to 1, the situation
   may arise where an IPv6 host doesn't perform path MTU discovery, and
   thus doesn't react to "too big" messages.  For normal IPv6 operation,
   this would be a legitimate way to operate if the host sends packets
   no bigger than 1280 bytes.  However, if the host doesn't get the "too
   big" messages or doesn't react to them by including the fragment
   header, it will send 1280-byte packets without the fragment header,
   which the translator will translate into 1260-byte packets (the
   40-byte IPv6 header is replaced by a 20-byte IPv4 header) with DF set
   to 1.  An IPv4 host with a path MTU smaller than 1260 will not be
   able to receive those packets.  This creates a path MTU discovery
   black hole that wouldn't exist in the pure IPv4 or pure IPv6 cases.
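
   The size arithmetic behind this black hole can be checked with a
   short sketch.  It assumes a 40-byte IPv6 header with no extension
   headers and a 20-byte IPv4 header with no options; the function names
   are hypothetical.

```python
IPV6_HEADER_LEN = 40  # fixed IPv6 header, in bytes
IPV4_HEADER_LEN = 20  # IPv4 header without options, in bytes


def translated_ipv4_length(ipv6_packet_len: int) -> int:
    """Length of the IPv4 packet produced from an IPv6 packet."""
    return ipv6_packet_len - IPV6_HEADER_LEN + IPV4_HEADER_LEN


def is_black_holed(ipv6_packet_len: int, ipv4_path_mtu: int, df: int) -> bool:
    """True when the translated packet cannot be delivered.

    Assumes the IPv6 sender never shrinks its packets or adds a fragment
    header in response to "too big" messages.
    """
    too_large = translated_ipv4_length(ipv6_packet_len) > ipv4_path_mtu
    # With DF=1, routers drop the packet instead of fragmenting it.
    return df == 1 and too_large
```

   With DF set to 0 the same packet would be fragmented on the IPv4
   path rather than dropped, which is why the choice of DF value
   matters for hosts that ignore "too big" messages.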

   Minor issues: the fragment header creates extra overhead, and
   including the fragment header exercises an otherwise unused code path
   that may be missing or buggy in implementations that haven't been
   extensively tested.


3.  Identification

   When a session consistently creates fragmented packets, and a
   fragment is lost, then eventually another fragment with the same
   identification value will arrive.  If by that time, the receiver is
   still waiting to reassemble the first packet, it will reassemble a
   packet from fragments belonging to different packets.  When this
   happens, there is an approximately 1 in 65535 chance that the TCP or
   UDP checksum will not catch the error, in which case undetected data
   corruption occurs.  This situation can be avoided by not reusing
   identification values within the reassembly timeout, or by having a
   very low loss probability.
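
   As a rough illustration of when reuse within the reassembly timeout
   becomes unavoidable, the sketch below computes how long a simple
   incrementing 16-bit counter takes to wrap at a given packet rate.
   The function names are hypothetical, and the timeout is a parameter
   supplied by the caller, not a value taken from this draft.

```python
IDENT_SPACE = 1 << 16  # number of values in the 16-bit IPv4 identification


def seconds_until_reuse(packets_per_second: float) -> float:
    """Time before an incrementing counter hands out the same value again."""
    return IDENT_SPACE / packets_per_second


def reuse_within_timeout(
    packets_per_second: float, reassembly_timeout_s: float
) -> bool:
    """True when a value is reused while a receiver may still be waiting."""
    return seconds_until_reuse(packets_per_second) < reassembly_timeout_s
```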

   In the stateful translation case, having the translator assign
   identification values with the highest achievable level of uniqueness
   is clearly preferable to copying the lower 16 bits of the IPv6 host's
   identification value, because this reduces the chances of
   identification value clashes within a short time.

   Ideally, a translator would maintain an identification counter for
   each protocol, source, destination address tuple so that
   identification values are only reused within the lifetime of packets
   when the number of packets per second becomes so large that this is
   unavoidable.  However, for reasons of implementation complexity, it
   may be necessary to have fewer identification counters.  Sharing
   identification counters across different protocol, source,
   destination tuples means that there is a likelihood that the same
   identification value is reused for the same tuple sooner, but there
   is still a good statistical likelihood that this won't happen.
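
   A per-tuple counter of the kind described above might look like the
   following sketch.  The class name and the use of address strings as
   keys are assumptions made for illustration; starting each counter at
   zero is a simplification.

```python
from collections import defaultdict
from typing import DefaultDict, Tuple

# (protocol number, IPv4 source address, IPv4 destination address)
FlowKey = Tuple[int, str, str]


class PerTupleIdentAllocator:
    """Keeps one 16-bit identification counter per flow tuple."""

    def __init__(self) -> None:
        self._next: DefaultDict[FlowKey, int] = defaultdict(int)

    def allocate(self, protocol: int, src: str, dst: str) -> int:
        """Return the next identification value for this tuple."""
        key = (protocol, src, dst)
        ident = self._next[key]
        # Wrap at 16 bits; a value repeats only after 65536 packets
        # on the same tuple.
        self._next[key] = (ident + 1) & 0xFFFF
        return ident
```

   Sharing counters, as the text notes, amounts to using a coarser key
   (for example the destination address only), which trades memory for
   earlier reuse on busy tuples.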

   Stateless translators could use the same strategy as stateful
   translators, but they may also use the strategy outlined in
   [RFC2460].

   In theory, IPv4 packets with DF set to 1 don't need a unique
   identification value.  However, it is not unheard of for operators to
   configure equipment to clear the DF bit, at which time an
   identification value with good uniqueness becomes necessary.  As
   such, it is recommended that translators include a unique
   identification value in all packets, including those with DF set to
   1.  However, since more packets will be sent with DF set to 1, this
   will use up identification values faster.  Implementations may choose
   to segment the identification space and assign values from non-
   overlapping pools to packets with DF set to 0 and DF set to 1 to
   provide a longer period of uniqueness to fragmentable packets.
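
   One way to segment the identification space is sketched below.  The
   class name and the default split point are assumptions; this draft
   does not prescribe how large each pool should be.

```python
class SplitIdentPools:
    """Hands out identification values from two non-overlapping pools.

    Values below the split point go to packets with DF=0; the rest go
    to packets with DF=1.
    """

    def __init__(self, fragmentable_pool_size: int = 0x8000) -> None:
        if not 0 < fragmentable_pool_size < 0x10000:
            raise ValueError("pool size must be between 1 and 65535")
        self._split = fragmentable_pool_size
        self._next_df0 = 0
        self._next_df1 = fragmentable_pool_size

    def allocate(self, df: int) -> int:
        """Return the next identification value for the given DF bit."""
        if df == 0:
            ident = self._next_df0
            # Wrap inside the DF=0 pool: 0 .. split-1.
            self._next_df0 = (ident + 1) % self._split
            return ident
        ident = self._next_df1
        # Wrap inside the DF=1 pool: split .. 0xFFFF.
        self._next_df1 = self._split if ident == 0xFFFF else ident + 1
        return ident
```

   Because packets with DF set to 1 never draw from the first pool, a
   high rate of such packets does not shorten the reuse interval for
   fragmentable packets.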


4.  Choices

   Considering the above, there are the following choices.  For a
   stateless translator:

   1.  Use the [RFC2460] behavior, with DF set to 0 when a fragment
       header is present and DF set to 1 otherwise.  Take the
       identification value from the IPv6 fragment header.

   2.  Use the [RFC2460] behavior, but set DF to 0 for packets equal to
       or smaller than 1280 bytes and set DF to 1 for packets larger
       than 1280 bytes.  Take the identification value from the IPv6
       fragment header if present; otherwise generate it in the
       translator.  This avoids the introduction of the path MTU
       discovery black hole but still conforms to [RFC2460].  However,
       there is a slight risk of overlapping identification values
       between ones from the fragment header and locally generated ones.

   3.  Rewrite translated "too big" messages in the IPv4-to-IPv6
       direction to an MTU of 1280 if the indicated MTU is smaller than
       1280, set DF to 0 for packets equal to or smaller than 1280 bytes
       and set DF to 1 for packets larger than 1280 bytes.  Generate
       identification values locally.  This avoids the black hole and
       the extra fragment header overhead and minimizes identification
       clashes.
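
   The DF decision under the three options can be compared with the
   sketch below.  It assumes the 1280-byte threshold applies to the
   IPv6 packet length before translation, and it follows the size rule
   literally for options 2 and 3; how a packet larger than 1280 bytes
   that carries a fragment header should be treated is not spelled out
   in the list above.  The function name is hypothetical.

```python
IPV6_MIN_MTU = 1280  # bytes


def df_bit(option: int, ipv6_packet_len: int, has_fragment_header: bool) -> int:
    """Return the DF bit for a translated packet under option 1, 2 or 3."""
    if option == 1:
        # Header-driven: DF=0 only when the host sent a fragment header.
        return 0 if has_fragment_header else 1
    if option in (2, 3):
        # Size-driven: packets up to 1280 bytes stay fragmentable, so a
        # host that ignores "too big" messages is not black-holed.
        return 0 if ipv6_packet_len <= IPV6_MIN_MTU else 1
    raise ValueError(f"unknown option: {option}")
```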

   For a stateful translator:


   1.  Use the [RFC2460] / [RFC2765] behavior, with DF set to 0 when a
       fragment header is present and DF set to 1 otherwise.  Take the
       identification value from the IPv6 fragment header.  This has a
       high risk of identification value clashes.

   2.  Use the [RFC2460] behavior, but set DF to 0 for packets equal to
       or smaller than 1280 bytes and set DF to 1 for packets larger
       than 1280 bytes.  Take the identification value from the IPv6
       fragment header if present; otherwise generate it in the
       translator.  This avoids the introduction of the path MTU
       discovery black hole but still conforms to [RFC2460].  This has a
       high risk of identification value clashes.

   3.  Rewrite translated "too big" messages in the IPv4-to-IPv6
       direction to an MTU of 1280 if the indicated MTU is smaller than
       1280, set DF to 0 for packets equal to or smaller than 1280 bytes
       and set DF to 1 for packets larger than 1280 bytes.  Generate
       identification values locally.  This avoids the black hole and
       the extra fragment header overhead and minimizes identification
       clashes.
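
   The "too big" rewriting in option 3 amounts to clamping the reported
   MTU, as in this sketch.  The function name is hypothetical, and the
   argument is taken to be whatever MTU the translated ICMPv6 message
   would otherwise carry; any header-size adjustment made during
   translation is outside the scope of the sketch.

```python
IPV6_MIN_MTU = 1280  # bytes


def rewritten_too_big_mtu(indicated_mtu: int) -> int:
    """MTU to report to the IPv6 host in a translated "too big" message."""
    # Never report less than 1280, so the host is not asked to add a
    # fragment header; smaller IPv4 MTUs are handled by sending with DF=0.
    return max(indicated_mtu, IPV6_MIN_MTU)
```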


5.  References

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

   [RFC2765]  Nordmark, E., "Stateless IP/ICMP Translation Algorithm
              (SIIT)", RFC 2765, February 2000.


Author's Address

   Iljitsch van Beijnum
   IMDEA Networks
   Avda. del Mar Mediterraneo, 22
   Leganes, Madrid  28918
   Spain

   Email: iljitsch@muada.com

