Network Working Group                                         V. Kashyap
INTERNET DRAFT                                  Sequent Computer Systems
Expiration date: 9 August 1998                                9 Feb 1998


               Modification in Datagram Too Big message


Status of this Memo

   This document is an Internet Draft. Internet Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet Drafts.

   Internet Drafts are draft documents valid for a maximum of six
   months, and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet Drafts as reference
   material or to cite them other than as ``work in progress''.

   To learn the current status of any Internet Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet Drafts shadow
   directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

   This memo provides information for the Internet community. This memo
   does not specify an Internet standard of any kind. Distribution of
   this memo is unlimited.

Abstract

   This memo describes a small modification to the 'Datagram Too Big'
   message for both the IPv4 (ICMPv4) and IPv6 (ICMPv6) standards. It
   addresses a possible reduction in the resources used on large servers
   that implement the Path MTU discovery process.

Table of Contents

   1. Introduction
   2. Changes to Datagram Too Big message
      2.1 IPv6 Datagram Too Big message
      2.2 IPv4 Datagram Too Big message
   3. Router specification
   4. Impact on host implementation of Path MTU Discovery
   5. Security Considerations
   6. References
   7. Author's Address

1. Introduction

   When one IP host has a large amount of data to send to another host,
   the data is transmitted as a series of IP datagrams. It is usually
   preferable that these datagrams be of the largest size that does not
   require fragmentation anywhere along the path from the source to the
   destination.
   This datagram size is referred to as the Path MTU (PMTU), and it is
   equal to the minimum of the MTUs of each hop in the path. A host does
   not know the PMTU of a path in advance; this shortcoming is overcome
   by the Path MTU discovery process outlined in [1] and [2].

   The Datagram Too Big message is defined in [1] and [2]. With the
   current specification of Datagram Too Big, the source host learns
   only that there is a bottleneck somewhere in the path. It cannot
   aggregate this information to share it with other connections (unless
   they are to the same destination). Thus the source host has to cache
   the path information on a per-host basis.

   Any representation of the path may be used, but in all the current
   implementations of Path MTU discovery that I am aware of, the path
   information is kept as a routing table entry. The receipt of a
   Datagram Too Big message causes a routing table entry to be created
   for the destination host. Any reduction in the size of this table
   reduces the resources used to store this information and to search
   through the large routing table.

   The suggestion in this document is intended to attain the following
   advantages:

   . Aggregation of paths having the same PMTU.

   . Reduction in the resources used to store the Path MTU.

   . Faster MTU updates for all connections on a host, rather than each
     connection discovering the MTU independently.

   . On deletion of a network route on the host, each of the derived
     routes/paths has to be deleted too. With fewer such cache entries
     this takes less time, and simpler algorithms can be used.

   . Utilities, e.g. netstat, need to list a smaller set of routes.

   . Routing protocols need to exchange smaller tables and/or need not
     weed through a large set of derived routes.

2. Changes to Datagram Too Big message

2.1 IPv6 'Datagram too Big'

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Ver  |  Pri  |                  Flow Label                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                        Source Address                         |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                      Destination Address                      |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Type: 2    |   mask code   |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Maximum Transmission Unit Size                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The mask translates as:

      prefix mask = -1 << mask code

   That is, the mask code is the number of low-order address bits that
   the mask clears. With the addresses being 128 bits long, the code
   required is 8 bits long, which fits in the unused ICMP code octet.

   The route/path for the Path MTU process is identified by the final
   destination address ANDed with the prefix mask. The path/route may
   also include the flow id [3], but that does not affect this
   discussion; it is simply another component of the path
   identification.

   A mask code of 0 implies a mask of all 1s, i.e. a host route. This is
   exactly the same behaviour as the current definition.

   A value of 128 implies that the router used its default route. An
   indication of the default route does not provide any useful
   information, though. It can be considered equivalent to the host
   route case.

   If the ICMP message is received in response to a datagram sent to a
   multicast address, the prefix mask information may not be useful.
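
   The mask arithmetic of section 2.1 can be illustrated with a short
   sketch. It is not part of the proposal: the function names
   prefix_mask and path_key are hypothetical, and Python's standard
   ipaddress module is used only for convenience. Because Python
   integers are unbounded, "-1 << mask code" is emulated by shifting an
   all-ones value and truncating it to the address width.

```python
import ipaddress


def prefix_mask(mask_code: int, width: int = 128) -> int:
    """Return (-1 << mask_code) truncated to `width` bits."""
    if not 0 <= mask_code <= width:
        raise ValueError("mask code out of range")
    all_ones = (1 << width) - 1
    return (all_ones << mask_code) & all_ones


def path_key(destination: str, mask_code: int) -> ipaddress.IPv6Network:
    """Hypothetical helper: final destination ANDed with the prefix mask."""
    # Section 2.1: a default-route indication (128) carries no useful
    # information and can be treated like the host route case (0).
    if mask_code == 128:
        mask_code = 0
    address = int(ipaddress.IPv6Address(destination))
    network = address & prefix_mask(mask_code)
    # The prefix length is the address width minus the cleared bits.
    return ipaddress.IPv6Network((network, 128 - mask_code))


if __name__ == "__main__":
    # 2001:db8::/32 is the IPv6 documentation prefix, used here as an example.
    print(path_key("2001:db8:1:2::5", 64))  # entry shared by the whole subnet
    print(path_key("2001:db8:1:2::5", 0))   # host route, as in current practice
```

   Under these assumptions, two destinations behind the same router
   prefix map to the same key, which is what allows a single PMTU entry
   to serve both.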
2.2 IPv4 'Datagram too Big'

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Type = 3    |   Code = 4    |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           mask code           |         Next-Hop MTU          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Internet Header + 64 bits of Original Datagram Data      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The mask code is defined as follows:

      Code    Mask               Comment
      ----    ----               -------
      0       255.255.255.255    host route (default)
      32      0.0.0.0            default route
      31      128.0.0.0
      .
      .
      .
      8       255.255.255.0
      .
      .
      .
      1       255.255.255.254

                       table 1

   The above table is determined by:

      mask = -1 << mask code

   The route/path MTU cache entry is identified by the final destination
   ANDed with the mask.

3. Router specification

   The router, in addition to returning the next hop MTU (as is done
   now), returns the mask code. The mask code, as per table 1, informs
   the end host of the net mask used by the router in determining the
   path to the final destination.

   IPv4

      Currently routers return 0 in the 16 bits used by the mask code.
      Hence a message from a router implementing the existing ICMP
      Datagram Too Big message will be interpreted exactly as now, i.e.
      the host creates a host specific route (or Path MTU cache entry).

   IPv6

      Currently the 8 bits of the ICMP code are unused and sent as 0.
      Hence a message from a router implementing the existing ICMP
      Datagram Too Big message will be interpreted exactly as now, i.e.
      the host creates a host specific route (or Path MTU cache entry).

4. Impact on host implementation of Path MTU Discovery

   When a host receives a Datagram Too Big message for a connection, it
   has no way of knowing whether the subnet the peer belongs to is
   behind the bottleneck. As a result the host is forced to create a
   path specifically for the peer. This information cannot be shared
   with another connection to a different host, even one that may be on
   the same subnetwork and behind the same bottleneck.
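
   As a minimal sketch, under stated assumptions, of how a host might
   read the modified IPv4 message of section 2.2 and derive the key
   under which the PMTU is cached: the function names are hypothetical,
   the ICMP checksum is not verified, and treating the default-route
   code (32) like a host route follows the suggestion made for IPv6 in
   section 2.1.

```python
import ipaddress
import struct


def parse_datagram_too_big(icmp: bytes) -> tuple[int, int]:
    """Return (mask_code, next_hop_mtu) from the first 8 ICMP octets."""
    # Layout from section 2.2: type, code, checksum, mask code, next-hop MTU.
    icmp_type, code, _checksum, mask_code, mtu = struct.unpack("!BBHHH", icmp[:8])
    if (icmp_type, code) != (3, 4):
        raise ValueError("not a Datagram Too Big message")
    return mask_code, mtu


def cache_key(destination: str, mask_code: int) -> ipaddress.IPv4Network:
    """Hypothetical helper: final destination ANDed with the mask."""
    if not 0 <= mask_code <= 32:
        raise ValueError("mask code out of range")
    # Assumption: a default-route indication is handled as a host route.
    if mask_code == 32:
        mask_code = 0
    mask = (0xFFFFFFFF << mask_code) & 0xFFFFFFFF  # -1 << mask code, 32 bits
    network = int(ipaddress.IPv4Address(destination)) & mask
    return ipaddress.IPv4Network((network, 32 - mask_code))


if __name__ == "__main__":
    # Example message: mask code 8, next-hop MTU 1400, checksum left as 0.
    message = struct.pack("!BBHHH", 3, 4, 0, 8, 1400)
    mask_code, mtu = parse_datagram_too_big(message)
    # 192.0.2.0/24 is an IPv4 documentation prefix, used here as an example.
    print(cache_key("192.0.2.77", mask_code), mtu)
```

   A message from an existing router carries 0 in the mask code field,
   so this sketch falls back to a /32 host entry in that case.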
   With the modification suggested in section 2, such sharing of
   information becomes possible.

4.1 Case 1

   Consider two different connections, from endpoint Ca to Cx and from
   Cb to Cy.

   +----+      +------+  M1  +------+      +------+     +----+
   | Ca |----->|  Ra  |----->|  Rb  |----->|  Rc  |---->| Cx |
   | Cb |      |      |      |      |----->|      |--+  +____+
   +____+      +------+      +------+      +------+  |
                                                     |
                                                     |
                                                     V
                                                  +------+
                                                  |  Cy  |
                                                  +------+

                              fig 1

   The first MTU reduction occurs on the path from Ra to Rb. Instead of
   creating a separate per-path route for each of Ca and Cb, the host
   may have both connections use the same route (or any other cache
   used to store the PMTU). Currently the host has to create two
   entries, one for each connection.

4.2 Case 2

   A routing change occurs such that the PMTU for Cb is M2.

   +----+      +------+  M1  +------+      +------+     +----+
   | Ca |----->|  Ra  |----->|  Rb  |----->|  Rc  |---->| Cx |
   | Cb |      |      |      |      |      |      |--+  +____+
   +____+      +------+      +------+      +------+  |
                                                     |
                                                     |M2
                                                     V
                                                  +------+
                                                  |  Cy  |
                                                  +------+

                              fig 2

   A Datagram Too Big message will be received for the connection Cb to
   Cy. The host can use the information in the 'mask code' to create a
   more specific route.

   The above two cases can be extended to 1000s of connections on the
   two paths considered.

4.3 Case 3

   +----+      +------+      +------+      +------+ M1  +----+
   | Ca |----->|  Ra  |----->|  Rb  |----->|  Rc  |---->| Cx |
   | Cb |      |      |      |      |      |      |--+  +____+
   +____+      +------+      +------+      +------+  |
                                                     |
                                                     |
     Ha                                              V
                                                  +------+
                                                  |  Cy  |
                                                  +------+
                                                     Hy

                              fig 3

   The route to Cx from Rc is determined by the route x.y.255.255, but
   the path to Cy is determined by x.y.z.255 (considering an IPv4
   internetwork). The PMTU to Cx is determined by M1 at Rc, but the PMTU
   to Cy is the same as the first hop MTU all the way from the host Ha.

   If the connection Ca to Cx is made first, Ha will have an entry
   corresponding to the path from Ha to the network x.y.255.255. This
   will cause the connection to Cy to use the same information as
   determined for the connection to Cx.
   A similar situation can occur if the topology changes from that of
   fig 1 to that of fig 3 while the connections Ca-Cx and Cb-Cy are
   active.

   Since the information is shared between the connections, it is
   possible that at the 10 minute interval (as suggested in [1] and [2])
   host Ha may fail to determine the increased Path MTU to Hy. This
   problem can be avoided if the end hosts implement the following
   policy:

   . On a new connection, use a non-PMTU discovered route/path.

   . At every probe time (if the host has data to send), use the
     original route.

   This will cause a rediscovery of the paths.

4.4 Case 4

   If the Datagram Too Big message returns the code indicating that the
   router used the default route, it may be taken as equivalent to the
   indication of the use of a host route.

   If the information returned indicates a more general route than the
   route that was used by the host, then the information must be
   discarded and the mask must be considered to be a host route mask.
   The local routing information is received using a routing protocol
   or set up by an administrator, and must not be overridden.

5. Security considerations

   If the Datagram Too Big message returns a more general route than
   was used by the host, the indication is taken as equivalent to the
   host route mask. This prevents the host from being fed faulty
   network information.

   The host may, however, be sent Datagram Too Big messages indicating
   the default route. The end host will end up creating host routes
   instead of subnet routes. This is no different from what happens now.

   A code that indicates a more precise route does not have any effect
   on the flow of data or on the path MTU information related to the
   path.

6. References

   [1] Mogul, J., and S. Deering, "Path MTU Discovery", RFC 1191,
       November 1990.

   [2] McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery for
       IP version 6", RFC 1981, August 1996.

   [3] Deering, S., and R. Hinden, "Internet Protocol, Version 6 (IPv6)
       Specification", RFC 1883, December 1995.

   [4] Conta, A., and S. Deering, "Internet Control Message Protocol
       (ICMPv6) for the Internet Protocol Version 6 (IPv6)
       Specification", RFC 1885, December 1995.

7. Author's Address

   Vivek Kashyap
   Sequent Computer Systems, Inc.
   15450, SW Koll Parkway
   Beaverton, OR 97006

   ph. 503 - 578 3422
   email: viv@sequent.com