Network Working Group                                 F. L. Templin, Ed.
Internet-Draft                              Boeing Research & Technology
Updates: RFC2675, RFC9268 (if approved)                     6 April 2023
Intended status: Standards Track
Expires: 8 October 2023
IP Parcels and Advanced Jumbos
draft-templin-intarea-parcels-62
Abstract
IP packets (both IPv4 and IPv6) contain a single unit of transport
layer protocol data which becomes the retransmission unit in case of
loss. Transport layer protocols including the Transmission Control
Protocol (TCP) and reliable transport protocol users of the User
Datagram Protocol (UDP) prepare data units known as "segments", with
individual IP packets including only a single segment. This document
presents new constructs known as "IP Parcels" and "Advanced Jumbos".
IP parcels permit a single packet to carry multiple transport layer
protocol segments in a "packet-of-packets", while advanced jumbos
provide significant operational advantages over standard jumbograms
for carrying truly large singleton segments. IP parcels and advanced
jumbos provide essential building blocks for improved performance,
efficiency and integrity while encouraging larger Maximum
Transmission Units (MTUs) in the Internet.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 8 October 2023.
Copyright Notice
Copyright (c) 2023 IETF Trust and the persons identified as the
document authors. All rights reserved.
Templin Expires 8 October 2023 [Page 1]
Internet-Draft IP Parcels April 2023
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction
2. Terminology
3. Background and Motivation
4. IP Parcel Formation
4.1. TCP Parcels
4.2. UDP Parcels
4.3. Calculating J and K
5. Transmission of IP Parcels
5.1. Packetization over Non-Parcel Links
5.2. Parcellation over Parcel-capable Links
5.3. OMNI Interface Parcellation and Reunification
5.4. Final Destination Restoration/Reunification
5.5. Parcel/Jumbo Reports
5.6. Parcel Path Probing
5.7. Integrity
6. Advanced Jumbos
7. Minimal IP Parcels and Jumbograms
8. Implementation Status
9. IANA Considerations
10. Security Considerations
11. Acknowledgements
12. References
12.1. Normative References
12.2. Informative References
Appendix A. TCP Extensions for High Performance
Appendix B. Extreme L Value Implications
Appendix C. Additional Parcel/Jumbo Probe Considerations
Appendix D. IP Parcel and Advanced Jumbo Futures
Appendix E. Change Log
Author's Address
1. Introduction
IP packets (both IPv4 [RFC0791] and IPv6 [RFC8200]) contain a single
unit of transport layer protocol data which becomes the
retransmission unit in case of loss. Transport layer protocols such
as the Transmission Control Protocol (TCP) [RFC9293] and reliable
transport protocol users of the User Datagram Protocol (UDP)
[RFC0768] (including QUIC [RFC9000], LTP [RFC5326] and others)
prepare data units known as "segments", with individual IP packets
including only a single segment. This document presents a new
construct known as the "IP Parcel" which permits a single packet to
carry multiple transport layer protocol segments. This essentially
creates a "packet-of-packets" with the full {TCP,UDP}/IP headers
appearing only once but with possibly more than one segment.
Transport layer protocol entities form parcels by preparing a data
buffer (or buffer chain) beginning with an Integrity Block of at most
256 2-octet Checksums followed by their corresponding transport layer
protocol segments that can be broken out into individual packets and/
or smaller sub-parcels if necessary. All segments except the final
one must be equal in length and no larger than 65535 octets (minus
headers), while the final segment must not be larger than the others
but may be smaller. The transport layer protocol entity then
delivers the buffer(s), number of segments and non-final segment size
to the network layer which copies the buffer(s) into the body of a
parcel then includes a {TCP,UDP} header and an IP header plus
extensions that identify this as a parcel and not an ordinary packet.
The network layer then forwards each parcel over consecutive parcel-
capable links in a path until it arrives at a router with a next hop
link that does not support parcels, a parcel-capable link with a size
restriction, or an ingress middlebox Overlay Multilink Network (OMNI)
Interface [I-D.templin-intarea-omni] that spans intermediate
Internetworks using adaptation layer encapsulation and fragmentation.
In the first case, the original source or next hop router applies
packetization to break the parcel into individual IP packets. In the
second case, the source/router applies network layer parcellation to
form smaller sub-parcels. In the final case, the OMNI interface
applies adaptation layer parcellation to form smaller sub-parcels if
necessary then applies adaptation layer encapsulation and
fragmentation if necessary before forwarding.
These adaptation layer sub-parcels may then be reunified into one or
more larger sub-parcels by an egress middlebox OMNI interface which
either delivers them locally or forwards them over additional parcel-
capable links in the network path to the final destination. The
final destination can then apply network layer reunification (or
restoration) to concatenate elements of the same original parcel into
a single unit so as to present the largest possible number of
segments to the transport layer in a single system call. Reordering
and even loss or damage of individual segments within the network is
therefore possible, but what matters is that the parcels delivered to
the final destination's transport layer should be the largest
practical size for best performance and that loss or receipt of
individual segments (and not parcel size) determines the
retransmission unit.
This document further specifies an "advanced jumbo" service that
provides useful extensions beyond the "basic" IPv6 jumbogram service
defined in [RFC2675]. Advanced jumbos are defined for both IP
protocol versions and provide end systems and routers with a more
robust service when the transmission of truly large singleton
segments is necessary.
The following sections discuss the rationale for creating and shipping IP
parcels and advanced jumbos as well as the actual protocol constructs and
procedures involved. IP parcels and advanced jumbos provide
essential building blocks for improved performance, efficiency and
integrity while encouraging larger Maximum Transmission Units (MTUs).
These services will further inspire future innovation in
applications, transport protocols, operating systems, network
equipment and data links in ways that will transform the Internet
architecture.
2. Terminology
The Oxford Languages dictionary defines a "parcel" as "a thing or
collection of things wrapped in paper in order to be carried or sent
by mail". Indeed, there are many examples of parcel delivery
services worldwide that provide an essential transit backbone for
efficient business and consumer transactions.
In this same spirit, an "IP parcel" is simply a collection of at most
256 transport layer protocol segments wrapped in an efficient package
for transmission and delivery (i.e., a "packet-of-packets") while a
"singleton IP parcel" is simply a parcel that contains a single
segment. IP parcels are distinguished from ordinary packets and
jumbograms through the constructs specified in this document.
The IP parcel construct is defined for both IPv4 and IPv6. Where the
document refers to "IPv4 header length", it means the total length of
the base IPv4 header plus all included options, i.e., as determined
by consulting the Internet Header Length (IHL) field. Where the
document refers to "IPv6 header length", however, it means only the
length of the base IPv6 header (i.e., 40 octets), while the length of
any extension headers is referred to separately as the "IPv6
extension header length". Finally, the term "IP header plus
extensions" refers generically to an IPv4 header plus all included
options or an IPv6 header plus all included extension headers.
The term "advanced jumbo" refers to a new type of IP jumbogram
defined for both IP protocol versions and derived from "basic" IPv6
jumbograms as defined in [RFC2675]. Advanced jumbos include a 32-bit
Jumbo Payload Length field the same as for basic IPv6 jumbograms, but
are differentiated from parcels and other jumbogram types by
including the "Type" value '1' in the IP {Total, Payload} Length
field. Advanced jumbos can be in either minimal or expanded format,
with expanded format including additional Jumbo Payload option
control information.
Where the document refers to "{TCP,UDP} header length", it means the
length of either the TCP header plus options (20 or more octets) or
the UDP header (8 octets). It is important to note that only a
single IP header and a single full {TCP,UDP} header appear in each
parcel regardless of the number of segments included. This
distinction often provides significant savings in overhead made
possible only by IP parcels.
Where the document refers to checksum calculations, it means the
standard Internet checksum unless otherwise specified. The same as
for TCP [RFC9293], UDP [RFC0768] and IPv4 [RFC0791], the standard
Internet checksum is defined as (sic) "the 16-bit one's complement of
the one's complement sum of all (pseudo-)headers plus data, padded
with zero octets at the end (if necessary) to make a multiple of two
octets". A notional Internet checksum algorithm can be found in
[RFC1071], while practical implementations require special attention
to byte ordering "endianness" to ensure interoperability between
diverse architectures.
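As a non-normative illustration, the definition above can be sketched
in Python as follows. The function name internet_checksum is an
assumption made for this example only; octets are combined explicitly
in network byte order so that the result does not depend on host
endianness.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's complement of the one's complement sum."""
    if len(data) % 2:
        data += b"\x00"  # pad with a zero octet to a multiple of two octets
    total = 0
    for i in range(0, len(data), 2):
        # Build each 16-bit word in network (big-endian) order.
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits (end-around carry).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Sample octets taken from the numerical example in [RFC1071].
sample = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(sample)
```

A receiver can check integrity by running the same function over the
data followed by the transmitted checksum; an undamaged unit yields 0.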
The terms "application layer (L5 and higher)", "transport layer
(L4)", "network layer (L3)", "(data) link layer (L2)" and "physical
layer (L1)" are used consistently with common Internetworking
terminology, with the understanding that reliable delivery protocol
users of UDP are considered as transport layer elements. The OMNI
specification further defines an "adaptation layer" logically
positioned below the network layer but above the link layer (which
may include physical links and Internet- or higher-layer tunnels).
The adaptation layer is simply known as "the layer below L3 but above
L2" and is not itself assigned a layer number. A network interface
is a node's attachment to a link (via L2), and an OMNI interface is
therefore a node's attachment to an OMNI link (via the adaptation
layer).
The term "parcel/jumbo-capable link/path" refers to paths that
transit interfaces to adaptation and/or link layer media (either
physical or virtual) capable of transiting {TCP,UDP}/IP packets that
employ the parcel/jumbo constructs specified in this document. The
source and each router in the path have a "next hop link" that
forwards parcels/jumbos toward the final destination, while each
router and the final destination have a "previous hop link" that
accepts en route parcels/jumbos. Each next hop link must be capable
of forwarding parcels/jumbos (after first applying parcellation if
necessary) with segment lengths no larger than can transit the link.
Currently only the OMNI link satisfies these properties, but new and
existing link types are also encouraged to support parcels and
advanced jumbos.
The term "5-tuple" refers to a transport layer protocol entity
identifier that includes the network layer (source address,
destination address, source port, destination port, protocol number).
The term "3-tuple" refers to a network layer parcel entity identifier
that includes the adaptation layer (source address, destination
address, Parcel ID).
The term "Maximum Transmission Unit (MTU)" is widely understood in
Internetworking terminology to mean the largest packet size that can
transit a single link ("link MTU") or an entire path ("path MTU")
without requiring network layer IP fragmentation. If the MTU value
returned during parcel path qualification is larger than 65535 (plus
the length of the parcel headers), it determines the maximum parcel
size that can transit the link/path without requiring a router to
perform packetization/parcellation. If the MTU is 65535 or smaller,
the value instead determines the "Maximum Segment Size (MSS)" for the
leading portion of the path up to a router that cannot forward the
parcel further. (Note that this size may still be larger than the
MSS that can transit the remainder of the path to the final
destination, which can only be determined through additional
probing.)
The terms "packetization" and "restoration" refer to a network layer
process in which the original source or a router on the path breaks a
parcel out into individual IP packets that can transit the remainder
of the path without loss due to a size restriction. The final
destination then restores the combined packet contents into a parcel
before delivery to the transport layer. In current practice,
packetization/restoration are considered to be one and the same as
Generic Segmentation/Receive Offload (GSO/GRO).
The terms "parcellation" and "reunification" refer to either network
layer or adaptation layer processes in which the original source or a
router on the path breaks a parcel into smaller sub-parcels that can
transit the path without loss due to a size restriction. These sub-
parcels are then reunified into larger (sub-)parcels before delivery
to the transport layer. As a network layer process, the sub-parcels
resulting from parcellation may only be reunified at the final
destination. As an adaptation layer process, the resulting sub-
parcels may be first reunified at an adaptation layer egress node
then possibly further reunified by the network layer of the final
destination.
The parcel sizing variables "J", "K", "L" and "M" are cited
extensively throughout the document. "J" denotes the number of non-
final segments included in the parcel, "L" is the length of each non-
final segment, "K" is the length of the final segment and "M" is
termed the "Parcel Payload Length".
Automatic Extended Route Optimization (AERO)
[I-D.templin-intarea-aero] and the Overlay Multilink Network
Interface (OMNI) [I-D.templin-intarea-omni] provide an architectural
framework for transmission of IP parcels and advanced jumbos over one
or more concatenated Internetworks. AERO/OMNI will provide an
operational environment for IP parcels beginning from the earliest
deployment phases and extending indefinitely to accommodate
continuous future growth. As more and more parcel/jumbo-capable
links are deployed (e.g., in data centers, edge networks, space-
domain, and other high data rate services) AERO/OMNI will continue to
provide an essential service for Internetworking performance
maximization.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119][RFC8174] when, and only when, they appear in all
capitals, as shown here.
3. Background and Motivation
Studies have shown that applications can improve their performance by
sending and receiving larger packets due to reduced numbers of system
calls and interrupts as well as larger atomic data copies between
kernel and user space. Larger packets also result in reduced numbers
of network device interrupts and better network utilization (e.g.,
due to header overhead reduction) in comparison with smaller packets.
A first study [QUIC] involved performance enhancement of the QUIC
protocol [RFC9000] using the Linux Generic Segmentation/Receive Offload
(GSO/GRO) facility. GSO/GRO provides a robust service that has shown
significant performance increases based on a multi-segment transfer
capability between the operating system kernel and QUIC applications.
GSO/GRO performs fragmentation and reassembly at the transport layer
with the transport protocol segment size limited by the path MTU
(typically 1500 octets or smaller in today's Internet).
A second study [I-D.templin-dtn-ltpfrag] showed that GSO/GRO also
improves performance for the Licklider Transmission Protocol (LTP)
[RFC5326] used for the Delay Tolerant Networking (DTN) Bundle
Protocol [RFC9171] for segments larger than the actual path MTU
through the use of OMNI interface encapsulation and fragmentation.
Historically, the NFS protocol also saw significant performance
increases using larger (single-segment) UDP datagrams even when IP
fragmentation is invoked, and LTP still follows this profile today.
Moreover, LTP shows this (single-segment) performance increase
profile extending to the largest possible segment size which suggests
that additional performance gains are possible using (multi-segment)
IP parcels that approach or even exceed 65535 octets in total length.
TCP also benefits from larger packet sizes, and efforts have
investigated TCP performance using jumbograms internally with changes
to the Linux GSO/GRO facilities [BIG-TCP]. That approach proposes to
use the Jumbo Payload option internally and to allow GSO/GRO to use
buffer sizes larger than 65535 octets, but with the understanding
that links that support jumbograms natively are not yet widely
available. Hence, IP parcels provide a packaging that can be
considered in the near term under current deployment limitations.
A limiting consideration for sending large packets is that they are
often lost at links with MTU restrictions, and the resulting Packet
Too Big (PTB) message [RFC1191][RFC8201] may be lost somewhere in the
return path to the original source. This "Path MTU black hole"
condition can degrade performance unless robust path probing
techniques are used; however, the best-case performance always occurs
when loss of packets due to size restrictions is minimized.
These considerations therefore motivate a design where transport
protocols can employ segment sizes as large as 65535 octets (minus
headers), while parcels that carry multiple segments may themselves
be significantly larger. This would allow the receiving transport
layer protocol entity to process multiple segments in parallel
instead of one at a time per existing practices. Parcels therefore
support improvements in performance, integrity and efficiency for the
original source, final destination and networked path as a whole.
This is true even if the network and lower layers need to apply
packetization/restoration, parcellation/reunification and/or
fragmentation/reassembly.
An analogy: when a consumer orders 50 small items from a major online
retailer, the retailer does not ship the order in 50 separate small
boxes. Instead, the retailer packs as many of the small items as
possible into one or a few larger boxes (i.e., parcels) then places
the parcels on a semi-truck or airplane. The parcels may then pass
through one or more regional distribution centers where they may be
repackaged into different parcel configurations and forwarded further
until they are finally delivered to the consumer. But most often,
the consumer will only find one or a few parcels at their doorstep
and not 50 separate small boxes. This flexible parcel delivery
service greatly reduces shipping and handling cost for all including
the retailer, regional distribution centers and finally the consumer.
4. IP Parcel Formation
A transport protocol entity identified by its 5-tuple forms a parcel
body when it prepares a data buffer (or buffer chain) containing an
Integrity Block of at most 256 2-octet Checksums followed by their
corresponding transport layer protocol segments, with each TCP non-
first segment preceded by a 4-octet Sequence Number header. All non-
final segments MUST be equal in length while the final segment MUST
NOT be larger and MAY be smaller. The number of non-final segments
is represented as J; therefore the total number of segments is
represented as (J + 1).
The non-final segment size L MUST be set to a value between 512 and
65535 octets and SHOULD be no larger than the minimum of 65535 octets
and the path MTU, minus the length of the {TCP,UDP} header (plus
options), minus the length of the IP header (plus options/
extensions), minus 2 octets for the per-segment Checksum (see:
Appendix B). The final segment length K MUST NOT be larger than L
but MAY be smaller. The transport layer protocol entity then
presents the buffer(s) and size L to the network layer, noting that
the combined buffer length(s) may exceed 65535 octets if there are
sufficient segments of a large enough size.
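The sizing rules above can be illustrated with the following non-
normative Python sketch. The function names max_segment_size and
split_into_segments are hypothetical. The bound on L reflects one
reading of the SHOULD-level rule (the minimum of 65535 and the path
MTU is taken first, then the header and Checksum lengths are
subtracted), and the per-segment TCP Sequence Number headers are
ignored for simplicity.

```python
from typing import List


def max_segment_size(path_mtu: int, ip_hdr_len: int, l4_hdr_len: int) -> int:
    """Largest recommended L under the reading described in the lead-in."""
    limit = min(65535, path_mtu) - l4_hdr_len - ip_hdr_len - 2
    if limit < 512:
        raise ValueError("path MTU too small for the minimum L of 512 octets")
    return limit


def split_into_segments(data: bytes, seg_len: int) -> List[bytes]:
    """Split data into J non-final segments of seg_len octets plus a final one."""
    if not 512 <= seg_len <= 65535:
        raise ValueError("L must be between 512 and 65535 octets")
    if not data:
        raise ValueError("no data to segment")
    segments = [data[i:i + seg_len] for i in range(0, len(data), seg_len)]
    if len(segments) > 256:
        raise ValueError("a parcel carries at most 256 segments")
    return segments


segments = split_into_segments(bytes(3000), 1024)
J = len(segments) - 1       # number of non-final segments
K = len(segments[-1])       # length of the final segment
```

In this example a 3000-octet buffer with L = 1024 yields J = 2 non-
final segments and a final segment of K = 952 octets.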
If the next hop link is not parcel capable, the network layer
performs packetization to configure each segment as an individual IP
packet as discussed in Section 5.1. If the next hop link is parcel
capable, the network layer instead forms a parcel by prepending a
single full {TCP,UDP} header (plus options) and a single full IP
header (plus options/extensions). The network layer finally includes
a specially-formatted "Parcel Payload" option as an extension to the
IP header of each parcel prior to transmission over a network
interface.
The Parcel Payload option formats for both IP protocol versions are
derived from the Jumbo Payload option specified in [RFC2675] and
appear as shown in Figure 1:
IPv4 Parcel Payload Option Format
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Option Type | Opt Data Len | Code | Check |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Index | Parcel Payload Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Path MTU (PMTU) |S|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
IPv6 Parcel Payload Option Format
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Option Type | Opt Data Len |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Index | Parcel Payload Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Path MTU (PMTU) |S|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 1: Parcel Payload Option
For IPv4, the network layer includes the Parcel Payload option as an
IPv4 header option with Option Type set to '00001011' and Option Data
Length set to '00010000'. (Note: the length also distinguishes this
type from its obsoleted use as the IPv4 "Probe MTU" option
[RFC1063].) The network layer sets Code to 255 and sets Check to the
same value that will appear in the IPv4 header TTL field upon
transmission to the next hop. The network layer also sets Parcel
Payload Length to a 3-octet value M that encodes the length of the
IPv4 header plus the length of the {TCP,UDP} header plus the combined
length of the Integrity Block plus all concatenated segments. The
network layer then sets the IPv4 header DF bit to 1 and Total Length
field to the non-final segment size L.
For IPv6, the network layer includes the Parcel Payload option as the
first option in the first IPv6 Hop-by-Hop Options header, and with
Option Type set to '11000010' and Option Data Length set to
'00001100'. (Note: the most significant 3 Option Type bits are
maintained the same as for the IPv6 Jumbo Payload option, with the
understanding that nodes that recognize the Parcel Payload option
will process the option consistently regardless of these bit
settings. For further Hop-by-Hop option processing considerations,
see: [I-D.ietf-6man-hbh-processing].) The network layer then sets
Parcel Payload Length to a 3-octet value M that encodes the lengths
of all IPv6 extension headers present plus the length of the
{TCP,UDP} header plus the combined length of the Integrity Block plus
all concatenated segments. The network layer also sets the IPv6
header Payload Length field to L.
For both IP protocol versions, the network layer then sets
Identification and PMTU as specified in Section 5. The network layer
next sets Index to an ordinal segment index value between 0 and 255
and the "More (S)egments" flag to 1 for non-final sub-parcels or 0
for the final (sub-)parcel. (Note that Index values other than 0
identify the initial segment index in non-first sub-parcels of a
larger original parcel, whereas first (sub-)parcels always set Index
to 0.)
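The option encodings described above can be sketched in Python as
follows. This is a non-normative illustration: the function names are
hypothetical, and the field widths beyond those stated in the text
(a 32-bit Identification, and a 31-bit PMTU with the S flag in the
least significant bit of the final word) are assumptions read from
Figure 1. The Identification and PMTU values themselves are set per
Section 5 and are simply passed in here.

```python
import struct

IPV4_OPTION_TYPE = 0b00001011      # IPv4 Option Type from the text
IPV4_OPTION_LENGTH = 0b00010000    # 16: counts the whole IPv4 option
IPV6_OPTION_TYPE = 0b11000010      # IPv6 Option Type from the text
IPV6_OPT_DATA_LEN = 0b00001100     # 12: counts the option data only


def _common_fields(index: int, m: int, ident: int, pmtu: int, more: bool) -> bytes:
    """Encode Index, Parcel Payload Length (M), Identification, PMTU and S."""
    if not 0 <= index <= 255:
        raise ValueError("Index must fit in one octet")
    if not 0 <= m < 1 << 24:
        raise ValueError("Parcel Payload Length must fit in three octets")
    if not 0 <= ident < 1 << 32:
        raise ValueError("Identification assumed to be 32 bits")
    if not 0 <= pmtu < 1 << 31:
        raise ValueError("PMTU assumed to be 31 bits")
    word1 = (index << 24) | m
    word3 = (pmtu << 1) | (1 if more else 0)
    return struct.pack("!III", word1, ident, word3)


def ipv4_parcel_option(ttl: int, index: int, m: int, ident: int,
                       pmtu: int, more: bool) -> bytes:
    """IPv4 form: Code is 255 and Check mirrors the outgoing TTL."""
    head = struct.pack("!BBBB", IPV4_OPTION_TYPE, IPV4_OPTION_LENGTH, 255, ttl)
    return head + _common_fields(index, m, ident, pmtu, more)


def ipv6_parcel_option(index: int, m: int, ident: int,
                       pmtu: int, more: bool) -> bytes:
    """IPv6 form: carried as the first Hop-by-Hop option."""
    head = struct.pack("!BB", IPV6_OPTION_TYPE, IPV6_OPT_DATA_LEN)
    return head + _common_fields(index, m, ident, pmtu, more)


opt4 = ipv4_parcel_option(ttl=64, index=0, m=70000, ident=1, pmtu=9000, more=False)
opt6 = ipv6_parcel_option(index=3, m=70000, ident=1, pmtu=9000, more=True)
```

The resulting lengths (16 octets for IPv4, and 2 octets plus 12
octets of option data for IPv6) agree with the Option Data Length
values given in the text.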
Following transport and network layer processing, {TCP,UDP}/IP
parcels therefore have the structures shown in Figure 2:
    TCP/IP Parcel Structure            UDP/IP Parcel Structure
+------------------------------+   +------------------------------+
|                              |   |                              |
~    IP Hdr plus extensions    ~   ~    IP Hdr plus extensions    ~
|                              |   |                              |
+------------------------------+   +------------------------------+
|                              |   |                              |
~   TCP header (plus options)  ~   ~          UDP header          ~
| (Includes Sequence Number 0) |   |                              |
+------------------------------+   +------------------------------+
|                              |   |                              |
~       Integrity Block        ~   ~       Integrity Block        ~
|                              |   |                              |
+------------------------------+   +------------------------------+
~                              ~   ~                              ~
~    Segment 0 (L-4 octets)    ~   ~     Segment 0 (L octets)     ~
+------------------------------+   +------------------------------+
~  Sequence Number 1 followed  ~   ~                              ~
~   by Segment 1 (L octets)    ~   ~     Segment 1 (L octets)     ~
+------------------------------+   +------------------------------+
~  Sequence Number 2 followed  ~   ~                              ~
~   by Segment 2 (L octets)    ~   ~     Segment 2 (L octets)     ~
+------------------------------+   +------------------------------+
~             ...              ~   ~             ...              ~
~             ...              ~   ~             ...              ~
+------------------------------+   +------------------------------+
~  Sequence Number J followed  ~   ~                              ~
~   by Segment J (K octets)    ~   ~     Segment J (K octets)     ~
+------------------------------+   +------------------------------+
Figure 2: {TCP,UDP}/IP Parcel Structure
where the number of non-final segments is J, L is the length of each
non-final segment (between 512 and 65535 octets), and K is the length
of the final segment which MUST be no larger than L.
The {TCP,UDP}/IP header is immediately followed by an Integrity Block
containing (J + 1) 2-octet Checksums concatenated in numerical order
as shown in Figure 3:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Checksum (0) | Checksum (1) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Checksum (2) | ... ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ... ~
~ ... ... ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Checksum (J-1) | Checksum (J) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3: Integrity Block Format
The Integrity Block is then followed by (J + 1) transport layer
segments. For TCP, the TCP header Sequence Number field encodes a
4-octet starting sequence number for the first segment only, while
each additional segment is preceded by its own 4-octet Sequence
Number field. For this reason, the length of the first segment is
only (L-4) octets since the 4-octet TCP header Sequence Number field
applies to that segment. (All non-first TCP segments instead begin
with their own Sequence Number headers, with the 4-octet length
included in L and K.)
Note: Per-segment Checksums appear in a contiguous Integrity Block
immediately following the {TCP,UDP}/IP headers instead of inline with
the parcel segments to greatly increase the probability that they
will appear in the contiguous head of a kernel receive buffer even if
the parcel was subject to OMNI interface IPv6 fragmentation. This
condition may not always hold if the IPv6 fragments also incur IPv4
encapsulation and fragmentation over paths that transit IPv4 links
with small MTUs. Even then, only the fragmented Integrity Block
(i.e., and not the entire parcel) may need to be pulled/copied into
the contiguous head of a kernel receive buffer.
Note: For IPv4 parcels, the first 2 octets of the Parcel Payload
option include Code and Check fields in case a router on the path
overwrites the values in a wayward attempt to implement [RFC1063].
IPv4 parcel recipients should therefore regard an incorrect Code or
Check value as evidence that the field was accidentally or
intentionally corrupted by a previous hop node.
4.1. TCP Parcels
A TCP Parcel is an IP Parcel that includes an IP header plus
extensions with a Parcel Payload option formed as shown in Section 4
with Parcel Payload Length encoding a value no larger than 16,777,215
(2**24 - 1) octets. The IP header plus extensions is then followed
by a TCP header plus options (20 or more octets), which is then
followed by an Integrity Block with (J + 1) consecutive 2-octet
Checksums. The Integrity Block is then followed by (J + 1)
consecutive segments, where the first segment is (L-4) octets in
length and uses the 4-octet sequence number found in the TCP header,
each intermediate segment is L octets in length (including its own
4-octet Sequence Number header) and the final segment is K octets in
length (including its own 4-octet Sequence Number header). The value
L is encoded in the IP header {Total, Payload} Length field while the
overall length of the parcel is determined by the Parcel Payload
Length M as discussed above.
The source prepares TCP Parcels in an alternative adaptation of TCP
jumbograms [RFC2675]. The source calculates a checksum of the TCP
header plus IP pseudo-header only (see: Section 5.7), but with the
TCP header Sequence Number field temporarily set to 0 during the
calculation since the true sequence number will be included as an
integrity pseudo header for the first segment. The source then
writes the calculated value in the TCP header Checksum field as-is
(i.e., without converting calculated '0' values to 'ffff') and
finally re-writes the actual sequence number back into the Sequence
Number field. (Nodes that verify the header checksum first perform
the same operation of temporarily setting the Sequence Number field
to 0 and then resetting to the actual value following checksum
verification.)
The source then calculates the checksum of the first segment
beginning with the sequence number found in the full TCP header as a
4-octet pseudo-header then extending over the remaining (L-4) octet
length of the segment. The source next calculates the checksum for
each L octet intermediate segment independently over the length of
the segment (beginning with its sequence number), then finally
calculates the checksum of the K octet final segment (beginning with
its sequence number). As the source calculates each segment(i)
checksum (for i = 0 thru J), it writes the value into the
corresponding Integrity Block Checksum(i) field as-is.
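The per-segment calculation described above can be sketched in C as follows. This is a minimal illustration only: it assumes the conventional Internet checksum arithmetic (a 16-bit ones' complement sum whose result is complemented), and the function names are hypothetical. Section 5.7 remains the authoritative description of the integrity procedures.

```c
#include <stddef.h>
#include <stdint.h>

/* Adds len octets to a running ones' complement sum.
 * Assumption: only the final call for a segment may pass an odd len. */
static uint32_t csum_add(uint32_t sum, const uint8_t *p, size_t len)
{
    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)p[0] << 8;   /* pad the odd trailing octet with zero */
    return sum;
}

/* Folds the carries and returns the complemented 16-bit result. */
static uint16_t csum_finish(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Checksum(0): the sequence number from the TCP header acts as a 4-octet
 * pseudo-header ahead of the (L-4)-octet first segment. */
uint16_t tcp_seg0_checksum(uint32_t seq, const uint8_t *seg, size_t len)
{
    uint8_t ph[4] = {
        (uint8_t)(seq >> 24), (uint8_t)(seq >> 16),
        (uint8_t)(seq >> 8),  (uint8_t)seq
    };
    return csum_finish(csum_add(csum_add(0, ph, 4), seg, len));
}

/* Checksum(i) for i > 0: the segment already begins with its own
 * 4-octet Sequence Number header, so no pseudo-header is needed. */
uint16_t tcp_segi_checksum(const uint8_t *seg, size_t len)
{
    return csum_finish(csum_add(0, seg, len));
}
```

Each returned value would be written into Checksum(i) as-is, i.e., without converting a calculated '0' to 'ffff'.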
Note: The parcel TCP header Source Port, Destination Port and (per-
segment) Sequence Number fields apply to all parcel segments, while
the TCP control bits and all other fields apply only to the first
segment (i.e., "segment(0)"). Therefore, only parcel segment(0) may
be associated with control bit settings while all other segment(i)'s
must be simple data segments.
See Appendix A for additional TCP considerations. See Section 5.7
for additional integrity considerations.
4.2. UDP Parcels
A UDP Parcel is an IP Parcel that includes an IP header plus
extensions with a Parcel Payload option formed as shown in Section 4
with Parcel Payload Length encoding a value no larger than 16,777,215
(2**24 - 1) octets. The IP header plus extensions is then followed
by an 8-octet UDP header followed by an Integrity Block with (J + 1)
consecutive 2-octet Checksums followed by (J + 1) transport layer
segments. Each segment must begin with a transport-specific start
delimiter (e.g., a segment identifier) included by the transport
layer user of UDP. The length of the first segment L is encoded in
the IP {Total, Payload} Length field while the overall length of the
parcel is determined by the Parcel Payload Length M as discussed
above.
The source prepares UDP Parcels in an alternative adaptation of UDP
jumbograms [RFC2675]. The source MUST first set the UDP header
length field to 0, then calculates the checksum of the UDP header
plus IP pseudo-header (see: Section 5.7) and writes the calculated
value in the UDP header Checksum field as-is (i.e., without
converting calculated '0' values to 'ffff').
The source then calculates a separate checksum for each segment for
which checksums are enabled independently over the length of the
segment. As the source calculates each segment(i) checksum (for i =
0 thru J), it writes the value into the corresponding Integrity Block
Checksum(i) field with calculated '0' values converted to 'ffff'; for
segments with checksums disabled, the source instead writes the value
'0'.
See: Section 5.7 for additional integrity considerations.
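The encoding rule for UDP Integrity Block entries can be sketched as follows. The function name and parameters are hypothetical, and the segment checksum itself is assumed to have been calculated by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns the value to write into Integrity Block Checksum(i) for one
 * segment of a UDP parcel. "calculated" is the segment checksum as
 * computed by the caller (ignored when checksums are disabled). */
uint16_t udp_integrity_value(bool checksum_enabled, uint16_t calculated)
{
    if (!checksum_enabled)
        return 0x0000;                    /* '0' marks a disabled checksum */
    return calculated == 0 ? 0xffff : calculated;
}
```

Because 'ffff' stands for both a calculated '0' and a calculated 'ffff', a node that later needs the exact value has to recalculate it from the segment.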
4.3. Calculating J and K
The IP parcel source unambiguously encodes the values L and M in the
corresponding header fields as specified above. The values J and K
are not encoded in header fields and must therefore be calculated by
intermediate nodes and final destinations as follows:
/* L must be at least 512; T is temporary length;
   H is {TCP,UDP}/IP header/extension lengths;
   For TCP, segment 0 Sequence Number is 4 octets;
   For each segment, Checksum is 2 octets */
if ((L < 512) || ((T = (M - H)) <= 0))
   drop parcel;
if (TCP) T += 4;
if ((J = (T / (L + 2))) > 256)
   drop parcel;
if ((K = (T % (L + 2))) == 0) {
   J--; K = L;
} else {
   if ((J > 255) || ((K -= 2) <= 0))
      drop parcel;
}
if ((TCP) && (J == 0) && ((K -= 4) <= 0))
   drop parcel;
Figure 4: Calculating J and K
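For implementers, the calculation in Figure 4 might be transcribed into compilable C roughly as follows. The structure and function names are hypothetical, and "drop parcel" is modeled as a false return value; the arithmetic is intended to follow the figure step for step.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result container; not defined by this specification. */
struct parcel_jk {
    int64_t j;   /* J: index of the final segment */
    int64_t k;   /* K: length of the final segment */
};

/* m = Parcel Payload Length (M), l = segment length (L),
 * h = {TCP,UDP}/IP header plus extension lengths (H).
 * Returns false wherever Figure 4 says "drop parcel". */
bool parcel_calc_jk(int64_t m, int64_t l, int64_t h, bool is_tcp,
                    struct parcel_jk *out)
{
    int64_t t, j, k;

    if (l < 512 || (t = m - h) <= 0)
        return false;
    if (is_tcp)
        t += 4;               /* segment 0 Sequence Number is 4 octets */
    if ((j = t / (l + 2)) > 256)
        return false;         /* each segment also has a 2-octet Checksum */
    if ((k = t % (l + 2)) == 0) {
        j--;                  /* final segment is a full L octets */
        k = l;
    } else {
        if (j > 255 || (k -= 2) <= 0)
            return false;
    }
    if (is_tcp && j == 0 && (k -= 4) <= 0)
        return false;
    out->j = j;
    out->k = k;
    return true;
}
```

For example, a UDP parcel with H = 48, L = 1000 and M = 2554 carries two full segments and a 500-octet final segment, so the function yields J = 2 and K = 500.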
5. Transmission of IP Parcels
During {TCP,UDP} parcel assembly, the network layer of the source
fully populates IP header fields including the source address,
destination address and Parcel Payload option as discussed above.
The source also sets IP {Total, Payload} Length to L (between 512 and
65535) to distinguish the parcel from other jumbogram types (see:
Section 6).
The network layer of the source also maintains a randomly-initialized
32-bit cached Identification value for each destination. For each
parcel transmission, the source sets Identification to the current
cached value for this destination and increments the cached value by
1 (modulo 2**32) for each successive transmission (the source can
later reset the cached value to a new random number, e.g., to
maintain an unpredictable profile).
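A minimal sketch of the per-destination Identification cache follows. The structure and function names are hypothetical, and the random seed is assumed to come from a suitable random source supplied by the caller.

```c
#include <stdint.h>

/* Hypothetical per-destination state kept by the network layer. */
struct dest_cache {
    uint32_t next_ident;   /* cached Identification value */
};

/* Seeds (or later re-seeds) the cache with a caller-supplied random value. */
void dest_cache_seed(struct dest_cache *dc, uint32_t random_value)
{
    dc->next_ident = random_value;
}

/* Returns the Identification for the next parcel and advances the cache.
 * Unsigned 32-bit arithmetic in C wraps modulo 2**32. */
uint32_t dest_cache_next_ident(struct dest_cache *dc)
{
    return dc->next_ident++;
}
```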
The network layer of the source finally sets the Parcel Payload
option PMTU to the value '0' (unless the parcel is also being used as
a probe - see: Section 5.6) then presents the parcel to an interface
for transmission to the next hop. For ordinary interface attachments
to parcel-capable links, the source simply admits each parcel into
the interface the same as for any IP packet where it may be forwarded
by one or more routers over additional consecutive parcel-capable
links possibly even traversing the entire forward path to the final
destination. If any node in the path does not recognize the parcel
construct, it drops the parcel and may return an ICMP "Parameter
Problem" message.
When the next hop link does not support parcels at all, or when the
next hop link is parcel-capable but configures an MTU that is too
small to pass the entire parcel, the source breaks the parcel up into
individual IP packets (in the first case) or into smaller sub-parcels
(in the second case). In the first case, the source can apply
"packetization" using Generic Segment Offload (GSO), and the final
destination can apply "restoration" using Generic Receive Offload
(GRO) to deliver the largest possible parcel buffer(s) to the
transport layer. In the second case, the source can apply
"parcellation" to break the parcel into sub-parcels with each
containing the same Identification value and with the S flag set
appropriately. The final destination can then apply "reunification"
to deliver the largest possible parcel buffer(s) to the transport
layer. In all other ways, the source processes of breaking a parcel
up into individual IP packets or smaller sub-parcels entail the same
considerations as for a router on the path that invokes these
processes as discussed in the following subsections.
Parcel probes that test the forward path's ability to pass parcels
set the PMTU field to a non-zero value as discussed in Section 5.6.
Each router in the path then rewrites PMTU in a similar fashion as
for [RFC1063][RFC9268]. Specifically, each router compares the
parcel PMTU value with the next hop link MTU in the parcel path and
MUST (re)set PMTU to the most significant 31 bits of the minimum
value. Note that the fact that the parcel transited a previous hop
link should provide sufficient evidence of forward progress since
parcel path MTU determination is unidirectional in the forward path
only. However, nodes can also include the previous hop link MTU in
their minimum PMTU calculations in case the link may have an ingress
size restriction (such as a receive buffer limitation). Each parcel
also includes one or more transport layer segments corresponding to
the 5-tuple for the flow, which may include {TCP,UDP} segment size
probes used for packetization layer path MTU discovery
[RFC4821][RFC8899]. (See: Section 5.6 for further details on parcel
path probing.)
When a router receives an IPv4 parcel it first compares Code with 255
and Check with the IPv4 header TTL; if either value differs, the
router drops the parcel and returns a negative Jumbo Report (see:
Section 5.5) subject to rate limiting. For all other IP parcels, the
router next compares the value L with the next hop link MTU. If the
next hop link is parcel-capable but its MTU is too small to pass a
parcel with a single segment of length L, the router discards the
parcel and returns a positive Jumbo Report (subject to rate limiting)
with MTU set to the next hop link MTU. If the next hop link is not
parcel-capable and its MTU is too small to pass an individual IP
packet with a single segment of length L, the router discards the
parcel and instead returns a positive Parcel Report (subject to rate
limiting) with MTU set to the next hop link MTU. For IPv4 parcels,
if the next hop link is parcel-capable, the router MUST reset Check to
the same value that would appear in the IPv4 header TTL field upon
transmission to the next hop.
If the router recognizes parcels but the next hop link in the path
does not, or if the entire parcel would exceed the next hop link MTU,
the router instead opens the parcel. The router then forwards each
enclosed segment in individual IP packets or in a set of smaller sub-
parcels that each contain a subset of the original parcel's segments.
If the next hop link is via an OMNI interface, the router instead
proceeds according to OMNI Adaptation Layer procedures. These
considerations are discussed in detail in the following sections.
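The router processing described in the preceding two paragraphs can be summarized as a decision function. The sketch below is illustrative only: all type and function names are hypothetical, the three length inputs are assumed to have been computed by the caller, and the position of the OMNI test relative to the MTU tests is an assumption that follows the order of the text.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum parcel_action {
    PARCEL_DROP_NEG_JUMBO_REPORT,   /* IPv4 Code/Check mismatch */
    PARCEL_DROP_POS_JUMBO_REPORT,   /* parcel-capable link, MTU too small */
    PARCEL_DROP_POS_PARCEL_REPORT,  /* non-parcel link, MTU too small */
    PARCEL_TO_OMNI,                 /* hand off to the OMNI Adaptation Layer */
    PARCEL_PACKETIZE,               /* Section 5.1 */
    PARCEL_PARCELLATE,              /* Section 5.2 */
    PARCEL_FORWARD                  /* forward the parcel unchanged */
};

struct parcel_view {
    bool    is_ipv4;
    uint8_t code, check, ttl;   /* meaningful for IPv4 parcels only */
    size_t  whole_len;          /* length of the entire parcel */
    size_t  singleton_len;      /* parcel with one segment of length L */
    size_t  packet_len;         /* IP packet with one segment of length L */
};

struct link_view {
    bool   parcel_capable;
    bool   is_omni;
    size_t mtu;
};

enum parcel_action router_decide(const struct parcel_view *p,
                                 const struct link_view *nh)
{
    if (p->is_ipv4 && (p->code != 255 || p->check != p->ttl))
        return PARCEL_DROP_NEG_JUMBO_REPORT;
    if (nh->parcel_capable && p->singleton_len > nh->mtu)
        return PARCEL_DROP_POS_JUMBO_REPORT;
    if (!nh->parcel_capable && p->packet_len > nh->mtu)
        return PARCEL_DROP_POS_PARCEL_REPORT;
    if (nh->is_omni)
        return PARCEL_TO_OMNI;
    if (!nh->parcel_capable)
        return PARCEL_PACKETIZE;
    if (p->whole_len > nh->mtu)
        return PARCEL_PARCELLATE;
    return PARCEL_FORWARD;
}
```

All report-generating outcomes remain subject to rate limiting, which the sketch does not model.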
5.1. Packetization over Non-Parcel Links
For transmission of individual IP packets over links that do not
support parcels, the source or router (i.e., the node) engages GSO to
perform packetization. The node first determines whether an
individual packet with segment of length L can fit within the next
hop link MTU. If an individual packet would be too large, the node
drops the parcel and returns a positive Parcel Report message
(subject to rate limiting) with MTU set to the next hop link MTU and
with the leading portion of the parcel beginning with the IP header
as the "packet in error". If an individual packet can be
accommodated, the node instead removes the Parcel Payload option,
sets aside and remembers the Integrity Block (and for TCP also sets
aside and remembers the Sequence Number header values of each non-
first segment) then copies the {TCP,UDP}/IP headers (but with the
Parcel Payload option removed) followed by segment(i) (for i = 0 thru
J) into (J + 1) individual IP packets ("packet(i)").
For TCP, the node then clears the TCP control bits in every packet(i)
except packet(0) and includes only those TCP options that are
permitted to appear in data segments in every packet(i) except
packet(0), which may also include control segment options (see:
Appendix A for further discussion). For each packet(i), the node then
sets IP {Total, Payload} Length based on the length of segment(i)
according to the IP protocol standards [RFC0791] [RFC8200].
For each IPv6 packet(i), the node includes an "augmented" IPv6
Fragment Header as shown in Figure 5 and sets the Identification
field to the value found in the parcel header. The node then writes
the value 'i' in the Index field, sets the "(P)arcel" bit to 1 and
sets the "More (S)egments" bit to 1 for each non-final segment or 0
for the final segment (see below).
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Next Header  |     Index     |     Fragment Offset     |P|S|M|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Identification                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5: Augmented IPv6 Fragment Header
For each IPv4 packet(i), the node instead sets the Identification
field to the least significant 16 bits of the value found in the
parcel header and sets the (D)on't Fragment flag to '1'. For each IP
packet(i), the node then sets both the Fragment Offset field and
(M)ore fragments flag to '0' to produce an unfragmented IP packet
(IPv6 destinations will process these "atomic fragments" as whole
packets instead of admitting them into the reassembly cache, i.e.,
the same as for IPv4). The node then processes further according to
transport layer protocol conventions as follows.
For TCP, the node calculates the checksum up to the end of
packet(0)'s TCP/IP headers only according to [RFC9293] but with the
sequence number value saved and the field set to 0. The node then
adds Integrity Block Checksum(0) to the calculated value and writes
the sum into packet(0)'s TCP Checksum field. The node then resets
the Sequence Number field to packet(0)'s saved sequence number and
forwards packet(0) to the next hop. The node next calculates the
checksum of packet(1)'s TCP/IP headers with the Sequence Number field
set to 0 and saves the calculated value. In each non-first packet(i)
(for i = 1 thru J), the node then adds the saved value to Integrity
Block Checksum(i), writes the sum into packet(i)'s TCP Checksum
field, sets the TCP Sequence Number field to packet(i)'s sequence
number then forwards packet(i) to the next hop.
For UDP, the node sets the UDP length field according to [RFC0768] in
each packet(i) (for i = 0 thru J). If Integrity Block Checksum(i) is
0, the node then sets the UDP Checksum field to 0, forwards packet(i)
to the next hop and continues to the next. The node next calculates
the checksum over packet(i)'s UDP/IP headers only according to
[RFC0768]. If Integrity Block Checksum(i) is not 'ffff', the node
then adds the value to the header checksum; otherwise, the node re-
calculates the checksum for segment(i). If the re-calculated
segment(i) checksum value is 'ffff' or '0' the node adds the value to
the header checksum; otherwise, it continues to the next packet(i).
The node finally writes the total checksum value into the packet(i)
UDP Checksum field (or writes 'ffff' if the total was '0') and
forwards packet(i) to the next hop.
Note: For each UDP packet(i), the node must recalculate the segment
checksum if Checksum(i) is 'ffff', since that value is shared by both
'0' and 'ffff' calculated checksums. If recalculating the checksum
produces an incorrect value, the node can optionally drop or forward
(noting that the forwarded packet would simply be discarded as an
error by the final destination). For each {TCP,UDP} packet(i), the
node can optionally re-calculate and verify the segment checksum
unconditionally before forwarding, but this may introduce
unacceptable delay and processing overhead.
Note: Packets resulting from packetization may be too large to
transit the remaining path to the final destination, such that a
router may drop the packet(s) and possibly also return an ordinary
ICMP PTB message. Since these messages cannot be authenticated or
may be lost on the return path, the original source should take care
in setting a segment size larger than the known path MTU unless as
part of an active probing service.
5.2. Parcellation over Parcel-capable Links
For transmission of smaller sub-parcels over parcel-capable links,
the source or router (i.e., the node) first determines whether a
single segment of length L can fit within the next hop link MTU if
packaged as a (singleton) sub-parcel. If a singleton sub-parcel
would be too large, the node returns a positive Jumbo Report message
(subject to rate limiting) with MTU set to the next hop link MTU and
containing the leading portion of the parcel beginning with the IP
header, then drops the parcel. If the parcel can be accommodated,
the node instead employs network layer parcellation to break the
original parcel into smaller groups of segments that would fit within
the path MTU by determining the number of segments of length L that
can fit into each sub-parcel under the size constraints. For
example, if the node determines that each sub-parcel can contain 3
segments of length L, it creates sub-parcels with the first
containing Integrity Block Checksums/Segments 0-2, the second
containing 3-5, the third containing 6-8, etc., and with the final
containing any remaining Checksums/Segments.
If the original parcel's Parcel Payload option has S set to '0', the
node then sets S to '1' in all resulting sub-parcels except the last
(i.e., the one containing the final segment of length K, which may be
shorter than L) for which it sets S to '0'. If the original parcel
has S set to '1', the node instead sets S to '1' in all resulting sub-
parcels including the last. The node next sets the Index field to
the value 'i' which is the ordinal number of the first segment
included in each sub-parcel. (In the above example, the first sub-
parcel sets Index to 0, the second sets Index to 3, the third sets
Index to 6, etc.). If another router further down the path toward
the final destination forwards the sub-parcel(s) over a link that
configures a smaller MTU, the router breaks it into even smaller sub-
parcels each with Index set to the ordinal number of the first
segment included.
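The grouping, S flag and Index rules above can be sketched as a small planning routine. All names are hypothetical. The sketch assumes that each segment consumes L + 2 octets of the sub-parcel (the segment plus its 2-octet Checksum) and, as a conservative simplification, ignores the 4 octets saved when a TCP sub-parcel's first Sequence Number moves into the TCP header.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one planned sub-parcel. */
struct subparcel {
    unsigned index;   /* Index: ordinal of the first segment carried */
    unsigned count;   /* number of segments carried */
    bool     s;       /* S flag value to set */
};

/* Segments of length l that fit under mtu after hdr_len octets of
 * {TCP,UDP}/IP headers and extensions; 0 means not even one fits. */
unsigned segments_per_subparcel(size_t mtu, size_t hdr_len, size_t l)
{
    if (mtu <= hdr_len)
        return 0;
    return (unsigned)((mtu - hdr_len) / (l + 2));
}

/* Plans the sub-parcels for a (sub-)parcel that carries nsegs segments,
 * whose first segment has ordinal base_index and whose S flag is orig_s.
 * Returns the number of entries written, or 0 on bad input or if out[]
 * is too small. */
size_t plan_parcellation(unsigned nsegs, unsigned per_sub,
                         unsigned base_index, bool orig_s,
                         struct subparcel *out, size_t max)
{
    size_t n = 0;

    if (nsegs == 0 || per_sub == 0)
        return 0;
    for (unsigned first = 0; first < nsegs; first += per_sub) {
        unsigned left = nsegs - first;
        bool last = left <= per_sub;

        if (n == max)
            return 0;
        out[n].index = base_index + first;
        out[n].count = last ? left : per_sub;
        out[n].s = orig_s || !last;   /* only the true final piece clears S */
        n++;
    }
    return n;
}
```

With 8 segments and 3 segments per sub-parcel, the plan yields Index values 0, 3 and 6, with S set in all but the last sub-parcel when the original parcel had S set to '0'.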
The node next appends identical {TCP,UDP}/IP headers (including the
Parcel Payload option and any other extensions) to each sub-parcel
while resetting Index, S, {Total, Payload} Length (L) and Parcel
Payload Length (M) in each as discussed above. For TCP, the node
then clears the TCP control bits in all but the first sub-parcel and
includes only those TCP options that are permitted to appear in data
segments in all but the first sub-parcel (which may also include
control segment options). For both TCP and UDP, the node then resets
the {TCP,UDP} Checksum according to ordinary parcel formation
procedures (see above). The node then sets the TCP Sequence Number
field to the value that appears in the first sub-parcel segment while
removing the first segment's Sequence Number header (if present).
The node finally sets PMTU to the next hop link MTU then forwards
each (sub-)parcel over the parcel-capable next hop link.
5.3. OMNI Interface Parcellation and Reunification
For transmission of original parcels or sub-parcels over OMNI
interfaces, the node admits all parcels into the interface
unconditionally since the OMNI interface MTU is unrestricted. The
OMNI Adaptation Layer (OAL) of this First Hop Segment (FHS) OAL
source node then forwards the parcel to the next OAL hop which may be
either an intermediate node or a Last Hop Segment (LHS) OAL
destination. OMNI interface parcellation and reunification
procedures are specified in detail in the remainder of this section,
while parcel encapsulation and fragmentation procedures are specified
in [I-D.templin-intarea-omni].
When the OAL source forwards a parcel (whether generated by a local
application or forwarded over a network path that transited one or
more parcel-capable links), it first assigns a monotonically-
incrementing (modulo 255) adaptation layer "Parcel ID". If the
parcel is larger than the OAL maximum segment size of 65535 octets,
the OAL source first employs parcellation to break the parcel into
sub-parcels the same as for the network layer procedures discussed
above. This includes re-setting the Index, S, {Total, Payload}
Length (L) and Parcel Payload Length (M) fields in each sub-parcel
the same as specified in Section 5.2.
The OAL source next assigns a different monotonically-incrementing
adaptation layer Identification value for each sub-parcel of the same
Parcel ID then performs adaptation layer encapsulation and
fragmentation and finally forwards each fragment to the next OAL hop
toward the OAL destination as necessary. (During encapsulation, the
OAL source examines the Parcel Payload option S flag to determine the
setting for the adaptation layer fragment header S flag according to
the same rules specified in Section 5.2.)
When the sub-parcels arrive at the OAL destination, it retains them
along with their Parcel ID and Identifications for a short time to
support reunification with peer sub-parcels of the same original
(sub-)parcel identified by the 3-tuple information corresponding to
the OAL source. This reunification entails the concatenation of
Checksums/Segments included in sub-parcels with the same Parcel ID
and with Identification values within 255 of one another to create a
larger sub-parcel possibly even as large as the entire original
parcel. The OAL destination concatenates each sub-parcel in
ascending Identification value order, while ensuring that any sub-
parcel with TCP control bits set appears as the first concatenated
element in a reunified larger parcel and any sub-parcel with S flag
set to '0' appears as the final concatenation. The OAL destination
then sets S to '0' in the reunified (sub-)parcel if and only if one
of its constituent elements also had S set to '0'; otherwise, it sets
S to '1'.
The OAL destination then appends a common {TCP,UDP}/IP header plus
extensions to each reunified sub-parcel while resetting Index, S,
{Total, Payload} Length (L) and Parcel Payload Length (M) in the
corresponding header fields of each. For TCP, if any sub-parcel has
TCP control bits set the OAL destination regards it as sub-parcel(0)
and uses its TCP header as the header of the reunified (sub-)parcel
with the TCP options including the union of the TCP options of all
reunified sub-parcels. The OAL destination then resets the
{TCP,UDP}/IP header checksum. If the OAL destination is also the
final destination, it then delivers the sub-parcels to the network
layer which processes them according to the 5-tuple information
supplied by the original source. If the OAL destination is not the
final destination, it instead forwards each sub-parcel toward the
final destination the same as for an ordinary IP packet as discussed
above.
Note: Adaptation layer parcellation over OMNI links occurs only at
the OAL source while the adaptation layer reunification occurs only
at the OAL destination. Intermediate OAL nodes do not participate in
the parcellation or reunification processes. The OAL destination
should retain sub-parcels in the reunification buffer only for a
short time (e.g., 1 second) or until all sub-parcels of the original
parcel have arrived. The OAL destination may then return incomplete
reunifications to the network layer in cases where loss and/or
delayed delivery interfere with full reunification.
Note: OMNI interface parcellation and reunification is an OAL process
based on the adaptation layer 3-tuple and not the network layer
5-tuple. This is true even if the OAL has visibility into network
layer information since some sub-parcels of the same original parcel
may be forwarded over different network paths.
5.4. Final Destination Restoration/Reunification
When the original source or a router on the path opens a parcel and
forwards its contents as individual IP packets, these packets will
arrive at the final destination which can hold them in a restoration
buffer for a short time and then restore the original parcel using
GRO. The 5-tuple information plus the Identification value provides
sufficient context for GRO restoration, which practical
implementations have proven can provide a robust service at high data
rates even for IPv4 with its 16-bit Identification limitation. (For
IPv6, the augmented IPv6 Fragment Header P/S flag and Index values
provide further context - see: Figure 5. Namely, if the P flag is
set, Index contains an ordinal segment index and S is set for all but
the final segment.)
When the original source or a router on the path opens a parcel and
forwards its contents as smaller sub-parcels, these sub-parcels will
arrive at the final destination which can hold them in a
reunification buffer for a short time or until all sub-parcels have
arrived. The 5-tuple information plus the Index, S flag and
Identification values provide sufficient context for reunification,
and both IPv4 and IPv6 will see a full 32-bit Identification.
In both the restoration and reunification cases, the final
destination concatenates segments according to ascending Index
numbers to preserve segment ordering even if a small degree of
reordering and/or loss may have occurred in the networked path. When
the final destination performs restoration/reunification on TCP
segments, it must include the one with any TCP flag bits set as the
first concatenation and with the TCP options including the union of
the TCP options of all concatenated packets or sub-parcels. For both
TCP and UDP, any packet or sub-parcel containing the final segment
must appear as a final concatenation.
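The ordering step can be sketched as a sort on Index followed by a completeness check. The descriptor and function names are hypothetical, and the sketch assumes that an Index value and segment count are available for every buffered element.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical descriptor for one buffered packet or sub-parcel. */
struct piece {
    unsigned index;   /* Index of the first segment carried */
    unsigned count;   /* number of segments carried */
    bool     s;       /* S flag: true means more segments follow */
};

static int piece_cmp(const void *a, const void *b)
{
    const struct piece *x = a, *y = b;
    return (x->index > y->index) - (x->index < y->index);
}

/* Orders buffered pieces by ascending Index ahead of concatenation. */
void pieces_order(struct piece *p, size_t n)
{
    qsort(p, n, sizeof *p, piece_cmp);
}

/* True when the ordered pieces cover consecutive segments starting at
 * Index 0 and the last piece has S clear, so nothing remains to wait for. */
bool pieces_complete(const struct piece *p, size_t n)
{
    unsigned expect = 0;

    for (size_t i = 0; i < n; i++) {
        if (p[i].index != expect)
            return false;
        expect += p[i].count;
    }
    return n > 0 && !p[n - 1].s;
}
```

Sorting on Index places the element that carries segment 0 (the only one that may carry TCP control bits) first and the element that carries the final segment last. An incomplete set can still be delivered when the hold timer expires.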
The final destination can then present the concatenated parcel
contents to the transport layer with segments arranged in (nearly)
the same order in which they were originally transmitted. Strict
ordering is not required since each segment will include a transport
layer protocol specific start delimiter with positional coordinates.
However, the Index field includes an ordinal value that preserves
ordering since each sub-parcel or individual IP packet contains an
integral number of whole transport layer protocol segments.
Note: Restoration and/or reunification buffer management is based on
a "hold timer" during which the final destination retains singleton
packets or sub-parcels until all members of the same original parcel
have arrived.
It is recommended that implementations set a short hold timer (e.g.,
1 second) and advance any restorations/reunifications to upper layers
when the hold timer expires even if incomplete.
Note: Since loss and/or reordering may occur in the network, the
final destination may receive a "short" packet or sub-parcel with S
set to '0' before all other elements of the same original parcel have
arrived. This condition does not represent an error, but in some
cases may cause the network layer to deliver sub-parcels that are
smaller than the original parcel to the transport layer. The
transport layer simply accepts any segments received from all such
deliveries and will request retransmission of any segments that were
lost and/or damaged.
Note: Restoration and/or reunification buffer congestion may indicate
that the network layer cannot sustain the service(s) at current
arrival rates. The network layer should then begin to deliver
incomplete restorations/reunifications or even individual segments to
the receive queue (e.g., a socket buffer) instead of waiting for all
segments to arrive. The network layer can manage restoration/
reunification buffers, e.g., by maintaining buffer occupancy high/low
watermarks.
5.5. Parcel/Jumbo Reports
When a router or final destination returns a Parcel/Jumbo Report, it
prepares an ICMPv6 PTB message [RFC4443] with Code set to either
"Parcel Report" or "Jumbo Report" (see: [I-D.templin-intarea-omni])
and with MTU set to either the minimum MTU value for a positive
report or to '0' for a negative report. The node then writes its own
IP address as the Parcel/Jumbo Report source and writes the source
address of the packet that invoked the report as the Parcel/Jumbo
Report destination (for IPv4 Parcel Probes, the node writes the
Parcel/Jumbo Report address as an IPv4-Compatible IPv6 address
[RFC4291]). The node next copies as much of the leading portion of
the invoking packet as possible (beginning with the IP header) into
the "packet in error" field without causing the entire Parcel/Jumbo
Report (beginning with the IPv6 header) to exceed 512 octets in
length. The node then sets the Checksum field to 0 instead of
calculating and setting a true checksum.
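The 512-octet limit bounds how much of the invoking packet can be copied. A minimal sketch follows, assuming a 40-octet IPv6 header with no extension headers and the 8-octet ICMPv6 PTB header of [RFC4443]; the names are hypothetical.

```c
#include <stddef.h>

enum {
    REPORT_MAX_LEN     = 512,  /* limit stated for the entire report */
    IPV6_HDR_LEN       = 40,   /* fixed IPv6 header, no extensions assumed */
    ICMPV6_PTB_HDR_LEN = 8     /* Type, Code, Checksum and MTU fields */
};

/* Number of leading octets of the invoking packet to copy into the
 * "packet in error" field of a Parcel/Jumbo Report. */
size_t report_copy_len(size_t invoking_len)
{
    size_t room = REPORT_MAX_LEN - IPV6_HDR_LEN - ICMPV6_PTB_HDR_LEN;

    return invoking_len < room ? invoking_len : room;
}
```

Under these assumptions at most 464 octets of the invoking packet are copied.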
Since IPv6 packets cannot transit IPv4 paths, and since middleboxes
often filter ICMPv6 messages as they transit IPv6 paths, the node
next wraps the Parcel/Jumbo Report in UDP/IP headers of the correct
IP version with the IP source and destination addresses copied from
the Parcel/Jumbo Report and with UDP port numbers set to the OMNI UDP
port number [I-D.templin-intarea-omni]. The node then calculates and
sets the UDP Checksum (and for IPv4 clears the DF bit). The node
finally sends the prepared Parcel/Jumbo Report to the original source
of the probe.
Note: This implies that original sources that send IP parcels or
advanced jumbos must be capable of accepting and processing these
OMNI protocol UDP messages. A source that sends IP parcels or
advanced jumbos must therefore implement enough of the OMNI interface
to be able to recognize and process these messages.
5.6. Parcel Path Probing
All parcels also serve as implicit probes and may cause either a
router in the path or the final destination to return an ordinary
ICMP error [RFC0792][RFC4443] and/or Packet Too Big (PTB) message
[RFC1191] [RFC8201] concerning the parcel. A router in the path or
the final destination may also return a "Parcel/Jumbo Report"
(subject to rate limiting per [RFC4443]) as discussed in Section 5.5.
To determine whether parcels can transit at least an initial portion
of the forward path toward the final destination, the original source
can also send IP parcels with the Parcel Payload option PMTU field
set to the most significant 31 bits of the next hop link MTU as an
explicit "Parcel Probe". The probe will cause the final destination
or a router on the path to return a Parcel/Jumbo Report or cause the
final destination to return an ordinary data packet with an "IP Jumbo
Reply MTU" option (see: Section 5.5).
A Parcel Probe can be included either in an ordinary data parcel or a
{TCP,UDP}/IP parcel with destination port set to '9' (discard)
[RFC0863]. The probe will still contain a valid {TCP,UDP} parcel
header Checksum that any intermediate hops as well as the final
destination can use to detect mis-delivery, while the final
destination will process any parcel data in probes with correct
Checksums.
If the original source receives a positive Parcel/Jumbo Report or an
ordinary data packet with an IP Jumbo Reply MTU option, it marks the
path as "parcels supported" and ignores any ordinary ICMP and/or PTB
messages concerning the probe. If the original source instead
receives a negative Jumbo Report or no report/reply, it marks the
path as "parcels not supported" and may regard any ordinary ICMP and/
or PTB messages concerning the probe (or its contents) as indications
of a possible path limitation.
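The source's handling of probe outcomes can be sketched as a simple mapping. The enumerations and function name are hypothetical, and the sketch covers only the path marking described in the paragraph above.

```c
#include <stdbool.h>

enum probe_result {
    PROBE_POSITIVE_REPORT,   /* positive Parcel/Jumbo Report received */
    PROBE_JUMBO_REPLY_MTU,   /* data packet with IP Jumbo Reply MTU option */
    PROBE_NEGATIVE_REPORT,   /* negative Jumbo Report received */
    PROBE_NO_REPLY           /* no report or reply arrived */
};

enum path_state { PATH_PARCELS_SUPPORTED, PATH_PARCELS_NOT_SUPPORTED };

struct probe_verdict {
    enum path_state state;
    bool heed_icmp;   /* may ordinary ICMP/PTB about the probe be heeded? */
};

/* Maps a probe outcome to the path marking kept by the original source. */
struct probe_verdict probe_evaluate(enum probe_result r)
{
    struct probe_verdict v;

    if (r == PROBE_POSITIVE_REPORT || r == PROBE_JUMBO_REPLY_MTU) {
        v.state = PATH_PARCELS_SUPPORTED;
        v.heed_icmp = false;   /* ordinary ICMP/PTB messages are ignored */
    } else {
        v.state = PATH_PARCELS_NOT_SUPPORTED;
        v.heed_icmp = true;    /* may indicate a path limitation */
    }
    return v;
}
```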
The original source can therefore send Parcel Probes in the same IP
parcels used to carry real data. The probes will transit parcel-
capable links joined by routers on the forward path possibly
extending all the way to the destination. If the original source
receives a positive Parcel/Jumbo Report or an ordinary data packet
with an IP Jumbo Reply MTU option, it can continue using IP parcels
after adjusting its segment size if necessary.
The original source sends Parcel Probes unidirectionally in the
forward path toward the final destination to elicit a report/reply,
since it will often be the case that IP parcels are supported only in
the forward path and not in the return path. Parcel Probes may be
dropped in the forward path by any node that does not recognize IP
parcels, but Parcel/Jumbo Reports and/or IP Jumbo Reply MTU options
must be packaged to reduce the risk of return path filtering. For
this reason, the Parcel Payload options included in Parcel Probes and
IP Jumbo Reply MTU options are always packaged as IPv4 header or IPv6
Hop-by-Hop options while Parcel/Jumbo Reports are returned as UDP/IP
encapsulated ICMPv6 PTB messages with a "Parcel/Jumbo Report" Code
value (see: [I-D.templin-intarea-omni]).
Original sources send ordinary parcels or discard parcels as explicit
Parcel Probes by setting the Parcel Payload PMTU to the most-
significant 31 bits of the (non-zero) next hop link MTU. The source
then sets Index, Parcel Payload Length, and {Total, Payload} Length,
then calculates the header and per-segment checksums the same as for
an ordinary parcel. The source finally sends the Parcel Probe via
the outbound IP interface.
Original sources can send Parcel Probes that include a large segment
size, but these may be dropped by a router on the path even if the
next hop link is parcel-capable. The original source may then
receive a Jumbo Report that contains only the MTU of the leading
portion of the path up to the router with the restrictive link. The
original source can instead send Parcel Probes with smaller segments
that would be likely to transit the entire forward path to the final
destination if all links are parcel-capable. For parcel-capable
paths, this may allow the original source to discover both the path
MTU and the MSS in a single message exchange rather than several.
According to [RFC7126], IPv4 middleboxes (i.e., routers, security
gateways, firewalls, etc.) that do not observe this specification
should drop IPv4 packets that contain option type '00001011' ("IPv4
Probe MTU") but some might instead either attempt to implement
[RFC1063] or ignore the option altogether. IPv4 middleboxes that
observe this specification instead MUST process the option as an
implicit or explicit Parcel Probe as specified below.
According to [RFC2675], IPv6 middleboxes (i.e., routers, security
gateways, firewalls, etc.) that recognize the IPv6 Jumbo Payload
option but do not observe this specification should return an ICMPv6
Parameter Problem message (and presumably also drop the packet) due
to validation rules for ordinary jumbograms since the parcel includes
a non-zero IP {Total, Payload} Length. IPv6 middleboxes that observe
this specification instead MUST process the option as an implicit or
explicit Parcel Probe as specified below.
When a router that observes this specification receives an IPv4
Parcel Probe it first compares Code with 255 and Check with the IP
header TTL; if either value differs, the router drops the probe and
returns a negative Jumbo Report subject to rate limiting. For all
other IP Parcel Probes, if the next hop link is non-parcel-capable
the router compares PMTU with the next hop link MTU and returns a
positive Parcel Report subject to rate limiting with MTU set to the
minimum value. If the next hop link configures a sufficiently large
MTU, the router then applies packetization to convert the probe into
individual IP packet(s) and forwards each packet to the next hop;
otherwise, it drops the probe.
If the next hop link both supports parcels and configures an MTU that
is large enough to pass the probe, the router instead compares the
probe PMTU with the next hop link MTU and MUST (re)set PMTU to the
most-significant 31 bits of the minimum value then forward the probe
to the next hop (and for IPv4 first reset Check to the same value
that will appear in the IPv4 header TTL upon transmission to the next
hop). If the next hop link supports parcels but configures an MTU
that is too small to pass the probe, the router then applies
parcellation to break the probe into multiple smaller sub-parcels
that can transit the link. In the process, the router sets PMTU to
the most significant 31 bits of the minimum link MTU value in the
first sub-parcel and sets PMTU to 0 in all non-first sub-parcels (and
for IPv4 resets Check in all sub-parcels). If the next hop link
supports parcels but configures an MTU that is too small to pass a
singleton sub-parcel of the probe, the router instead drops the probe
and returns a positive Jumbo Report subject to rate limiting with MTU
set to the next hop link MTU.
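The router forwarding rules above can be summarized as a decision
function. The following Python sketch is illustrative only and makes
several assumptions that are not part of this specification: the names
route_parcel_probe, Decision and Action are hypothetical; the
"sufficiently large" tests are assumed to compare the next hop link
MTU against the whole probe length and against the length of a
singleton (one-segment) packet or sub-parcel; a negative report is
assumed to carry MTU 0; and the 31-bit PMTU encoding and rate limiting
are omitted.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    DROP = "drop"
    PACKETIZE = "packetize"      # convert to individual IP packets
    FORWARD = "forward"          # forward the probe intact
    PARCELLATE = "parcellate"    # break into smaller sub-parcels


@dataclass
class Decision:
    action: Action
    report: Optional[str] = None       # "positive", "negative" or None
    report_mtu: Optional[int] = None   # MTU carried in the report, if any
    new_pmtu: Optional[int] = None     # PMTU written into the forwarded probe


def route_parcel_probe(pmtu: int, probe_len: int, singleton_len: int,
                       link_mtu: int, link_parcel_capable: bool,
                       ipv4_code_check_ok: bool = True) -> Decision:
    """Hypothetical per-hop decision for a received Parcel Probe."""
    # IPv4 only: Code must be 255 and Check must equal the header TTL.
    if not ipv4_code_check_ok:
        return Decision(Action.DROP, "negative", 0)

    min_mtu = min(pmtu, link_mtu)

    if not link_parcel_capable:
        # Always report; packetize only if single packets can transit.
        action = Action.PACKETIZE if link_mtu >= singleton_len else Action.DROP
        return Decision(action, "positive", min_mtu)

    if link_mtu >= probe_len:
        return Decision(Action.FORWARD, new_pmtu=min_mtu)
    if link_mtu >= singleton_len:
        # First sub-parcel carries min_mtu; non-first sub-parcels carry 0.
        return Decision(Action.PARCELLATE, new_pmtu=min_mtu)
    return Decision(Action.DROP, "positive", link_mtu)
```

For IPv4, a caller of this sketch would also reset Check to the
outgoing TTL whenever the action is FORWARD or PARCELLATE.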
The final destination may therefore receive one or more individual IP
packets or sub-parcels including an intact Parcel Probe. If the
final destination receives individual IP packets, it performs any
necessary integrity checks, applies restoration/GRO if possible then
delivers the (restored) parcel contents to the transport layer. If
the final destination receives an IPv4 Parcel Probe, it first
compares Code with 255 and Check with the IPv4 header TTL; if either
value differs, the final destination drops the probe and returns a
negative Jumbo Report. For all other Parcel Probes, if the {TCP,UDP}
port number is '9' ("discard") the final destination instead returns
a positive Jumbo Report and discards the probe and any of its
associated sub-parcels without applying reunification.
If the final destination receives a Parcel Probe (plus any of its
associated sub-parcels) for any other {TCP,UDP} port number, it
applies reunification and delivers the (reunified) parcel contents to
the transport layer. The destination then arranges to include an IP
Jumbo Reply MTU option in a return data packet/parcel associated with
the flow according to the format shown in Figure 6:
IPv4 Jumbo Reply MTU Option Format
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Option Type | Opt Data Len | Rtn-PMTU |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Path MTU (PMTU) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
IPv6 Jumbo Reply MTU Option Format
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Option Type | Opt Data Len |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Min-PMTU | Rtn-PMTU |0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Path MTU (PMTU) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6: IP Jumbo Reply MTU Option
For IPv4, the destination sets Option Type to '00001100' and Option
Data Length to '00010000'. The destination then sets Rtn-PMTU to the
minimum of 65535 and the value that will appear in the PMTU field.
For IPv6, the destination sets Option Type to '00110000' and Option
Data Length to '00001100'. The destination then sets Min-PMTU to the
minimum of 65535 and the outgoing link MTU and sets Rtn-PMTU to the
most significant 15 bits of the minimum of 65535 and the value that
will appear in PMTU.
For both IP protocol versions, the destination finally sets the
Identification and Path MTU fields to the values received in the
Parcel Probe, then sets other "unused" fields to 0. Note that the
Option Data Length differentiates the options from the "short" forms
of the same Option Types that appear in [RFC1063] and [RFC9268].
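As an illustration of the field settings above, the following Python
sketch packs the two option formats of Figure 6. It is a sketch under
stated assumptions rather than a reference encoder: the function names
are hypothetical, and the IPv6 Rtn-PMTU field is assumed to occupy the
upper 15 bits of a 16-bit word whose low-order bit is the '0' flag
shown in the figure.

```python
import struct

IPV4_REPLY_OPT_TYPE = 0b00001100   # Option Type stated for IPv4
IPV6_REPLY_OPT_TYPE = 0b00110000   # Option Type stated for IPv6


def build_ipv4_jumbo_reply(identification: int, pmtu: int) -> bytes:
    """Type, Len=16, Rtn-PMTU, zero word, Identification, PMTU."""
    rtn_pmtu = min(65535, pmtu)
    return struct.pack("!BBHIII", IPV4_REPLY_OPT_TYPE, 16, rtn_pmtu, 0,
                       identification, pmtu)


def build_ipv6_jumbo_reply(identification: int, pmtu: int,
                           link_mtu: int) -> bytes:
    """Type, Len=12, Min-PMTU, Rtn-PMTU plus 0 bit, Identification, PMTU."""
    min_pmtu = min(65535, link_mtu)
    # Keep the most significant 15 bits; the low-order bit is sent as 0
    # (assumed layout, see the lead-in).
    rtn_pmtu = min(65535, pmtu) & 0xFFFE
    return struct.pack("!BBHHII", IPV6_REPLY_OPT_TYPE, 12, min_pmtu,
                       rtn_pmtu, identification, pmtu)
```

The IPv4 Option Data Length of 16 counts the whole option, while the
IPv6 value of 12 counts only the octets after the two-octet option
header, so the IPv6 encoding is 14 octets in total.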
After sending Parcel Probes (or ordinary parcels) the original source
may therefore receive UDP/IP encapsulated Parcel/Jumbo Reports,
ordinary data packets with IP Jumbo Reply MTU options, and/or
transport layer protocol probe replies. If the source receives a
Parcel/Jumbo Report, it verifies the UDP Checksum then verifies that
the ICMPv6 Checksum is 0. If both Checksum values are correct, the
node then matches the enclosed PTB message with an original probe/
parcel by examining the ICMPv6 "packet in error" containing the
leading portion of the invoking packet. If the "packet in error"
does not match one of its previous packets, the source discards the
Parcel/Jumbo Report; otherwise, it continues to process.
If the source receives a Parcel/Jumbo Report with MTU '0', it marks
the path as "parcels not supported"; otherwise, it marks the path as
"parcels supported" and also records the MTU value as the parcel path
MTU (i.e., the portion of the path up to and including the node that
returned the Parcel/Jumbo Report). If the MTU value is 65535 (plus
headers) or larger, the MTU determines the largest whole parcel that
can transit the path without packetization/parcellation while using
any segment size up to and including the maximum. For Reports that
include a smaller MTU, the value represents both the largest whole
parcel size and a maximum segment size limitation. In that case, the
maximum parcel size that can transit the initial portion of the path
may be larger than the maximum segment size that can continue to
transit the remaining path to the final destination.
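The report handling described above might be sketched as follows.
This Python fragment is a hedged illustration: PathState and
process_report are hypothetical names, and header_len stands for the
"plus headers" allowance, whose exact value is not defined in this
paragraph and is therefore supplied by the caller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PathState:
    parcels_supported: bool
    parcel_path_mtu: Optional[int] = None
    # True when the reported MTU also limits the maximum segment size.
    segment_size_limited: bool = False


def process_report(report_mtu: int, header_len: int) -> PathState:
    """Update path state from the MTU in a matched Parcel/Jumbo Report."""
    if report_mtu == 0:
        return PathState(parcels_supported=False)
    # An MTU of 65535 (plus headers) or more admits any segment size.
    limited = report_mtu < 65535 + header_len
    return PathState(True, report_mtu, limited)
```

The function is meant to run only after the Checksum and "packet in
error" matching steps have succeeded.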
If the source receives an ordinary data packet for the flow that
includes an IP Jumbo Reply MTU option, it examines the Identification
to ensure that the reply matches one of the Parcel Probes it
previously sent for this same data flow. It then records the PMTU
value as the parcel/jumbo path MTU for this flow and marks the path
as "parcels and jumbos supported".
For further discussion on parcel/jumbo probing alternatives, see:
Appendix C.
5.7. Integrity
The {TCP,UDP}/IP header plus each segment of a (multi-segment) IP
parcel includes its own integrity check. This means that IP parcels
can support stronger and more discrete integrity checks for the same
amount of transport layer protocol data compared to an individual IP
packet or jumbogram. The {TCP,UDP} Checksum header integrity check
can be verified at each hop to ensure that parcels with errored
headers are detected. The per-segment Integrity Block Checksums are
set by the source and verified by the final destination, noting that
TCP parcels must honor the sequence number discipline discussed in
Section 4.1.
IP parcels can range in length from as small as only the {TCP,UDP}/IP
headers plus a single Integrity Block Checksum with a single segment
to as large as the headers plus (256 * 65535) octets. Although link
layer integrity checks such as CRC-32 provide sufficient protection
for contiguous data blocks up to approximately 9KB, reliance on link-
layer integrity checks may be inadvisable for links with
significantly larger MTUs and may not be possible at all for links
such as tunnels over IPv4 that invoke fragmentation. Moreover, the
segment contents of a received parcel may arrive in an incomplete
and/or rearranged order with respect to their original packaging.
Each network layer forwarding hop as well as the final destination
should verify the {TCP,UDP}/IP Checksum at its layer, since an
errored header could result in mis-delivery. If a network layer
protocol entity on the path detects an incorrect {TCP,UDP}/IP
Checksum it should discard the entire IP parcel unless the header(s)
can somehow first be repaired by lower layers.
To support the parcel header checksum calculation, the network layer
uses modified versions of the {TCP,UDP}/IPv4 "pseudo-header" found in
[RFC0768][RFC9293], or the {TCP,UDP}/IPv6 "pseudo-header" found in
Section 8.1 of [RFC8200]. Note that while the contents of the two IP
protocol version-specific pseudo-headers beyond the address fields
are the same, the order in which the contents are arranged differs
and must be honored according to the specific IP protocol version as
shown in Figure 7. This allows for maximum reuse of widely deployed
code while ensuring interoperability.
IPv4 Parcel Pseudo-Header
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| IPv4 Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| IPv4 Destination Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| zero | Next Header | Segment Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Index | Parcel Payload Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
IPv6 Parcel Pseudo-Header
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
~ IPv6 Source Address ~
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
~ IPv6 Destination Address ~
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Index | Parcel Payload Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Segment Length | zero | Next Header |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 7: {TCP,UDP}/IP Parcel Pseudo-Header Formats
where the following fields appear in both pseudo-headers:
* Source Address is the 4-octet IPv4 or 16-octet IPv6 source address
of the prepared parcel.
* Destination Address is the 4-octet IPv4 or 16-octet IPv6
destination address of the prepared parcel.
* zero encodes the constant value '0'.
* Next Header is the IP protocol number corresponding to the
transport layer protocol, i.e., TCP or UDP.
* Segment Length is the value that appears in the IP {Total,
Payload} Length field of the prepared parcel.
* Index is the 1-octet value that appears in the Parcel Payload
Option field of the same name.
* Parcel Payload Length is the 3-octet value that appears in the
Parcel Payload Option field of the same name.
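The field ordering in Figure 7 can be made concrete with a short
Python sketch. The function names are hypothetical, and
internet_checksum is the ordinary 16-bit ones' complement sum used by
TCP and UDP; which octets beyond the pseudo-header are fed into the
checksum is governed by the rest of this specification and is not
decided here.

```python
import ipaddress
import struct


def ipv4_parcel_pseudo_header(src: str, dst: str, next_header: int,
                              segment_length: int, index: int,
                              parcel_payload_length: int) -> bytes:
    """Addresses, then zero/Next Header/Segment Length, then Index/PPL."""
    return (ipaddress.IPv4Address(src).packed
            + ipaddress.IPv4Address(dst).packed
            + struct.pack("!BBH", 0, next_header, segment_length)
            + bytes([index]) + parcel_payload_length.to_bytes(3, "big"))


def ipv6_parcel_pseudo_header(src: str, dst: str, next_header: int,
                              segment_length: int, index: int,
                              parcel_payload_length: int) -> bytes:
    """Addresses, then Index/PPL, then Segment Length/zero/Next Header."""
    return (ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed
            + bytes([index]) + parcel_payload_length.to_bytes(3, "big")
            + struct.pack("!HBB", segment_length, 0, next_header))


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of data."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a 16-bit boundary
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

The same four values appear after the addresses in both versions;
only their order differs, so an implementation can share the summing
code and vary only the packing step.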
Transport layer protocol entities coordinate per-segment checksum
processing with the network layer using a control mechanism such as a
socket option. If the transport layer sets a SO_NO_CHECK(TX) socket
option, the transport layer is responsible for supplying per-segment
checksums on transmission and the network layer forwards the IP
parcel to the next hop without further processing; otherwise, the
network layer supplies the per-segment checksums before forwarding.
If the transport layer sets a SO_NO_CHECK(RX) socket option, the
transport layer is responsible for verifying per-segment checksums on
reception and the network layer delivers each received parcel body to
the transport layer without further processing; otherwise, the
network layer verifies the per-segment parcel checksums before
delivering.
When the transport layer protocol entity of the source delivers a
parcel body to the network layer, it prepends an Integrity Block of
(J + 1) 2-octet Checksum fields and includes a 4-octet Sequence
Number field with each TCP non-first segment. If the SO_NO_CHECK(TX)
socket option is set, the transport layer protocol either calculates
each segment checksum and writes the value into the corresponding
Checksum field (and for UDP with '0' values written as 'ffff') or
writes the value '0' to disable specific UDP segment checksums. If
the SO_NO_CHECK(TX) socket option is clear, for UDP the transport
layer instead writes the value '0' to disable or any non-zero value
to enable checksums for specific segments (for TCP, the transport
layer instead writes any value).
When the network layer of the source accepts the parcel body from the
transport layer protocol entity, if the SO_NO_CHECK(TX) socket option
is set the network layer appends the {TCP,UDP}/IP headers and
forwards the parcel to the next hop without further processing. If
the SO_NO_CHECK(TX) socket option is clear, the network layer instead
calculates the checksum for each TCP segment (or each UDP segment
with a non-zero value in the corresponding Integrity Block Checksum
field) and overwrites the calculated value into the Checksum field
(and for UDP with '0' values written as 'ffff').
When the network layer of the destination receives a parcel from the
source, if the SO_NO_CHECK(RX) socket option is set the network layer
delivers the parcel body to the transport layer protocol entity
without further processing, and the transport layer is responsible
for per-segment checksum verification. If the SO_NO_CHECK(RX) socket
option is clear, the network layer instead verifies the checksum for
each TCP segment (or each UDP segment with a non-zero value in the
corresponding Integrity Block Checksum field) and marks a
corresponding flag for the segment in an ancillary data structure as
either "correct" or "incorrect". (For UDP, if the Checksum is '0'
the network layer unconditionally marks the segment as "correct".)
The network layer then delivers both the parcel body (beginning with
the Integrity Block) and ancillary data to the transport layer, which
can then determine which segments have correct/incorrect checksums.
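The division of labor between the transport and network layers can be
sketched in Python as shown below. This is a simplified model under
stated assumptions: the function names are hypothetical, "compute" is
a caller-supplied stand-in for the per-segment checksum routine
(whose coverage is specified elsewhere in this document), and
verification is modeled as recomputing the checksum and comparing it
with the received value.

```python
from typing import Callable, List, Optional, Sequence


def wire_checksum(value: int, is_udp: bool) -> int:
    """UDP reserves 0 for 'no checksum'; a computed 0 is sent as 0xffff."""
    return 0xFFFF if is_udp and value == 0 else value


def network_layer_tx(integrity_block: Sequence[int],
                     segments: Sequence[bytes], is_udp: bool,
                     so_no_check_tx: bool,
                     compute: Callable[[bytes], int]) -> List[int]:
    """Return the Integrity Block Checksums as they leave the source."""
    if so_no_check_tx:
        return list(integrity_block)    # transport layer supplied them
    out = []
    for marker, seg in zip(integrity_block, segments):
        if is_udp and marker == 0:
            out.append(0)               # checksum disabled for this segment
        else:
            out.append(wire_checksum(compute(seg), is_udp))
    return out


def network_layer_rx(integrity_block: Sequence[int],
                     segments: Sequence[bytes], is_udp: bool,
                     so_no_check_rx: bool,
                     compute: Callable[[bytes], int]) -> Optional[List[bool]]:
    """Return per-segment correct/incorrect flags, or None if unverified."""
    if so_no_check_rx:
        return None                     # transport layer verifies instead
    flags = []
    for received, seg in zip(integrity_block, segments):
        if is_udp and received == 0:
            flags.append(True)          # unconditionally "correct"
        else:
            flags.append(wire_checksum(compute(seg), is_udp) == received)
    return flags
```

The list returned by network_layer_rx plays the role of the ancillary
data structure delivered to the transport layer alongside the parcel
body.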
Note: The Integrity Block itself is intentionally omitted from the IP
Parcel {TCP,UDP} header checksum calculation. This permits
destinations to accept as many intact segments as possible from
received parcels with checksum block bit errors, whereas the entire
parcel would need to be discarded if the header checksum also covered
the Integrity Block.
6. Advanced Jumbos
This specification introduces an IP "advanced jumbo" service as an
alternative to basic IPv6 jumbograms that also includes a path
probing function based on the mechanisms specified in Section 5.6.
The function employs an "Advanced Jumbo Option" with the same Option
Type and Option Data Length values as for the Parcel Payload option,
but with the Index and Parcel Payload Length fields converted to a
single 32-bit Jumbo Payload Length field as shown in Figure 8:
IPv4 Advanced Jumbo Option Format
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Option Type | Opt Data Len | Code | Check |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Jumbo Payload Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Path MTU (PMTU) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
IPv6 Advanced Jumbo Option Format
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Option Type | Opt Data Len |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Jumbo Payload Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Path MTU (PMTU) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 8: Advanced Jumbo Option
The source prepares an advanced jumbo by first setting the IP {Total,
Payload} Length field to the special "Type" value '1' to distinguish
this from a basic jumbogram or parcel. The source can begin by
sending a "Jumbo Probe" to pre-qualify the path for advanced jumbos
if necessary.
To prepare a Jumbo Probe that will trigger a Jumbo Report, the source
can set {Protocol, Next Header} to {TCP,UDP}, set the {TCP,UDP} port
to '9' (discard) and either include no octets beyond the {TCP,UDP}
header or a single discard segment of the desired probe size
immediately following the header and with no Integrity Block
included. (The source can instead set the {TCP,UDP} port to the port
number for a current data flow in order to receive IP Jumbo Reply MTU
options in return packets as discussed in Section 5.6.) The source
then sets Jumbo Payload Length to the length of the {TCP,UDP} header
plus the length of the discard segment plus the length of the full IP
header for IPv4 or the extension headers for IPv6.
The source next sets Identification the same as for an IP Parcel
Probe and sets the Jumbo Probe PMTU to the next hop link MTU. For IPv4,
the source also sets Code to 255 and Check to the next hop TTL. The
source then calculates the {TCP,UDP} Checksum based on the same
pseudo header as for an ordinary parcel (see: Figure 7) but with the
Index and Parcel Payload Length fields replaced with a 32-bit Jumbo
Payload Length field and with the Segment Length replaced with the
Type value '1'. The source then calculates the checksum over the
pseudo header then continues the calculation over the entire length
of the probe segment. The source then sends the Jumbo Probe via the
next hop link toward the final destination.
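The pseudo header substitution described above can be illustrated
with a brief Python sketch. The function name jumbo_pseudo_header is
hypothetical, and the placement of the replaced fields is assumed to
follow the Figure 7 layouts exactly, with the 32-bit Jumbo Payload
Length occupying the former Index and Parcel Payload Length word and
the Type value '1' occupying the former Segment Length field.

```python
import ipaddress
import struct

ADVANCED_JUMBO_TYPE = 1   # special "Type" value in {Total, Payload} Length


def jumbo_pseudo_header(src: str, dst: str, next_header: int,
                        jumbo_payload_length: int) -> bytes:
    """Build the advanced jumbo pseudo header for IPv4 or IPv6."""
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    if s.version != d.version:
        raise ValueError("source and destination versions differ")
    if s.version == 4:
        # zero | Next Header | Type, then Jumbo Payload Length
        tail = struct.pack("!BBHI", 0, next_header,
                           ADVANCED_JUMBO_TYPE, jumbo_payload_length)
    else:
        # Jumbo Payload Length, then Type | zero | Next Header
        tail = struct.pack("!IHBB", jumbo_payload_length,
                           ADVANCED_JUMBO_TYPE, 0, next_header)
    return s.packed + d.packed + tail
```

The checksum calculation then starts over these octets and continues
over the entire length of the probe segment.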
At each IPv4 forwarding hop, the router examines Code and Check and
drops the probe and returns a negative Jumbo Report if either value
is incorrect. If both values are correct, and if the next hop link
is jumbo-capable, the router compares PMTU to the next hop link MTU,
resets PMTU to the minimum value (and for IPv4 sets Check to the next
hop TTL) then forwards the probe to the next hop. If the next hop
link is not jumbo-capable, the router instead drops the probe and
returns a negative Jumbo Report.
If the Jumbo Probe encounters an OMNI link, the OAL source can either
drop the probe and return a negative Jumbo Report or forward the
probe further toward the OAL destination using adaptation layer
encapsulation. If the OAL source already knows the OAL path MTU for
this OAL destination, it can encapsulate and forward the Jumbo Probe
with PMTU set to the minimum of itself and the known value (minus the
adaptation layer header size), and without adding any padding octets.
If the OAL path MTU is unknown, the OAL source can instead
encapsulate the Jumbo Probe in an adaptation layer IPv6 header with a
Jumbo Payload option and with NULL padding octets added beyond the
end of the encapsulated Jumbo Probe to form an adaptation layer
jumbogram no larger than the minimum of PMTU and (2**24 - 1) octets
(minus the adaptation layer header size) as a form of "jumbo-in-
jumbo" encapsulation.
The OAL source then writes this size into the Jumbo Probe PMTU field
and forwards the newly-created adaptation layer jumbogram toward the
OAL destination. If the jumbogram somehow transits the path, the OAL
destination then removes the adaptation layer encapsulation, discards
the padding, then forwards the probe onward toward the final
destination (with each hop reducing PMTU if necessary).
When a router on the path forwards a Jumbo Probe, it drops the probe and
returns a Jumbo Report if the next hop MTU is insufficient;
otherwise, it forwards to the next hop toward the final destination.
When the final destination receives the Jumbo Probe, it returns a
Jumbo Report with the PMTU set to the maximum-sized jumbo that can
transit the path.
When the Jumbo Probe reaches the final destination, the destination
first examines the {TCP,UDP} port number. If the port number is
"discard", the destination returns a Jumbo Report UDP message;
otherwise, the destination prepares an IP Jumbo Reply MTU option to
include on a data packet on the return path to the original source.
Detailed descriptions for these processes are found in Section 5.6.
After successfully probing the path, the original source can begin
sending regular advanced jumbos by setting the IP {Total, Payload}
Length field to the special Type value '1', setting PMTU to 0, then
calculating the Checksum the same as described for probes above.
When the final destination receives an advanced jumbo, it first
verifies the Checksum then delivers the data to the transport layer
without returning a Jumbo Report. The source can continue to send
advanced jumbos into the path with the possibility that the path may
change. In that case, a router in the network may return an ICMP
error, an ICMPv6 PTB, or a Jumbo Report if the path MTU decreases.
Note: If the OAL source can in some way determine that a very large
packet is likely to transit the OAL path, it can encapsulate a Jumbo
Probe to form an adaptation layer jumbogram larger than (2**24 - 1)
octets with the understanding that the time required to transit the
path determines acceptable jumbogram sizes.
Note: The Jumbo Report message types returned in response to both
Parcel and Jumbo Probes are one and the same, and signify that both
parcels and advanced jumbos at least as large as the reported MTU can
transit the path.
7. Minimal IP Parcels and Jumbograms
Minimal IP parcels and advanced jumbos are distinguished from regular
parcels and advanced jumbos by including the same Option Type value
as specified above, but with an Option Data Length of '00000100' for
IPv6 or '00001000' for IPv4. These minimal forms provide the benefit
of reducing the IP option length by 8 octets at the expense of
omitting the Identification and PMTU values.
Minimal advanced jumbos also include a Type value of '1' in the IP
{Total, Payload} Length field, while basic IPv6 jumbograms with
Payload Length of 0 are processed per [RFC2675]. (IPv4 packets with
Total Length of 0 are undefined and must be dropped.)
The option formats for IPv4 are shown in Figure 9 and the option
formats for IPv6 are shown in Figure 10.
Minimal IPv4 Parcel Format
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Option Type | Opt Data Len | Code | Check |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Index | Parcel Payload Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Minimal IPv4 Jumbogram Format
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Option Type | Opt Data Len | Code | Check |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Jumbo Payload Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 9: Minimal Parcel/Jumbogram for IPv4
Minimal IPv6 Parcel Format
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Option Type | Opt Data Len |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Index | Parcel Payload Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Minimal IPv6 Jumbogram Format
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Option Type | Opt Data Len |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Jumbo Payload Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 10: Minimal Parcel/Jumbogram for IPv6
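A receiver can tell the minimal and expanded forms apart from the
Option Data Length alone. The following Python sketch illustrates
this; option_form and parse_minimal_ipv4_parcel are hypothetical
names, and the expanded lengths (16 for IPv4, 12 for IPv6) are an
assumption inferred from the Figure 8 layouts and from the stated
8-octet reduction.

```python
import struct
from typing import Dict, Tuple

MINIMAL_LEN: Dict[int, int] = {4: 0b00001000, 6: 0b00000100}
# Assumed expanded lengths: the minimal length plus 8 octets.
EXPANDED_LEN: Dict[int, int] = {4: 16, 6: 12}


def option_form(ip_version: int, opt_data_len: int) -> str:
    """Classify a parcel/jumbo option by its Option Data Length."""
    if opt_data_len == MINIMAL_LEN[ip_version]:
        return "minimal"
    if opt_data_len == EXPANDED_LEN[ip_version]:
        return "expanded"
    return "unknown"


def parse_minimal_ipv4_parcel(option: bytes) -> Tuple[int, int, int, int]:
    """Return (Code, Check, Index, Parcel Payload Length)."""
    opt_type, opt_len, code, check, index = struct.unpack("!BBBBB", option[:5])
    if option_form(4, opt_len) != "minimal" or len(option) != opt_len:
        raise ValueError("not a minimal IPv4 parcel option")
    return code, check, index, int.from_bytes(option[5:8], "big")
```

Because the minimal forms carry no Identification or Path MTU field,
the parser returns only the fields shown in Figure 9.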
The original source can send minimal parcels or advanced jumbos after
successfully probing a path to confirm that it can transit a given
size over its entire length to the final destination. Minimal
parcels and advanced jumbos use a reduced-length IP option that omits
the Identification and Path MTU fields and therefore cannot transit a
router that performs packetization/parcellation.
End systems and routers process minimal parcels the same as for
expanded parcels as specified in previous sections. If a router
needs to drop a minimal parcel, it returns a Parcel/Jumbo Report
(subject to rate limiting) the same as for an expanded parcel, noting
that the encapsulated parcel body will not contain an Identification
and Path MTU field.
End systems and routers process minimal advanced jumbos with Type
value '1' in the IP {Total, Payload} Length field the same as for
expanded advanced jumbos as specified in Section 6. If a router
needs to drop a minimal advanced jumbo, it returns a Jumbo Report
(subject to rate limiting) the same as for an expanded advanced
jumbo.
End systems and routers process basic IPv6 jumbograms with the value
'0' in the IPv6 payload length field the same as specified in
[RFC2675]. End systems and routers silently discard all IPv4
jumbograms with the value '0' in the IPv4 Total Length field, as no
basic jumbogram service is defined for IPv4.
Note: If the path changes, routers in the path may cease forwarding
minimal parcels and advanced jumbos and begin returning ICMP errors,
ICMP PTBs and/or Parcel/Jumbo Reports. According to the network
trust model, the original source may then elect to re-probe to
determine whether the path MTU has been reduced and/or whether the
path can still support parcels/jumbos at all.
8. Implementation Status
Common widely-deployed implementations include services such as TCP
Segmentation Offload (TSO) and Generic Segmentation/Receive Offload
(GSO/GRO). These services provide robust segmentation offload that
has been shown to improve performance in many instances.
UDP/IPv4 parcels have been implemented in the linux-5.10.67 kernel
and ION-DTN ion-open-source-4.1.0 source distributions. The patch
distribution is available at: "https://github.com/fltemplin/ip-parcels.git".
Performance analysis with a single-threaded receiver has shown that
including increasing numbers of segments in a single parcel produces
measurable performance gains over fewer numbers of segments due to
more efficient packaging and reduced system calls/interrupts. For
example, sending parcels with 30 2000-octet segments shows a 48%
performance increase in comparison with ordinary IP packets with a
single 2000-octet segment.
Since performance is strongly bounded by single-segment receiver
processing time (with larger segments producing dramatic performance
increases), it is expected that parcels with increasing numbers of
segments will provide a performance multiplier on multi-threaded
receivers in parallel processing environments.
9. IANA Considerations
The IANA is instructed to change the "MTUP - MTU Probe" entry in the
'ip option numbers' registry to the "JUMBO - IPv4 Jumbo Payload"
option. The Copy and Class fields must both be set to 0, and the
Number and Value fields must both be set to '11'. The reference must
be changed to this document [RFCXXXX].
The IANA is instructed to create and maintain a new registry entitled
"IP Jumbogram Types". For IP packets that include a Jumbo Payload
Option, the IP {Total, Payload} Length field encodes a "Jumbo Type"
value instead of an ordinary length. Initial values are given below:
Value      Jumbo Type                    Reference
---------  ----------------------------  ---------
0          Basic Jumbogram (IPv6 only)   [RFC2675]
1          Advanced Jumbo                [RFCXXXX]
2-509      Unassigned                    [RFCXXXX]
510        Reserved for Experimentation  [RFCXXXX]
511        Reserved by IANA              [RFCXXXX]
512-65535  IP Parcel                     [RFCXXXX]
Figure 11: IP Jumbogram Types
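The registry ranges above map directly onto a lookup on the 16-bit IP
{Total, Payload} Length value of a packet that carries a Jumbo Payload
Option. The Python sketch below shows one way to express that lookup;
the function name jumbo_type is hypothetical.

```python
def jumbo_type(length_field: int) -> str:
    """Map the IP {Total, Payload} Length value to its Jumbo Type."""
    if not 0 <= length_field <= 65535:
        raise ValueError("length field must fit in 16 bits")
    if length_field == 0:
        return "Basic Jumbogram (IPv6 only)"
    if length_field == 1:
        return "Advanced Jumbo"
    if length_field <= 509:
        return "Unassigned"
    if length_field == 510:
        return "Reserved for Experimentation"
    if length_field == 511:
        return "Reserved by IANA"
    return "IP Parcel"          # 512-65535
```

Values of 512 and above are all treated as IP parcels.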
10. Security Considerations
In the control plane, original sources match any identifying
information in received Parcel/Jumbo Reports and IP Jumbo Reply MTU
options with their corresponding probes. If the information matches,
the report is likely authentic. In environments where stronger
authentication is necessary, nodes that send Parcel and/or Jumbo
Reports can apply the message authentication services specified for
AERO/OMNI.
In the data plane, multi-layer security solutions may be needed to
ensure confidentiality, integrity and availability. Since parcels
and advanced jumbos are defined only for TCP and UDP, IPsec-AH/ESP
[RFC4301] cannot be applied in transport mode although they can
certainly be used in tunnel mode at lower layers such as for
transmission of parcels and advanced jumbos over OMNI link secured
spanning trees, VPNs, etc. Since the network layer does not
manipulate transport layer segments, parcels and advanced jumbos do
not interfere with transport or higher-layer security services such
as (D)TLS/SSL [RFC8446] which may provide greater flexibility in some
environments.
Further security considerations related to IP parcels are found in
the AERO/OMNI specifications.
11. Acknowledgements
This work was inspired by ongoing AERO/OMNI/DTN investigations. The
concepts were further motivated through discussions with colleagues.
A considerable body of work over recent years has produced useful
"segmentation offload" facilities available in widely-deployed
implementations.
With the advent of networked storage, big data, streaming media and
other high data rate uses, Internetworking has evolved since its early
days to accommodate the need for improved performance. The need
fostered a concerted effort in the industry to pursue performance
optimizations at all layers that continues in the modern era. All
who supported and continue to support advances in Internetworking
performance are acknowledged.
The following individuals are acknowledged for their contributions:
Scott Burleigh, Madhuri Madhava Badgandi, Bhargava Raman Sai Prakash.
12. References
12.1. Normative References
[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768,
DOI 10.17487/RFC0768, August 1980,
<https://www.rfc-editor.org/info/rfc768>.
[RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791,
DOI 10.17487/RFC0791, September 1981,
<https://www.rfc-editor.org/info/rfc791>.
[RFC0792] Postel, J., "Internet Control Message Protocol", STD 5,
RFC 792, DOI 10.17487/RFC0792, September 1981,
<https://www.rfc-editor.org/info/rfc792>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC2675] Borman, D., Deering, S., and R. Hinden, "IPv6 Jumbograms",
RFC 2675, DOI 10.17487/RFC2675, August 1999,
<https://www.rfc-editor.org/info/rfc2675>.
[RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing
Architecture", RFC 4291, DOI 10.17487/RFC4291, February
2006, <https://www.rfc-editor.org/info/rfc4291>.
[RFC4443] Conta, A., Deering, S., and M. Gupta, Ed., "Internet
Control Message Protocol (ICMPv6) for the Internet
Protocol Version 6 (IPv6) Specification", STD 89,
RFC 4443, DOI 10.17487/RFC4443, March 2006,
<https://www.rfc-editor.org/info/rfc4443>.
[RFC7323] Borman, D., Braden, B., Jacobson, V., and R.
Scheffenegger, Ed., "TCP Extensions for High Performance",
RFC 7323, DOI 10.17487/RFC7323, September 2014,
<https://www.rfc-editor.org/info/rfc7323>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6
(IPv6) Specification", STD 86, RFC 8200,
DOI 10.17487/RFC8200, July 2017,
<https://www.rfc-editor.org/info/rfc8200>.
[RFC9293] Eddy, W., Ed., "Transmission Control Protocol (TCP)",
STD 7, RFC 9293, DOI 10.17487/RFC9293, August 2022,
<https://www.rfc-editor.org/info/rfc9293>.
12.2. Informative References
[BIG-TCP] Dumazet, E., "BIG TCP, Netdev 0x15 Conference (virtual),
https://netdevconf.info/0x15/session.html?BIG-TCP", 31
August 2021.
[I-D.ietf-6man-hbh-processing]
Hinden, R. M. and G. Fairhurst, "IPv6 Hop-by-Hop Options
Processing Procedures", Work in Progress, Internet-Draft,
draft-ietf-6man-hbh-processing-06, 11 March 2023,
<https://datatracker.ietf.org/doc/html/draft-ietf-6man-
hbh-processing-06>.
[I-D.templin-dtn-ltpfrag]
Templin, F., "LTP Fragmentation", Work in Progress,
Internet-Draft, draft-templin-dtn-ltpfrag-09, 25 July
2022, <https://datatracker.ietf.org/doc/html/draft-
templin-dtn-ltpfrag-09>.
[I-D.templin-intarea-aero]
Templin, F., "Automatic Extended Route Optimization
(AERO)", Work in Progress, Internet-Draft, draft-templin-
intarea-aero-27, 23 February 2023,
<https://datatracker.ietf.org/doc/html/draft-templin-
intarea-aero-27>.
[I-D.templin-intarea-omni]
Templin, F., "Transmission of IP Packets over Overlay
Multilink Network (OMNI) Interfaces", Work in Progress,
Internet-Draft, draft-templin-intarea-omni-27, 23 February
2023, <https://datatracker.ietf.org/doc/html/draft-
templin-intarea-omni-27>.
[QUIC] Ghedini, A., "Accelerating UDP packet transmission for
QUIC, https://blog.cloudflare.com/accelerating-udp-packet-
transmission-for-quic/", 8 January 2020.
[RFC0863] Postel, J., "Discard Protocol", STD 21, RFC 863,
DOI 10.17487/RFC0863, May 1983,
<https://www.rfc-editor.org/info/rfc863>.
[RFC1063] Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP
MTU discovery options", RFC 1063, DOI 10.17487/RFC1063,
July 1988, <https://www.rfc-editor.org/info/rfc1063>.
[RFC1071] Braden, R., Borman, D., and C. Partridge, "Computing the
Internet checksum", RFC 1071, DOI 10.17487/RFC1071,
September 1988, <https://www.rfc-editor.org/info/rfc1071>.
[RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
DOI 10.17487/RFC1191, November 1990,
<https://www.rfc-editor.org/info/rfc1191>.
[RFC4301] Kent, S. and K. Seo, "Security Architecture for the
Internet Protocol", RFC 4301, DOI 10.17487/RFC4301,
December 2005, <https://www.rfc-editor.org/info/rfc4301>.
[RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU
Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007,
<https://www.rfc-editor.org/info/rfc4821>.
[RFC5326] Ramadas, M., Burleigh, S., and S. Farrell, "Licklider
Transmission Protocol - Specification", RFC 5326,
DOI 10.17487/RFC5326, September 2008,
<https://www.rfc-editor.org/info/rfc5326>.
[RFC7126] Gont, F., Atkinson, R., and C. Pignataro, "Recommendations
on Filtering of IPv4 Packets Containing IPv4 Options",
BCP 186, RFC 7126, DOI 10.17487/RFC7126, February 2014,
<https://www.rfc-editor.org/info/rfc7126>.
[RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
"Path MTU Discovery for IP version 6", STD 87, RFC 8201,
DOI 10.17487/RFC8201, July 2017,
<https://www.rfc-editor.org/info/rfc8201>.
[RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol
Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
<https://www.rfc-editor.org/info/rfc8446>.
[RFC8899] Fairhurst, G., Jones, T., Tüxen, M., Rüngeler, I., and T.
Völker, "Packetization Layer Path MTU Discovery for
Datagram Transports", RFC 8899, DOI 10.17487/RFC8899,
September 2020, <https://www.rfc-editor.org/info/rfc8899>.
[RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
Multiplexed and Secure Transport", RFC 9000,
DOI 10.17487/RFC9000, May 2021,
<https://www.rfc-editor.org/info/rfc9000>.
[RFC9171] Burleigh, S., Fall, K., and E. Birrane, III, "Bundle
Protocol Version 7", RFC 9171, DOI 10.17487/RFC9171,
January 2022, <https://www.rfc-editor.org/info/rfc9171>.
[RFC9268] Hinden, R. and G. Fairhurst, "IPv6 Minimum Path MTU Hop-
by-Hop Option", RFC 9268, DOI 10.17487/RFC9268, August
2022, <https://www.rfc-editor.org/info/rfc9268>.
Appendix A. TCP Extensions for High Performance
TCP Extensions for High Performance are specified in [RFC7323], which
updates earlier work that began in the late 1980's and early 1990's.
These efforts determined that the TCP 16-bit Window was too small to
accommodate sustained transmission at high data rates and devised a
TCP Window Scale option to allow window sizes up to 2^30. The work
also defined a Timestamp option used for round-trip time measurements
and as a Protection Against Wrapped Sequences (PAWS) at high data
rates. TCP users of IP parcels are strongly encouraged to adopt
these measures.
Since TCP/IP parcels only include control bits for the first segment
("segment(0)"), nodes must regard all other segments of the same
parcel as data segments. When a node breaks a TCP/IP parcel out into
individual packets or sub-parcels, only the first packet/sub-parcel
contains the original segment(0) and therefore only its TCP header
retains the control bit settings from the original parcel TCP header.
If the original TCP header included TCP options such as Maximum
Segment Size (MSS), Window Scale (WS) and/or Timestamp, the node
copies those same options into the options section of the new TCP
header.
For all other packets/sub-parcels, the node sets all TCP header
control bits to '0' as data segment(s). Then, if the original parcel
contained a Timestamp option, the node copies the Timestamp option
into the options section of the new TCP header. Appendix A of
[RFC7323] provides implementation guidelines for the Timestamp option
layout.
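The header handling described above can be sketched as follows. This
is an illustrative model only: the TcpHeader class, its field names
and the sub_parcel_headers() function are hypothetical, options are
modeled as a simple name-to-bytes mapping, and sequence number
handling is deliberately omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TcpHeader:
    """Hypothetical, simplified view of a TCP header."""
    flags: int = 0  # TCP control bits
    options: Dict[str, bytes] = field(default_factory=dict)


def sub_parcel_headers(original: TcpHeader, count: int) -> List[TcpHeader]:
    """Derive TCP headers for `count` packets/sub-parcels of one parcel."""
    headers: List[TcpHeader] = []
    for index in range(count):
        if index == 0:
            # The first packet/sub-parcel carries segment(0): it retains
            # the original control bits and a copy of the original options.
            headers.append(TcpHeader(original.flags, dict(original.options)))
        else:
            # All others are data segments: control bits are cleared and
            # only the Timestamp option (if present) is copied.
            options: Dict[str, bytes] = {}
            if "Timestamp" in original.options:
                options["Timestamp"] = original.options["Timestamp"]
            headers.append(TcpHeader(0, options))
    return headers
```

In this sketch, only the first returned header retains MSS or Window
Scale, while every header carries the Timestamp option when the
original parcel included one.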
Appendix A of [RFC7323] also discusses Interactions with the TCP
Urgent Pointer as follows: "if the Urgent Pointer points beyond the
end of the TCP data in the current segment, then the user will remain
in urgent mode until the next TCP segment arrives. That segment will
update the Urgent Pointer to a new offset, and the user will never
have left urgent mode". In the case of IP parcels, however, it will
often be the case that the "next TCP segment" is included in the same
(sub-)parcel as the segment that contained the urgent pointer such
that the urgent pointer can be updated immediately.
Finally, if the parcel contains more than 65535 octets of data (i.e.,
spread across multiple segments), then the Urgent Pointer can be
regarded in the same manner as for jumbograms as described in
Section 5.2 of [RFC2675].
Appendix B. Extreme L Value Implications
For each parcel, the transport layer can specify any L value between
512 and 65535 octets. Transport protocols that send isolated control
and/or data segments smaller than 512 octets should package them as
ordinary packets or as the final segment of a parcel. Transport
protocol streams therefore often include a mix of (larger) parcels
and (smaller) ordinary packets.
The transport layer should also specify an L value no larger than can
accommodate the maximum-sized transport and network layer headers
that the source will include without causing a single segment plus
headers to exceed 65535 octets. For example, if the source will
include a 28 octet TCP header plus a 40 octet IPv6 header with 24
extension header octets (plus a 2 octet per-segment checksum) the
transport should specify an L value no larger than (65535 - 28 - 40 -
24 - 2) = 65441 octets.
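The arithmetic in this example generalizes to any header combination.
The sketch below is an illustration under stated assumptions: the
function and parameter names are hypothetical, the 2 octet per-segment
checksum is taken from the example above, and the 512 octet lower
bound comes from the minimum L value stated at the start of this
appendix.

```python
MAX_SEGMENT_PLUS_HEADERS = 65535  # limit for one segment plus headers
MIN_L = 512                       # smallest L value for a parcel


def recommended_max_l(transport_hdr: int, ip_hdr: int,
                      ext_hdrs: int = 0, checksum: int = 2) -> int:
    """Largest L (in octets) that keeps one segment plus its headers
    within 65535 octets. All arguments are lengths in octets."""
    limit = (MAX_SEGMENT_PLUS_HEADERS - transport_hdr - ip_hdr
             - ext_hdrs - checksum)
    if limit < MIN_L:
        raise ValueError("headers leave no room for a valid L value")
    return limit


# The example from the text: 28 octet TCP header, 40 octet IPv6 header,
# 24 extension header octets and a 2 octet per-segment checksum.
print(recommended_max_l(28, 40, 24))
```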
The transport can specify still larger "extreme" L values up to 65535
octets, but the resulting parcels might be lost along some paths with
unpredictable results. For example, a parcel with an extreme L value
set as large as 65535 might be able to transit paths that can pass
jumbograms natively but might not be able to transit a path that
includes non-jumbo links. The transport layer should therefore
carefully consider the benefits of constructing parcels with extreme
L values larger than the recommended maximum due to high risk of loss
compared with only minor potential performance benefits.
Parcels that combine extreme L values larger than the recommended
maximum with the maximum number of included segments could also
exceed 16,777,215 (2**24 - 1) octets in total length. Since the
Parcel Payload Length field is limited to 24 bits, however, the
largest possible parcel is also limited to this size.
See also the above risk/benefit analysis for parcels that include
extreme L values larger than the recommended maximum.
Appendix C. Additional Parcel/Jumbo Probe Considerations
When the source sends a Parcel/Jumbo Probe, it sets the PMTU field to
the most significant 31 bits of the MTU of the next hop link and each
hop along the way may further reduce this size. This may cause the
source to underestimate the path MTU by at most one octet. The
source can then use "common sense" to determine whether the MTU was
underestimated; for example, if the reported MTU is 65534 it is very
likely that packets of length 65535 would also transit the link due
to the "one less than all ones" binary length. But if the reported
MTU is 1500, it is very unlikely that packets of length 1501 would
transit the link under similar logic.
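The one-octet underestimate can be seen in a small sketch. It assumes
that the MTU is a 32-bit value and that the 31-bit PMTU field holds
that value with its least significant bit dropped; the function names
are hypothetical and are not taken from the specification.

```python
def encode_pmtu(link_mtu: int) -> int:
    """Keep the most significant 31 bits of an (assumed) 32-bit MTU."""
    if not 0 <= link_mtu <= 0xFFFFFFFF:
        raise ValueError("MTU must fit in 32 bits")
    return link_mtu >> 1


def decode_pmtu(pmtu_field: int) -> int:
    """Recover the reported MTU; the dropped low-order bit reads as zero."""
    return pmtu_field << 1


def update_pmtu_at_hop(pmtu_field: int, next_hop_mtu: int) -> int:
    """A hop may reduce the field if its next hop link MTU is smaller."""
    return min(pmtu_field, encode_pmtu(next_hop_mtu))


# An odd MTU is reported one octet short; an even MTU is reported exactly.
print(decode_pmtu(encode_pmtu(65535)))
print(decode_pmtu(encode_pmtu(1500)))
```

Under these assumptions, a true MTU of 65535 is reported as 65534,
matching the first example above, while a true MTU of 1500 is
reported exactly.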
After sending a Parcel/Jumbo Probe, the source may receive a Parcel/
Jumbo Report from either a router on the path or from the final
destination itself. Alternatively, the source can shape its probes
to request IP Jumbo Reply MTU options carried by ordinary data
packets on the return path from the destination.
If a router or final destination receives a Parcel/Jumbo Probe but
does not recognize the parcel/jumbo constructs, it will likely drop
the probe without further processing and may return an ICMP error.
The original source will then consider the probe as lost, but may
attempt to probe again later, e.g., in case the path may have
changed.
When the source examines the "packet in error" portion of a Parcel/
Jumbo Report, it can easily match the Report against its recent
transmissions if the Identification value is available. For "packets
in error" that do not include an Identification, the source can
attempt to match based on any other identifying information
available; otherwise, it should discard the message.
If the source receives multiple Parcel/Jumbo Reports for a single
parcel/jumbo sent into a given path, it should prefer any information
reported by the final destination over information reported by a
router. For example, if a router returns a negative report while the
destination returns a positive report, the latter should be considered
more authoritative. For this reason, the source should provide a
configuration knob allowing it to accept or ignore reports that
originate from routers, e.g., according to the network trust model.
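One possible shape for this preference rule and configuration knob is
sketched below. The Report class, its fields and the
accept_router_reports parameter are hypothetical names chosen for
illustration; a real implementation would also carry the reported MTU
and the matching information discussed earlier.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Report:
    """Hypothetical summary of one Parcel/Jumbo Report."""
    from_destination: bool  # True if sent by the final destination
    positive: bool          # True for a positive report


def select_report(reports: Iterable[Report],
                  accept_router_reports: bool = True) -> Optional[Report]:
    """Pick the report to act on for a single parcel/jumbo."""
    candidates = list(reports)
    # Information from the final destination is preferred.
    for report in candidates:
        if report.from_destination:
            return report
    # Router reports are used only if the configuration knob allows it.
    if accept_router_reports and candidates:
        return candidates[0]
    return None
```

With this sketch, a negative router report followed by a positive
destination report yields the destination report, and setting
accept_router_reports to False causes router-only feedback to be
ignored.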
When a destination returns a Parcel/Jumbo Report, it can optionally
"pair" the report with an ordinary data packet that it returns to the
original source. For example, the OMNI specification includes a
"super-packet" service that allows multiple independent IP packets to
be encapsulated as a single adaptation layer packet. This is
distinct from an IP parcel in that each packet member of the super-
packet includes its own IP (and possibly other upper layer) header.
A source can request to receive two different types of parcel/jumbo
path MTU feedback from the destination - a UDP encapsulated Parcel/
Jumbo Report in response to a probe sent to the "discard" port, or an
ordinary data packet with an IP Jumbo Reply MTU option in response to
a probe sent into an ordinary transport layer protocol flow. In some
environments, one or both of these MTU feedback types may be
erroneously dropped by a router along the return path. The source
may therefore attempt to probe first using "method A", and then try
again using "method B", e.g., if there is no response. In
environments where ongoing transport protocol sessions are
established, it is recommended that the source engage the IP Jumbo
Reply MTU option as "method A".
Appendix D. IP Parcel and Advanced Jumbo Futures
Both historic and modern-day data links configure Maximum
Transmission Units (MTUs) that are far smaller than the desired state
for Internetworking futures. When the first Ethernet data links were
deployed many decades ago, their 1500 octet MTU set a strong
precedent that was widely adopted. This same size now appears as the
predominant MTU limit for most paths in the Internet today, although
modern link deployments with MTUs as large as 9KB have begun to
emerge.
In the late 1980's, the Fiber Distributed Data Interface (FDDI)
standard defined a new link type with MTU slightly larger than 4500
octets. The goal of the larger MTU was to increase performance by a
factor of 10 over the ubiquitous 10Mbps and 1500-octet MTU Ethernet
technologies of the time. Many factors including a failure to
harmonize MTU diversity and an Ethernet performance increase to
100Mbps led to poor FDDI market reception. In the next decade, the
1990's saw new initiatives including ATM/AAL5 (9KB MTU) and HiPPI
(64KB MTU), which offered high-speed data link alternatives with
larger MTUs, but again the inability to harmonize diversity derailed
their momentum. By the end of the 1990's and leading into the 2000's,
the emergence of the 1Gbps, 10Gbps and even faster Ethernet
performance levels seen today has obscured the fact that the modern
Internet of the 21st century is still operating with 20th century
MTUs!
To bridge this gap, increased OMNI interface deployment in the near
future will provide a virtual link type that can pass IP parcels over
paths that transit legacy data links with small MTUs. Performance
analysis has proven that (single-threaded) receive-side performance
is bounded by transport layer protocol segment size, with performance
increasing in direct proportion with segment size. Experiments have
also shown measurable (single-threaded) performance increases from
including larger numbers of segments per parcel, with steady gains
as the number of segments grows. In addition, parallel receive-side
processing will provide performance multiplier benefits since the
multiple segments that arrive in a single parcel can be processed
simultaneously instead of serially.
In addition to the clear near-term benefits, IP parcels and advanced
jumbos will increase performance to new levels as future links with
very large MTUs in excess of 65535 octets begin to emerge. With such
large MTUs, the traditional CRC-32 (or even CRC-64) error checking
with errored packet discard discipline will no longer apply for large
parcels and advanced jumbos. Instead, packets larger than a link-
specific threshold will include Forward Error Correction (FEC) codes
so that errored packets can be repaired at the receiver's data link
layer then delivered to higher layers rather than being discarded and
triggering retransmission of large amounts of data. Even if the FEC
repairs are incomplete or imperfect, all parcels can still be
delivered to higher layers where the individual segment checksums
will detect and discard any damaged data not repaired by the link
and/or adaptation layers (advanced jumbos on the other hand would
require complete FEC repair).
These new "super-links" will begin to appear mostly in the network
edges (e.g., high-performance data centers); however, some space-
domain links that extend over enormous distances may also benefit.
For this reason, a common use case will include super-links in the
edge networks of both parties of an end-to-end session with an OMNI
link connecting the two over wide area Internetworks. Medium- to
moderately large-sized IP parcels over OMNI links will already
provide considerable performance benefits for wide-area end-to-end
communications while truly large parcels and advanced jumbos over
super-links can provide boundless increases for localized bulk
transfers in edge networks or for deep space long haul transmissions.
The ability to grow and adapt without practical bound enabled by IP
parcels and advanced jumbos will inevitably encourage new data link
development leading to future innovations in new markets that will
revolutionize the Internet.
Until these new links begin to emerge, however, parcels will already
provide a tremendous benefit to end systems by allowing applications
to send and receive segment buffers larger than 65535 octets in a
single system call. By expanding the current operating system call
data copy limit from its current 16-bit length to a 32-bit length,
applications will be able to send and receive maximum-length parcel
buffers even if parcellation is needed to fit within the interface
MTU. For applications such as the Delay Tolerant Networking (DTN)
Bundle Protocol [RFC9171], this will allow transfer of entire large
protocol objects (such as DTN bundles) in a single system call.
Continuing into the future, a natural progression beginning with IP
packets then moving to IP parcels should also lead to wide scale
adoption of advanced jumbos. Since advanced jumbos carry only a
single very large transport layer data segment, loss of even a single
jumbogram could invoke a major retransmission event. But, with the
advent of error correcting codes, future link types could offer truly
large MTUs. Advanced jumbos sent over such links would then be
equipped with an error correction "repair kit" that the link far end
can use to "patch" the jumbogram allowing it to be processed further
by upper layers. Delay Tolerant Networking (DTN) over high-speed and
long-delay optical links provides an example environment suitable for
such large packets.
Appendix E. Change Log
<< RFC Editor - remove prior to publication >>
Changes from earlier versions:
* Submit for review.
Author's Address
Fred L. Templin (editor)
Boeing Research & Technology
P.O. Box 3707
Seattle, WA 98124
United States of America
Email: fltemplin@acm.org
Templin Expires 8 October 2023 [Page 48]