INTERNET-DRAFT                                             Michael Welzl
draft-welzl-ptp-01.txt                                University of Linz
Expiration Date: 23. February 2000            Telecooperation Department
Standards Track                                              August 1999


              The Performance Transparency Protocol (PTP)


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Abstract

   More and more modern Internet applications need to know certain
   network performance parameters in order to adapt. Recent efforts
   have shown that it is indeed possible to determine quite a few of
   these parameters for end-systems by probing the network, but the
   methods used are often time-consuming, always show network-unfriendly
   behaviour and in some cases the results are just not precise enough.
   We describe a protocol that allows end-systems to efficiently
   retrieve this information without putting much load onto the network
   or complexity in routers.

Author's Note

   While this memo has an acknowledgements section, it was decided to
   mention Alfred Cihal beforehand because he contributed so much.

Michael Welzl            Expires: 23. February 2000             [Page 1]

Internet Draft            draft-welzl-ptp-00.txt             August 1999

1. Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY" and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

   When referring to network performance parameters, "GOOD", "BAD",
   "BETTER", "WORSE", "BEST" and "WORST" refer to values yielding
   transmission results that are generally conceived in this manner.
   For instance, much bandwidth is good whereas much delay is bad.

   An "interface" is defined as the sum of all connections between two
   routers. This "sum" is defined as a value that is representative of
   all interfaces; it might actually be the sum (this could be valid
   for the nominal bandwidth, for example) or any other function,
   depending on the data in question.

   Unless otherwise noted, only outgoing traffic on an interface is of
   interest. The term "incoming traffic" refers to incoming traffic at
   the router's interface where the current PTP-packet has arrived.

   "Senders" and "Receivers" are applications that send or receive
   PTP-packets, respectively. In some cases, an application will act as
   both a sender and receiver.

   "Applications" can be actual Internet applications or any
   higher-level protocol that is making use of PTP.

   TTL is the IP header's "Time to Live" field.

2. Introduction

   Traditional Internet applications regard the network as a black-box
   that provides them with some services while maintaining certain
   characteristics [Jacobson88]. Even a bandwidth-adaptive application
   can stick with this point of view and try to make use of as much
   bandwidth as possible by flooding the network with UDP packets to
   see how many of them arrive within a certain period. Clearly, this
   is neither an efficient nor a network-friendly way to detect the
   currently available bandwidth or any other network performance
   parameter.

   Among others, [Bolot], [Paxson] and [Jacobson97] have demonstrated
   how certain router-specific network performance parameters can be
   estimated without having to change the software in the routers
   involved. [Carter] and [Sisalem] make use of these rather
   bandwidth-consuming approaches to enhance the performance of
   Internet applications, indicating that the information obtained from
   the black box model is not sufficient for this purpose. RFC 2481 [1]
   also states that the black-box model is inappropriate for delay- or
   loss-sensitive applications.

   Using SNMP, network administrators can remotely monitor routers;
   various tools (probably the most popular one being the "Multi Router
   Traffic Grapher" [MRTG]) help them keep track of the data so they
   can estimate less evident factors such as bandwidth utilization.
   Generally, MIB access is restricted to administrators. Concerning
   performance parameters, it would be desirable to make this kind of
   functionality available to all Internet applications.

   Unlike network administrators, the everyday Internet user is
   interested in the overall network performance, which is the worst
   performance packets encounter along the path in use (several paths
   for multicast). Since this path may change at any time,
   router-specific data should be retrieved as quickly as possible.

   PTP provides this functionality in an efficient yet simple way
   designed to avoid flooding the network. It could also be implemented
   as an extension to SNMP or even RSVP. Since the performance
   transparency concept excludes access restrictions, resource
   reservation or any other form of prioritisation, it was decided that
   a new protocol would be the most appropriate solution. Besides, PTP
   usage is open to Internet applications as well as any higher-level
   protocol.

3. Protocol overview

   This memo describes a technique to efficiently retrieve performance
   specific information from routers. The PTP model merely consists of
   a sender that sends any number of PTP packets and a receiver
   receiving them. As stated earlier, the sender and receiver may be
   the same. There are two basic methods to get the information:

   (1) PTP packets are examined by PTP-compliant routers along the path
       in between the sender and the receiver. They will add their own
       information and check if the packet has encountered a
       non-PTP-compliant router. If this is the case and applicable
       information is being asked for, routers add data for incoming
       traffic, too. This way, it is possible to collect all the
       information for a path where single routers don't support PTP.
       A receiver can tell if the information is complete.

   (2) An application may want to set the Compare Bit to literally ask
       if certain traffic requirements will be met by the network.
       Routers will then compare the default values that were set by
       the sender with their own and make sure the data is updated with
       the worst values found. This mechanism was designed for use in
       conjunction with the Echo Bit which tells a router to return the
       packet to the sender if worse conditions are detected.
       Implementation of the Compare mechanism in routers is optional.

   Most applications that use PTP will need to do a reasonable amount
   of communication not described within this memo, e.g., to send
   computational results back to a sender or for acknowledgements. If a
   receiver needs to calculate the maximum number of non-PTP-compliant
   routers a packet has encountered, it must be informed about the
   sender's initial TTL value. How this is done is not part of the PTP
   documentation, but since PTP is an unreliable protocol, usage of TCP
   for related communication is recommended.

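   As an illustration only, the receiver-side calculation can be
   sketched in Python as follows. It uses the expression
   (initial TTL - TTL) - Dataset Count given in the description of the
   Dataset Count field. The function and parameter names are
   hypothetical, and clamping a negative result to zero is an
   assumption of this sketch, not a rule of the protocol.

```python
def max_non_ptp_routers(initial_ttl: int, received_ttl: int,
                        dataset_count: int,
                        compare_bit: bool = False) -> int:
    """Upper bound on non-PTP-compliant routers a packet has passed.

    Evaluates (initial TTL - TTL) - Dataset Count. All names here are
    illustrative and not part of the protocol.
    """
    if compare_bit:
        # The memo defines this bound only when the Compare Bit is not set.
        raise ValueError("bound is undefined when the Compare Bit is set")
    hops = initial_ttl - received_ttl
    if hops < 0 or dataset_count < 0:
        raise ValueError("inconsistent TTL or Dataset Count values")
    # Assumption: a router may add two datasets (incoming and outgoing
    # traffic), so the difference can drop below zero; clamp it.
    return max(hops - dataset_count, 0)
```

   For example, a packet sent with an initial TTL of 64 that arrives
   with a TTL of 60 and carries three datasets has passed at most one
   router that did not contribute a dataset.
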
   While PTP itself does not provide any multicast functionality, it is
   possible to send PTP packets to various receivers. It should be
   noted that results derived from PTP multicasts with the Echo Bit set
   must be interpreted carefully.

4. Specification

4.1. IP requirements and recommendations

   Hosts and routers that implement PTP need to follow a set of rules
   with regard to the IP header:

   o  Protocol

      The "Protocol" field must be set to 123.

   o  Fragmentation

      The "Don't Fragment" bit must be set by the sender. Generally,
      PTP packets will not be very large and are not likely to exceed
      the size of 576 octets (RFC 791 [6] states that a host must be
      prepared to accept datagrams of that size). In any case, the use
      of Path MTU Discovery - preferably based on the approach
      mentioned in this memo (see Content Type 2) rather than RFC 1191
      [2] - is highly recommended.

   o  TTL

      RFC 791 [6] defines TTL as measured in units of seconds, yet RFC
      1812 [8] states that "many current implementations treat TTL as a
      pure hop count, and in parts of the Internet community there is a
      strong sentiment that the time limit function should instead be
      performed by the transport protocols that need it". Furthermore,
      the fact that TTL has been replaced by a "Hop Limit" without any
      time function in IPv6 seems to strengthen that trend [9]. For
      these reasons, it is safe to require PTP-compliant routers to
      treat TTL like the "Hop Limit" described in RFC 1883 [9].

   o  Router Alert Option

      The "Router Alert Option" described in RFC 2113 [5] MUST be used
      with a value of 0 by a sender and SHOULD be supported by routers.

4.2. PTP Header Format

   A packet has a fixed length header:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   TTL-Check   |C|E| DS Length | Dataset Count |  Content Type |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Destination Port        |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   TTL-Check: 8 bits

      On forwarding, a router must copy TTL to the TTL-Check field. A
      sending host will normally do the same, but it can also set
      TTL-Check to any value <> TTL to request information about
      incoming traffic from the first-hop router.

      If a router encounters a packet with TTL-Check <> TTL, the
      Compare Bit is 0, the Content Type is indicating a request for
      interface-specific information and at least part of this data is
      at hand for incoming traffic, a dataset must be added containing:

      o  as much of the requested interface-specific information as
         possible for incoming traffic

      o  the address of the interface where the packet arrived in the
         dataset's address field (if such a field is provided).

      If the Compare Bit is set and TTL-Check - TTL > 1 or no available
      incoming traffic information can be used for comparison, the
      packet must be dropped.

      If the Compare Bit is set and TTL-Check - TTL = 1, the incoming
      traffic data must be used for the comparison process described
      for the Compare Bit instead of adding a new dataset.

   Bit 8 - the Compare Bit

      If the Compare Bit is set, there is only one dataset. It contains
      values that have been set by the sender. A router that receives a
      packet with this bit set compares all these values with its own
      (including incoming traffic data if available and TTL-Check - TTL
      = 1) and updates the dataset's corresponding fields with the
      worst values found. If a comparison is impossible, no change
      should be made.

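      The comparison step can be sketched as follows for a single
      field. This is a minimal illustration, not a normative algorithm:
      the function name, the use of None for "no local value
      available" and the greater_is_better flag are assumptions of this
      sketch. Whether a greater value is BETTER depends on the
      parameter (much bandwidth is good whereas much delay is bad).

```python
from typing import Optional, Tuple


def compare_update(packet_value: int, router_value: Optional[int],
                   greater_is_better: bool = True) -> Tuple[int, bool]:
    """Return (value to store in the dataset, whether it changed).

    Hypothetical helper: keeps the worst of the value carried in the
    packet and the router's own value for one field.
    """
    if router_value is None:
        # Comparison is impossible, so no change is made.
        return packet_value, False
    if greater_is_better:
        worst = min(packet_value, router_value)
    else:
        worst = max(packet_value, router_value)
    return worst, worst != packet_value
```

      The second element of the result tells the router whether worse
      conditions were detected, which is the condition under which a
      packet with the Echo Bit set is returned to the sender.
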
      If provided, an address field must always contain the relevant
      interface's address of the router that did the last update.

      Implementation of this mechanism is optional; a router may drop
      packets with a set Compare Bit, but never forward them without
      proper treatment. It may also be implemented for certain Content
      Types only so that packets get dropped when their Compare Bit is
      set in conjunction with an unsupported Content Type. It is not
      advisable for a sender to set the Compare Bit but not the Echo
      Bit unless the packet size is expected to exceed the Path MTU.

   Bit 9 - the Echo Bit

      The Echo Bit may only be set in conjunction with the Compare Bit
      by a sender. Packets that don't follow this rule must be
      forwarded without further treatment. Only if the Echo Bit is set
      AND the Compare Bit mechanism resulted in a change of the
      dataset, a router with Echo Bit support will:

      o  copy the IP "Source Address" field to the IP "Destination
         Address" field

      o  write the forwarding interface's IP address into the IP
         "Source Address" field

      o  set TTL and TTL-Check to the same initial value (usually 64
         [8])

      o  clear the Compare Bit

      o  forward the packet accordingly

      Implementation of this mechanism is optional; a router may drop
      packets with a set Echo Bit, but never forward them without
      proper treatment.

   Dataset Length: 6 bits

      Half the length of a dataset in octets (depending on the Content
      Type in use).

   Dataset Count: 8 bits

      The number of datasets currently in the packet. This field must
      be 1 if the Compare Bit is set. At the receiver, (initial TTL -
      TTL) - Dataset Count represents the maximum number of
      non-PTP-compliant routers the packet has encountered if the
      Compare Bit is not set.

   Content Type: 8 bits

      This field is used to describe the dataset structure. It will be
      subject to future standardizations; for now, only 3 values are
      defined.

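      A minimal sketch of decoding the fixed header shown in the
      diagram at the start of this section. It assumes that bit 0 in
      the diagram is the most significant bit of the first octet and
      that multi-octet fields are in network byte order, as is
      customary for Internet protocol headers; this memo does not state
      either explicitly. The class and function names are hypothetical.

```python
import struct
from typing import NamedTuple


class PtpHeader(NamedTuple):
    ttl_check: int
    compare: bool          # bit 8 of the header
    echo: bool             # bit 9 of the header
    dataset_length: int    # half the dataset length in octets
    dataset_count: int
    content_type: int
    destination_port: int
    checksum: int


def parse_header(data: bytes) -> PtpHeader:
    """Decode the 8-octet PTP header (assumed network byte order)."""
    if len(data) < 8:
        raise ValueError("the PTP header is 8 octets long")
    ttl_check, flags_len, count, ctype, port, checksum = struct.unpack(
        "!BBBBHH", data[:8])
    return PtpHeader(
        ttl_check=ttl_check,
        compare=bool(flags_len & 0x80),
        echo=bool(flags_len & 0x40),
        dataset_length=flags_len & 0x3F,
        dataset_count=count,
        content_type=ctype,
        destination_port=port,
        checksum=checksum,
    )


def dataset_octets(header: PtpHeader) -> int:
    """Size of one dataset in octets (Dataset Length * 2)."""
    return header.dataset_length * 2
```

      Checksum verification is deliberately left out of this sketch.
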
      Data types beginning with "MIB2-" are defined as Object-types in
      RFC 1213 [4]. Interface-specific data and addresses always refer
      to the interface where the PTP packet will be forwarded or
      incoming traffic in special cases (see TTL-Check), so for all
      objects from the RFC 1213 [4] Interfaces group, the words "In"
      and "Out" will be omitted. The sum of the sizes of all fields
      within a dataset is a multiple of two octets.

      Depending on the Compare Bit, routers must either read an
      existing dataset or add one or two new ones. In either case, the
      dataset(s) will have the structure defined by the Content Type.
      If such a field is provided, a router that writes to a dataset
      must write the forwarding interface's IP address into the
      dataset's address field.

      Defined values:

      (0)   +--------------+
            |              |
            | MIB2-ifSpeed |
            |              |
            +--------------+

      In most cases, ifSpeed will contain the nominal bandwidth. This
      Content Type can be particularly useful in conjunction with the
      Echo Bit. The Compare Bit may be used in conjunction with this
      Content Type; a greater ifSpeed value will be regarded as BETTER.

      (1)   +---------+-----------+---------------+
            |         |           |               |
            | Address | Timestamp | MIB2-ifOctets |
            |         |           |               |
            +---------+-----------+---------------+

      Address is the forwarding interface's IP address. The Timestamp
      format is the same as defined in RFC 791 [6] for the IP
      timestamp. This Content Type can be used to build a table of
      routers with corresponding timestamps and ifOctets values in
      order to estimate the actual instantaneous bandwidth somewhat
      similar to the way MRTG does by default [MRTG]. This is sensitive
      to path changes. The Compare Bit MUST NOT be used in conjunction
      with this Content Type.

      (2)   +------------+
            |            |
            | MIB2-ifMTU |
            |            |
            +------------+

      Using this Content Type, Path MTU discovery as described in RFC
      1191 [2] can be replaced by PTP, provided that enough routers
      implement it.

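      A receiver-side sketch, under stated assumptions, of deriving a
      Path MTU from a packet that carries Content Type 2 datasets with
      the Compare Bit cleared. It assumes that each dataset consists
      solely of the ifMTU value as an unsigned integer in network byte
      order, whose size in octets is twice the Dataset Length field;
      this memo does not fix the field size. The function name is
      hypothetical.

```python
def path_mtu_from_pdu(pdu: bytes, dataset_length: int,
                      dataset_count: int) -> int:
    """Smallest ifMTU among the datasets following the PTP header.

    Assumption: every dataset is one big-endian unsigned integer of
    dataset_length * 2 octets (the memo leaves the size open).
    """
    size = dataset_length * 2
    if size == 0 or dataset_count == 0:
        raise ValueError("no datasets to evaluate")
    if len(pdu) < size * dataset_count:
        raise ValueError("PDU shorter than Dataset Count implies")
    mtus = [
        int.from_bytes(pdu[i * size:(i + 1) * size], "big")
        for i in range(dataset_count)
    ]
    # The path can carry no more than its most restrictive interface.
    return min(mtus)
```

      The result is only the true Path MTU if every router on the path
      contributed a dataset; otherwise it is an upper bound.
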
      Since router support is a general assumption for the use of this
      protocol, senders expecting packet size related problems should
      make use of Content Type 2 before sending other PTP packets. The
      Compare Bit may be used in conjunction with this Content Type; a
      greater ifMTU value will be regarded as BETTER.

   Destination Port: 16 bits

      The Destination Port is used to identify the receiver.

   Checksum: 16 bits

      Since the checksum must be recomputed by every router that writes
      to a PTP packet, its calculation follows the simple algorithm
      used for both the IP and UDP checksum [6] [7]. Unlike the IP
      checksum, it covers the complete PTP packet. The algorithm is:

         The checksum field is the 16 bit one's complement of the one's
         complement sum of all 16 bit words in the PTP packet. For
         purposes of computing the checksum, the value of the checksum
         field is zero.

4.3. PDU format

   The protocol data unit consists of datasets. Their size (Dataset
   Length * 2) is given by their structure (Content Type) and is a
   multiple of 2 octets, so padding is not necessary. All datasets are
   structured equally. The number of datasets is denoted by Dataset
   Count.

5. IPv6 usage

   Although all other IP references in this memo refer to IPv4, PTP is
   IPv6-compliant. There are a few differences in the way it is used,
   though:

   o  IPv6 addresses are 128-bit instead of 32-bit, so a dataset's
      address field will be 128-bit, too.

   o  TTL is called a "Hop Limit" in IPv6 and is not subject to any
      restrictions. The concept of this field being a time limit has
      already been dropped.

   o  There is no "Don't Fragment" flag - IPv6 packets can only be
      fragmented by the sender itself, which obviously MUST NOT be
      done. Path MTU discovery using PTP is still possible; an
      alternate method for use with IPv6 is described in RFC 1981 [3].

   o  The "Protocol" field is replaced by the "Next Header" field. Its
      function remains the same.

   o  At the time of writing, there is no RFC on how the "Router Alert
      Option" [5] could be applied to IPv6, but if such documentation
      becomes available, it should be used.

6. Security Considerations

   In RFCs and IETF Working Groups, there is a very strict separation
   between network management and transport protocols. PTP is by no
   means a network management protocol, yet it allows access to data
   specified in RFC 1213 [4] somewhat similar to the SNMP "read"
   command. At first sight, network administrators will have their
   concerns about that. They must accept that certain
   performance-specific parameters will have to be available to the
   public in order to make the best utilization of the network.
   Information about delay and bandwidth should not be regarded as
   confidential; a certain amount of transparency is necessary.

   Care has been taken to restrict the Content Types presented in this
   document to information that is already available to any Internet
   application via the use of more bandwidth-consuming methods.
   Content Type 0 contains the nominal or current bandwidth - pathchar
   can determine the nominal bandwidth of any link along a path
   [Jacobson97]. Content Type 1 can be used to estimate the current
   bandwidth; for the bottleneck bandwidth router, this has been
   explained in [Carter]. The timestamp and address alone could just as
   well be determined via IP options; a similar functionality is
   provided by traceroute. RFC 1191 [2] and RFC 1981 [3] elaborate on
   how to determine the Path MTU (Content Type 2).

   While it is impossible to steal secure information with PTP, it is
   insecure in that the throughput of applications relying on PTP data
   could be restricted if the packets include false information.
   Programmers using PTP should be aware of that and consider
   occasionally probing the network in case of very unexpected or
   particularly bad results.

7. IANA Considerations

   Future values for the PTP header's 8-bit Content Type field (in this
   memo, only the values 0, 1 and 2 are defined) require IANA
   assignment. Following the policies outlined in
   [IANA-CONSIDERATIONS], numbers in the range 3-199 are allocated via
   a required specification and numbers in the range 200-255 are
   allocated through an IETF Consensus action. It would be desirable to
   have IANA administrate the specifications for numbers in the range
   3-199 and make assignments without requiring an outside review.

   In order for a number to be assigned, the following issues
   concerning the dataset structure that will be identified through the
   Content Type field must be addressed in the specification:

   o  The name (the names already used in this memo may be used but not
      redefined), size (the sum of the sizes of all fields within a
      dataset must be a multiple of two octets) and number of fields

   o  A description of each field (a field can be a reference to a
      certain field in an RFC)

   o  What kind of new information this Content Type provides

   o  Why the information that can be obtained via this field should
      not be regarded as confidential

   o  Whether use of the Compare Bit will be appropriate, and if so, a
      definition of "BETTER" or "WORSE"

References

   [1]  Ramakrishnan, K., and Floyd, S., "A Proposal to add Explicit
        Congestion Notification (ECN) to IP", RFC 2481, January 1999.

   [2]  Mogul, J.C., and Deering, S.E., "Path MTU discovery", RFC 1191,
        November 1990.

   [3]  McCann, J., Deering, S. and Mogul, J., "Path MTU Discovery for
        IP version 6", RFC 1981, August 1996.

   [4]  McCloghrie, K., and Rose, M.T., "Management Information Base
        for Network Management of TCP/IP-based internets:MIB-II", STD
        17, RFC 1213, March 1991.

   [5]  Katz, D., "IP Router Alert Option", RFC 2113, February 1997.

   [6]  Postel, J., "Internet Protocol", STD 5, RFC 791, September
        1981.

   [7]  Postel, J., "User Datagram Protocol", STD 6, RFC 768, August
        1980.

   [8]  Baker, F., "Requirements for IP Version 4 Routers", RFC 1812,
        June 1995.

   [9]  Deering, S., and Hinden, R., "Internet Protocol, Version 6
        (IPv6) Specification", RFC 1883, December 1995.

   [Bolot]
        Jean-Chrysostome Bolot, "End-to-End Packet Delay and Loss
        Behavior in the Internet", Proceedings of SIGCOMM 1993, pp.
        289-298. Also in Computer Communication Review 23 (4), Oct.
        1992.

   [Carter]
        Robert L. Carter and Mark E. Crovella, "Measuring Bottleneck
        Link Speed in Packet-Switched Networks", Technical Report
        BU-CS-96-006, Boston University, 1996.

   [IANA-CONSIDERATIONS]
        Alvestrand, H. and T. Narten, "Guidelines for Writing an IANA
        Considerations Section in RFCs", BCP 26, RFC 2434, October
        1998.

   [Jacobson88]
        Van Jacobson, "Congestion Avoidance and Control", Proceedings
        of SIGCOMM 1988, pp. 314-329.

   [Jacobson97]
        Van Jacobson, "pathchar - a tool to infer characteristics of
        Internet paths", presented at the Mathematical Sciences
        Research Institute (MSRI); slides available from
        ftp://ftp.ee.lbl.gov/pathchar/, April 1997.

   [MRTG]
        The "Multi Router Traffic Grapher", available from
        http://ee-staff.ethz.ch/~oetiker/webtools/mrtg/mrtg.html

   [Paxson]
        Vern Paxson, "End-to-End Internet Packet Dynamics", IEEE/ACM
        Transactions on Networking 7(4), pp. 1-16.

   [Sisalem]
        Dorgham Sisalem, Henning Schulzrinne, "The Loss-Delay Based
        Adjustment Algorithm: A TCP-friendly Adaptation Scheme",
        Proceedings of NOSSDAV, July 1998.

Acknowledgements

   Without the help of these people, this memo would look quite
   different and, without a doubt, much worse (in alphabetical order):

      Alfred Cihal
      Jon Crowcroft
      Max Muehlhaeuser

Author's Address

   Michael Welzl
   University of Linz
   Institute for Technical Computer Science and Telematics
   Telecooperation Department
   Altenberger Str. 69
   4040 Linz, Austria

   Phone: +43 (732) 2468-9264
   Fax:   +43 (732) 2468-9829
   Email: michael@tk.uni-linz.ac.at

Full Copyright Statement

   Copyright (C) The Internet Society (1997). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.