Internet Draft                                        Lixia Zhang, UCLA
Expiration: August 1996
File: draft-ietf-rsvp-diagnostic-msgs-00.txt

RSVP Diagnostic Messages

Status of Memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

To learn the current status of any Internet-Draft, please check the "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow Directories on ds.internic.net (US East Coast), nic.nordu.net (Europe), ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific Rim).

Abstract

This draft describes the RSVP diagnosis facility. As the deployment of RSVP has spread, it has become clear that a method is needed for collecting information about the RSVP state along the path from a receiver to a particular source. This specification describes the required functionality, packet format, and processing rules.

1. Introduction

In the original design of RSVP, error messages are the only means for end hosts to receive feedback about a specific request that has failed, that is, a failure in setting up either PATH state or reservation state. In the absence of a failure, one receives no feedback regarding the details of a reservation that has been put in place, such as whether, where, or how one's own reservation request is merged with those of others. In case of a failure, the error message carries back only the information from the point of failure, without any information about the state at other hops before the failure. Such missing information, however, can be highly desirable for debugging purposes, and may even be of interest to end applications.
In this memo we specify an RSVP diagnostic facility to collect information about RSVP state along the path from a receiver to a specific sender. Diagnostic messages are independent of any other RSVP control messages and are side-effect free. That is, they do not change any RSVP state at either routers or hosts. A diagnostic reply is not an error report, but a collection of RSVP state information as requested.

We have the following design goals in mind:

- To be able to collect RSVP state information at every hop along the path once the PATH state has been set up, either for an existing reservation or before one has made any reservation request.
- More specifically, to be able to collect information about flowspec, timer values, and reservation merging at each hop along the path.
- To be able to collect the hop count for each non-RSVP cloud the path crosses.
- To avoid packet implosion or explosion.

Here a "hop" means an RSVP-capable router.

The following are specifically identified as non-goals for the time being:

- Checking the resource availability along a path. Such functionality may be useful for future reservation requests, but would require modifications to existing admission control modules, which are beyond our control.

2. Overview

We define two types of diagnostic packets, diagnostic request (DREQ) and reply (DREP). To avoid packet implosion or explosion, we restrict diagnostic packets to unicast only (but see Section 5.2 on firewall issues).

The requesting host is not necessarily the receiving end of the delivery path that is to be queried. The requester simply sends an RSVP Diagnostic request packet (DREQ) to the last-hop router of the path. The DREQ packet specifies the RSVP session and a sender host to that session.
The last-hop router adds to the DREQ packet a response data block containing its RSVP state for the specified RSVP session, and then forwards the request via unicast to the router that it believes is the proper previous hop for the given source. Each subsequent hop adds its own response data to the end of the request packet, then forwards the packet via unicast to the previous hop. When the DREQ packet reaches the sender, the sender changes the packet type to Diagnostic Reply (DREP) and sends the completed response to the original requester. The response may also be returned before reaching the sender if an error condition along the path, such as "no path state", prevents further forwarding of the request packet.

DREP packets can be unicast back to the requester either directly, or in a hop-by-hop manner by reversing the exact path that the DREQ packet has taken. The former is faster and more efficient, but the latter may be needed when the packets have to go across firewalls. To facilitate the latter case, a DREQ packet may carry an optional ROUTE object, which is a list of router addresses that the packet has passed through on the way to the sender. The DREP packet can then be returned to the requester by reversing the path.

When the path consists of many hops, it is possible that the total length of a DREP packet will exceed the path MTU size, so that the packet has to be fragmented. Relying on IP fragmentation and reassembly, however, is troublesome, especially when DREP packets are returned to the requester hop-by-hop, in which case fragmentation and reassembly would have to be performed at each hop. To avoid such excessive overhead, we propose to define a default MTU value. Once an intermediate router detects that a DREQ packet has reached the pre-defined MTU size, it returns the partial result to the requester, and then forwards the trimmed DREQ packet to the next hop towards the sender.
Throughout this memo we therefore use the term "DREQ packet", rather than "message", for a diagnostic request, which always consists of a single packet. On the other hand, one diagnostic request can generate multiple DREP packets, each containing a fragment of the total reply.

Note that DREQ packets can be forwarded only after the path state has been set up. Before that, one may fall back on the traceroute facility to examine whether unicast or multicast routing is working correctly.

3. DREQ / DREP Packet Format

A diagnostic packet consists of three parts: the RSVP common header, the diagnostic packet header, and the response data object.

3.1 RSVP Message Common Header

The RSVP message common header has the following layout:

       0             1             2             3
+-------------+-------------+-------------+-------------+
| Vers | Flags|    Type     |       RSVP Checksum       |
+-------------+-------------+-------------+-------------+
|        RSVP Length        |  reserved   |  Send_TTL   |
+-------------+-------------+-------------+-------------+
|                      Message ID                       |
+--------+-+--+-------------+-------------+-------------+
|Resved |MF|              Fragment offset               |
+--------+-+--+-------------+-------------+-------------+

Flags field is unused for now.

Type = 8: DREQ
Type = 9: DREP

RSVP Checksum covers the entire packet body including this header.

Send_TTL holds the IP TTL value that a router puts in the IP header when a DREQ packet is forwarded to the previous hop.

Message ID identifies an individual DREQ packet and the corresponding reply (or all the fragments of the reply).

MF flag is on for all but the last packet (fragment) in a Diagnostic Reply.

Fragment Offset field gives the byte offset of the current fragment in the complete Diagnostic Reply.
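As an illustration of the common header layout in Section 3.1, the following Python sketch packs and parses the four 32-bit words. It is not part of the specification. The class name CommonHeader and its field names are hypothetical, and the split of the last word into 7 reserved bits, a 1-bit MF flag, and a 24-bit fragment offset is an assumption, since the diagram does not state exact bit widths. The checksum is carried as an opaque value and is not computed here.

```python
import struct
from dataclasses import dataclass

DREQ_TYPE = 8  # message type values given in Section 3.1
DREP_TYPE = 9

# ASSUMPTION: the last 32-bit word is 7 reserved bits, 1 MF bit, and a
# 24-bit fragment offset. The draft's diagram does not give exact widths.
MF_BIT = 1 << 24
OFFSET_MASK = (1 << 24) - 1

# Network byte order: vers/flags, type, checksum, length, reserved,
# Send_TTL, Message ID, fragment word. Total size is 16 bytes.
_HEADER = struct.Struct("!BBHHBBII")


@dataclass
class CommonHeader:
    version: int
    flags: int
    msg_type: int
    checksum: int
    length: int
    send_ttl: int
    message_id: int
    more_fragments: bool
    fragment_offset: int

    def pack(self) -> bytes:
        """Serialize the header; the reserved byte is always written as 0."""
        vers_flags = ((self.version & 0xF) << 4) | (self.flags & 0xF)
        frag_word = self.fragment_offset & OFFSET_MASK
        if self.more_fragments:
            frag_word |= MF_BIT
        return _HEADER.pack(
            vers_flags, self.msg_type, self.checksum,
            self.length, 0, self.send_ttl,
            self.message_id, frag_word,
        )

    @classmethod
    def unpack(cls, data: bytes) -> "CommonHeader":
        """Parse the first 16 bytes of a diagnostic packet."""
        (vers_flags, msg_type, checksum, length, _reserved, send_ttl,
         message_id, frag_word) = _HEADER.unpack_from(data)
        return cls(
            version=vers_flags >> 4,
            flags=vers_flags & 0xF,
            msg_type=msg_type,
            checksum=checksum,
            length=length,
            send_ttl=send_ttl,
            message_id=message_id,
            more_fragments=bool(frag_word & MF_BIT),
            fragment_offset=frag_word & OFFSET_MASK,
        )
```

A requester could build a header with CommonHeader(1, 0, DREQ_TYPE, 0, 16, 64, 1234, False, 0).pack() and recover the same fields with CommonHeader.unpack(); the version and TTL values in this call are placeholders only.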
3.2 RSVP Diagnostic Packet Header Object

The DREQ and DREP headers are each a concatenation of a Diagnostic Packet Header Object and an RSVP Session object, as defined below:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          length = 20          |     class     |    c-type     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     M-TTL     |   hop-count   |  Error code   | XXXX  |     |H|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Source Address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Destination Address                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Response Address                        |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                                                               |
+                      RSVP Session Object                      |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Class field is unused for the moment and must be set to zero. C-type is used to distinguish IPv4 and IPv6.

M-TTL specifies the maximum number of hops for which the requester wants to collect information. If some error condition in the middle of the path keeps the DREQ packet from reaching the specified sender, this field can be used to perform an expanding-length search to reach the point just before the problem.

Hop-count field records the number of RSVP hops that have been traversed so far.

Error code reports the error, if any, that prevented the DREQ from reaching the source. Currently defined values are:

0x00: no error
0x01: lack of PATH state

A possible use of the 4-bit value XXXX is discussed in Section 5.2, for the case when the "Response Address" is a multicast address. Otherwise the value is 0.

H flag indicates how the reply should be returned to the requester. When H = 0, DREP packets should be sent to the requester directly via unicast; when H = 1, DREP packets should be returned to the requester hop by hop.

Source address specifies the IP address of the source for the path being traced.
The DREQ packet proceeds hop-by-hop towards this source.

Destination address field specifies the IP address of the receiving end for the path being traced. The DREQ packet starts at this node and proceeds toward the source.

Response Address field specifies the address to which the DREP packet(s) are sent.

The session object identifies the session for which the RSVP state information is being collected.

Optionally, the packet header may also contain a ROUTE Object, as defined below, right after the Session object (to be used to return DREP packets hop-by-hop).

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            length             |     class     |   R-pointer   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                     List of RSVP routers                      |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Length field gives the total length of the object in bytes, from which the number of addresses in the router list can be easily computed.

R-pointer is used in DREP packets only (see Section 4.2), and must be zero in all DREQ packets. (Note that this is a violation of the common object format defined in the RSVP specification.)

List of RSVP routers: in a DREP packet, this lists all the RSVP hops between the Destination and the router that returned the DREP packet; in a DREQ packet, it lists the RSVP hops between the Destination and the last router that updated the list.

3.3 Response Data

Each RSVP router adds a "response data" segment to the DREQ packet before it forwards it on.
The response data looks like this:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            length             |     class     |    C-type     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       DREQ Arrival Time                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  Incoming Interface Address                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  Outgoing Interface Address                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Previous-RSVP-Hop Router Address                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       reservation style                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     D-TTL     |M|R-error|  K  |          timer value          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                         Tspec object                          |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                      filter spec object                       |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                        flowspec object                        |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

DREQ Arrival Time is a 32-bit NTP timestamp specifying the arrival time of the DREQ packet at this router. The 32-bit form of an NTP timestamp consists of the middle 32 bits of the full 64-bit form; that is, the low 16 bits of the integer part and the high 16 bits of the fractional part.

Incoming Interface Address specifies the address of the interface on which packets from this source are expected to arrive, or 0 if unknown.

Outgoing Interface Address specifies the address of the interface on which packets from this source flow to the specified destination, or 0 if unknown.

Previous-RSVP-Hop Router Address specifies the router from which this router receives PATH packets from this source, or 0 if unknown.

Reservation style is the 4 byte value of RSVP Style Object as defined in RSVP specification.
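As an illustration of the 32-bit DREQ Arrival Time format described above, the following Python sketch derives the value from a Unix timestamp. It is not part of the specification. The function names are hypothetical; the constant 2208988800 is the number of seconds between the NTP epoch (1900) and the Unix epoch (1970). The difference helper assumes the two timestamps are less than 65536 seconds apart, since the 16-bit seconds field wraps.

```python
NTP_UNIX_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01


def ntp32_from_unix(unix_seconds: float) -> int:
    """Return the middle 32 bits of the 64-bit NTP timestamp.

    The high 16 bits hold the low 16 bits of the NTP seconds count; the
    low 16 bits hold the high 16 bits of the fractional second.
    """
    ntp_seconds = unix_seconds + NTP_UNIX_OFFSET
    whole = int(ntp_seconds)
    frac16 = int((ntp_seconds - whole) * 65536) & 0xFFFF
    return ((whole & 0xFFFF) << 16) | frac16


def ntp32_diff_seconds(later: int, earlier: int) -> float:
    """Elapsed seconds between two 32-bit stamps, tolerating one wrap.

    ASSUMPTION: the stamps are less than 65536 seconds apart.
    """
    return ((later - earlier) & 0xFFFFFFFF) / 65536.0
```

Subtracting the arrival times recorded in adjacent response data blocks with ntp32_diff_seconds() gives a per-hop delay estimate, provided the routers' clocks are synchronized.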
D-TTL specifies the number of routing hops this DREQ packet traveled from the downstream RSVP router to the current router.

M-flag indicates whether the reservation, as described by the objects below, is merged with reservations from other downstream interfaces when being forwarded upstream.

R-error indicates error conditions at a router. Currently defined values are:

0x00: no error
0x01: no reservation

K is the parameter defined in RSVP, and timer value is the local refresh timer value in seconds.

The remaining parts, the Tspec, filter spec, and flowspec objects, follow the definitions given in the RSVP specification. The latter two may be absent (see Section 4.1 on DREQ forwarding).

4. Diagnostic Packet Forwarding Rules

4.1 DREQ Forwarding

DREQ packets are forwarded hop-by-hop via unicast from the destination address to the source address as specified in the diagnostic packet header. Each hop carries out the following processing before forwarding the packet to the next hop towards the source:

- Compute the routing hop count from the previous RSVP hop. This is done by comparing the value of Send_TTL with the TTL value in the IP header. The result is then put into the D-TTL field of the response data.
- Attach its response data block to the end of the DREQ packet. The Tspec, filter spec, and flowspec objects in the data block describe the reservation in place at the Outgoing Interface for the specified session. If there is no reservation state, then the response data block will contain no filter spec or flowspec; it should still have the Tspec value for the specified sender that has been carried to the router by PATH messages.
- Increment hop-count by one.
- If the hop-count value is equal to that of M-TTL, or if the current hop is the sender as identified by the "Source Address" in the RSVP diagnostic header, go to Send_DREP(), and then return.
- If the resulting DREQ packet size exceeds a pre-defined MTU limit (minus some margin to hold the address list, see below), go to Send_DREP().
- Count the number of response data blocks, N, and increase the "Fragment offset" value in the RSVP common header by (24N + N x M). (Section 3.3 defines the first 24 bytes of the response data block; M is the combined length of the Tspec, filter spec, and flowspec objects.)
- Trim off all the response data blocks from the original DREQ packet, then forward it to the next hop towards the source.

Send_DREP():

1. If the H flag in the Diagnostic Packet Header is off,
   o Make a copy of the packet, change the packet type from DREQ to DREP, and send it to the response address given in the DREQ header.
   o Return.
2. Create a ROUTE Object (as defined in Section 3.2), Rnew. The R-pointer value is filled in by counting the number of response data blocks in the DREQ packet. The list of RSVP routers is filled in by taking the "Incoming Interface Address" from each of the response data blocks.
3. If the DREQ packet already contains a ROUTE Object, Rold, merge Rnew with Rold by adding the R-pointer in Rnew to that in Rold, and attaching the router list of Rnew to the end of that in Rold.
4. Insert the resulting ROUTE Object from Steps 2 and 3 between the Session object and the first response data block.
5. Make a copy of the resulting DREQ packet, change the packet type from DREQ to DREP, and send the packet to the last address in the ROUTE Object.
6. Return.

4.2 DREP Forwarding

When the H flag is off, DREP packets are sent directly to the original requester. When the H flag is on, however, they are forwarded hop-by-hop to the requester, by reversing the route as listed in the ROUTE Object.

When a router receives a DREP packet, it simply decreases R-pointer by one (address length), and forwards the packet to the address pointed to by R-pointer in the route list.

When the destination router receives a DREP packet, it sends the packet to the Response address.

4.3 Errors

If an error condition prevents a DREP packet from being forwarded further, the packet is simply dropped.
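As an illustration of the hop-by-hop return in Section 4.2 together with the drop rule above, the following Python sketch decides where a router sends a received DREP packet. It is not part of the specification. The function name forward_drep is hypothetical, and the sketch assumes that R-pointer is a 1-based count of list entries, so that a value of p refers to the p-th address in the route list; the draft leaves the exact unit of R-pointer open.

```python
from typing import List, Optional, Tuple


def forward_drep(route: List[str], r_pointer: int,
                 response_address: str) -> Optional[Tuple[str, int]]:
    """Return (next address, new R-pointer), or None to drop the packet.

    ASSUMPTION: R-pointer counts list entries starting from 1, so a
    value of p refers to route[p - 1]. route[0] is taken to be the
    destination (last-hop) router.
    """
    if r_pointer < 1 or r_pointer > len(route):
        # Cannot forward any further: the packet is simply dropped.
        return None
    r_pointer -= 1
    if r_pointer == 0:
        # This router is the destination router: hand the reply to the
        # Response address given in the diagnostic packet header.
        return response_address, 0
    return route[r_pointer - 1], r_pointer
```

With a three-entry route list and R-pointer 3, successive calls yield the second entry, then the first entry, and finally the Response address.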
If an error condition, such as lack of PATH state, prevents a DREQ packet from being forwarded further, the router must change the current packet to DREP type and return it to the response address.

5. Problem Diagnosis Using the RSVP Diagnostic Facility

5.1 Broken Intermediate Router

A broken (or legacy) intermediate RSVP router may simply not understand diagnostic packets, and drop them. The querier would then get no response at all to its requests. It may then choose to first do a multicast traceroute (in case of multicast) to get information about the route length, and then perform an RSVP diagnosis search by gradually increasing the value of the M-TTL field until it no longer receives a response.

5.2 Across Firewalls

Firewalls can cause problems in diagnostic packet forwarding. Let us look at two different cases.

First, let us assume that the querier is a receiving host of the session to be examined. In this case, firewalls should not prevent the forwarding of the diagnostic packets in a hop-by-hop manner, assuming that proper holes have been punched in the firewall to allow hop-by-hop forwarding of other RSVP packets. The querier may start by setting the H flag off, which can give faster response delivery and reduced overhead at intermediate routers. However, if no response is received, the querier may resend the DREQ packet with the H flag turned on.

Second, if the requester is a third-party host and is separated from the destination address by a firewall (either the requester is behind a firewall, or the destination is a router behind a firewall, or both), at this time we know of no solution other than attempting to use multicast.

To send a DREQ packet across a firewall (or firewalls), the request should be multicast to the group being examined (since the last-hop router listens to that group). All routers except the correct last-hop router, as identified by the destination address in the DREQ header, should ignore any DREQ request received via multicast.
To receive a DREP packet across a firewall (or firewalls), the querier should set the response address to a well-known multicast address allocated specifically for DREP packets. In this case, all the reply packets will first be unicast to the destination address, which in turn multicasts them out, with the TTL value specified in the XXXX field in the diagnostic header. This response TTL should be set to a value sufficient for the response from the destination router to reach the querier. However, we choose to physically limit this value to be no more than 15, because there is only one well-known multicast address for this purpose, and therefore all the queriers from all other sessions will receive the multicast DREP packets as well. If the querier still cannot receive the DREP packet when the TTL reaches the limit, then one must consider using a node closer to the destination instead.

5.3 Examination of RSVP Timers

One can easily collect information about the current timer value at each RSVP hop along the way. This will be very helpful in situations where the reservation state goes up and down frequently, to find out whether the state changes are due to improper setting of timer values, or K values (when across lossy links), or frequent routing changes.

5.3 Discovering Non-RSVP Clouds

The D-TTL field in each response data block shows the number of routing hops between adjacent RSVP routers. Therefore any value greater than one indicates a non-RSVP cloud in between. Together with the arrival timestamps (assuming NTP works), this value can also give a rough, though not necessarily accurate, indication of how big that cloud might be. One might also find out all the intermediate non-RSVP routers by running either unicast or multicast traceroute.
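As an illustration of the rule above, the following Python sketch scans the D-TTL values taken from the response data blocks of a reply and reports where a non-RSVP cloud lies. It is not part of the specification. The function name is hypothetical, and the input is assumed to be the list of D-TTL values in the order the blocks appear in the packet, that is, from the last-hop router towards the source.

```python
from typing import List, Tuple


def find_non_rsvp_clouds(d_ttls: List[int]) -> List[Tuple[int, int]]:
    """Return (block index, D-TTL) for every block whose D-TTL exceeds 1.

    A reported index i means that a non-RSVP cloud lies between the
    router that wrote block i and its downstream RSVP neighbor.
    """
    return [(i, hops) for i, hops in enumerate(d_ttls) if hops > 1]
```

For example, the D-TTL sequence 1, 1, 4, 1, 2 reports blocks 2 and 4, with D-TTL values 4 and 2 respectively.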
5.4 Discovering Reservation Merges

The flowspec value in a response data block specifies the amount of resources (whatever that means under the yet-to-be-defined flowspec) being reserved for the data stream defined by the filter spec in the same data block. When this value differs between adjacent response data blocks, that is, when a downstream router Rd has a smaller value than its immediate upstream router Ru, it indicates a merge of the reservation with RSVP request(s) from other downstream interface(s) at Rd. Further, in case of an SE style reservation, one can examine how the different SE scopes get merged at each hop.

In particular, if a receiver sends a DREQ packet before sending its own reservation, it can (1) discover how many RSVP hops there are along the path between the specified source and itself, (2) discover how many of the hops already have some reservation by other receivers, and (3) possibly foresee how its reservation request might get merged with other existing ones.

5.5 Error Diagnosis

In addition to examining the state of a working reservation, RSVP diagnostic packets are more likely to be invoked when things are not working, or not working correctly. For example, a receiver has reserved an adequate pipe for a specified incoming data stream, yet the observed delay or loss ratio is much higher than expected. In this case the receiver can use the diagnostic facility to examine the reservation state at each RSVP hop along the way to find out whether the RSVP state is set up correctly, whether there is any blackhole along the way, or whether and where there are non-RSVP clouds that may have caused the performance problem.

6. Acknowledgment

The idea of developing a diagnostic facility for RSVP was first proposed by Mark Handley of UCL. Many thanks to Lee Breslau of Xerox PARC and John Krawczyk of Bay Networks for their valuable comments on the first draft of this memo.
Authors' Addresses

Lixia Zhang
UCLA
4531G Boelter Hall
Los Angeles, CA 90024

Phone: 310-825-2695
EMail: lixia@cs.ucla.edu