Network Working Group                                     Naiming Shen
Internet Draft                                        Redback Networks
                                                          Steven Luong
June 2002                                                Cisco Systems
Expires: December 2002


                    ISIS IIH Sequence Number Scheme


1. Status of This Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

2. Abstract

   This draft describes an optional sequence number TLV inside the
   ISIS IIH packets. This sequence number TLV can be used for ISIS
   adjacency troubleshooting, especially where a large number of
   adjacencies are maintained and/or a low adjacency holddown time is
   used for the purpose of fast convergence.

3. Introduction

   IS-IS [1, 2] uses a hello protocol to establish and maintain
   neighbor adjacencies. It is important for the IS-IS protocol to
   send its IIH packets to its neighbors on time, and to receive and
   process the IIH packets from its neighbors on time. This document
   specifies an optional mechanism that includes a sequence number
   inside the IIH packet, both to help operators detect adjacency
   problems and to let the protocol itself use this information.

4. Motivation

   The IIH header contains the holding time for the adjacency. When
   the IS receives a new IIH from its neighbor, it updates the
   holddown time for this adjacency.
   It tears down the adjacency if it does not receive another IIH
   from the neighbor within this holddown time. Usually the sender
   sends multiple IIHs during the holddown period.

   There are many situations in which an adjacency can time out. Some
   of the important ones are:

   - a large number of adjacencies is configured on a router or on an
     interface, either in production or during lab scalability
     testing.

   - the IIH holddown time is configured to be low, especially for
     fast convergence.

   - there is congestion on the link, or in the interface outbound or
     inbound buffer.

   - router process scheduling slips due to system load or
     implementation.

   - there is network churn and massive IS-IS LSP flooding.

   When router B fails to receive a new IIH from neighbor router A,
   the problem can lie in several places. IS-IS on router A may not
   have had a chance to send the IIH on time; or router A did send
   the IIH within the holddown time, but it was dropped due to
   outbound congestion on router A; or it was dropped on router B's
   inbound side due to congestion; or IS-IS on router B did not have
   time to process the IIH. If there is a switch between A and B, the
   switch may also have dropped the IIH.

   A sequence number carried in the IIH packet gives the user some
   clues when there is an adjacency problem. When the sequence number
   is attached to the IIH, the receiver remembers the last sequence
   number received from the neighbor. If IIH packets are lost due to
   congestion, the receiver sees a skip in the sequence numbers of
   the IIHs that follow.
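   As a minimal sketch of how a receiver might notice such a skip, the
   following Python fragment remembers the last sequence number per
   adjacency and counts the IIHs that appear to be missing. The class
   and method names are hypothetical and are not part of this draft.
   Treating a sequence number that does not advance as a neighbor
   reset rather than a loss is an assumption, made because a reset or
   wrap by the neighbor can be a normal event.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class AdjacencySeqState:
    """Hypothetical per-adjacency record of the neighbor's IIH sequence numbers."""
    last_seq: Optional[int] = None
    # Each entry is (first missing sequence number, sequence number received).
    skips: List[Tuple[int, int]] = field(default_factory=list)

    def on_iih(self, seq: int) -> int:
        """Record a received sequence number and return how many IIHs were missed."""
        missed = 0
        if self.last_seq is not None and seq > self.last_seq + 1:
            missed = seq - self.last_seq - 1
            self.skips.append((self.last_seq + 1, seq))
        # Assumption: seq <= last_seq means the neighbor reset or wrapped
        # its counter, so it is not counted as a loss.
        self.last_seq = seq
        return missed


adj = AdjacencySeqState()
print([adj.on_iih(s) for s in (1, 2, 3, 6, 7, 1)])  # [0, 0, 0, 2, 0, 0]
print(adj.skips)                                    # [(4, 6)]
```

   In this run, IIHs 4 and 5 never arrive, so the receiver records one
   skip of two packets; the final sequence number 1 is taken as a
   reset and is not reported as a loss.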
   On point-to-multipoint media, if only one receiving router sees
   skipped IIH sequence numbers, that router probably has inbound
   congestion; if all the routers see skipped sequence numbers from
   the same neighbor, the sender's outbound side is probably
   congested; if only one receiver times out the neighbor without
   seeing a sequence number skip, that receiver probably has a
   scheduling problem; and if all the receivers time out the same
   neighbor with no sequence number skip, the sender was probably too
   busy and missed sending IIHs on time. Even in the point-to-point
   case, the IIH sequence number gives some clue as to whether IIHs
   are being lost over the link.

5. IS-IS IIH sequence number scheme

   An optional sequence number TLV can be included in the IIHs. It is
   a 32-bit unsigned number that starts from 1. The IIH sequence
   number is maintained per IS-IS interface and per level. A LAN
   interface can have two sets of sequence numbers if the circuit
   belongs to both level-1 and level-2. A point-to-point interface
   needs only one sequence number. The sequence number is increased
   by one for every new IIH sent for the level over the interface. If
   the last adjacency on the interface is removed, the sequence
   number can be reset.

   On the receiving side, the sequence number from the neighbor's IIH
   is recorded for each adjacency. A sequence number history can be
   maintained for the adjacency. A sequence number skip needs to be
   recorded, possibly along with the time it occurred. A sequence
   number reset by the neighbor should be considered normal, because
   it can happen when the sequence number wraps or when the neighbor
   performs a hitless restart.

6. Packet Encoding of IIH Sequence Number TLV

   The TLV number of the IIH Sequence Number TLV is 241. It is used
   only in IS-IS IIH packets and should be ignored otherwise.
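   The TLV described in this section might be encoded and decoded as
   in the following Python sketch. It assumes the usual IS-IS TLV
   layout of a one-octet code followed by a one-octet length, and it
   assumes the 32-bit value is carried in network byte order, which
   this draft does not state explicitly. The function and constant
   names are hypothetical. Refusing to send 0xFFFFFFFF is also an
   assumption, based only on that value being reserved.

```python
import struct

IIH_SEQ_TLV_CODE = 241          # TLV number given in this draft
IIH_SEQ_TLV_LEN = 4             # value length in octets
IIH_SEQ_RESERVED = 0xFFFFFFFF   # reserved value, not used as a sequence number here


def encode_iih_seq_tlv(seq: int) -> bytes:
    """Build code, length and a 32-bit big-endian value (byte order assumed)."""
    if not 0 <= seq < IIH_SEQ_RESERVED:
        raise ValueError("sequence number out of range or reserved")
    return struct.pack("!BBI", IIH_SEQ_TLV_CODE, IIH_SEQ_TLV_LEN, seq)


def decode_iih_seq_tlv(data: bytes) -> int:
    """Parse one IIH Sequence Number TLV from the start of data."""
    if len(data) < 2 + IIH_SEQ_TLV_LEN:
        raise ValueError("truncated TLV")
    code, length, seq = struct.unpack("!BBI", data[:2 + IIH_SEQ_TLV_LEN])
    if code != IIH_SEQ_TLV_CODE or length != IIH_SEQ_TLV_LEN:
        raise ValueError("not an IIH Sequence Number TLV")
    return seq


tlv = encode_iih_seq_tlv(1)
print(tlv.hex())                # f10400000001
print(decode_iih_seq_tlv(tlv))  # 1
```

   The decoder returns the raw value and leaves any interpretation of
   the reserved value to the caller.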
   x code - 241

   x length - 4 octets

   x value - 32-bit unsigned number; 0xFFFFFFFF is currently reserved
     and may have a special meaning

   x name - IS-IS IIH Sequence Number TLV

7. Interaction with TLVs using PDU data to compute signature

   The IIH Sequence Number TLV can appear anywhere in the IIH before
   the padding. An implementation that supports the optional checksum
   [3] or HMAC-MD5 authentication [4] must include the sequence
   number TLV in the computation.

8. Remarks

   The sequenced IIH is mainly used for adjacency problem detection
   and troubleshooting. It gives implementors and operators a tool to
   find problems such as switch congestion, link congestion,
   buffering issues, protocol packet prioritization issues, and
   process scheduling issues. It is especially useful with a large
   number of adjacencies and with a low holddown time for fast
   convergence.

   The IS-IS protocol may also make use of this sequence number
   scheme. For example, if a receiving router detects a sequence
   number skip on the link, it can assume that a skip may happen
   again within a short period of time. The receiving router also
   knows that the neighbor is still alive, because it has just
   received an IIH from it (even though it missed the previous one).
   The receiving router can therefore optionally increase the
   "holddown" time for this particular adjacency. If the receiving
   router later receives good IIHs continuously, it can restore the
   "holddown" time to the normal one. This readjustment of the
   holddown time can sometimes prevent a premature adjacency flap due
   to the temporary conditions mentioned above.

9. Security Considerations

   This document introduces no new security concerns to IS-IS or
   other specifications referenced in this document.

10. Acknowledgments

   The authors would like to thank Jun Zhuang, Albert Tian and Tony
   Przygienda for useful discussions on this subject.

11. References

   [1] ISO 10589, "Intermediate System to Intermediate System
       Intra-Domain Routeing Exchange Protocol for use in Conjunction
       with the Protocol for Providing the Connectionless-mode
       Network Service (ISO 8473)" [Also republished as RFC 1142]

   [2] RFC 1195, "Use of OSI IS-IS for routing in TCP/IP and dual
       environments", R.W. Callon, December 1990

   [3] Internet Draft, Optional Checksums in ISIS, work-in-progress,
       Internet Engineering Task Force, T. Przygienda, Apr 2002

   [4] Internet Draft, IS-IS Cryptographic Authentication,
       work-in-progress, Internet Engineering Task Force, T. Li,
       R. Atkinson, July 2001

12. Authors' addresses

   Naiming Shen
   Redback Networks
   350 Holger Way
   San Jose, CA, 95134 USA
   naiming@redback.com

   Steven Luong
   Cisco Systems, Inc.
   170 West Tasman Drive
   San Jose, CA 95134-1706 USA
   sluong@cisco.com
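
   The optional holddown readjustment described in the Remarks
   section can be sketched as follows. This is only an illustration
   under stated assumptions: the draft does not say by how much the
   holddown time should be increased or how many good IIHs should be
   seen before it is restored, so the factor of 2 and the count of 3
   below are hypothetical defaults, as are all names in the code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HolddownAdjuster:
    """Hypothetical per-adjacency helper that stretches holddown after a skip."""
    normal_holddown: float          # seconds, the normal holddown for the adjacency
    extend_factor: float = 2.0      # assumption: multiplier applied after a skip
    good_iihs_to_restore: int = 3   # assumption: in-sequence IIHs before restoring
    last_seq: Optional[int] = None
    good_run: int = 0
    extended: bool = False

    def on_iih(self, seq: int) -> float:
        """Return the holddown time to apply after receiving this IIH."""
        skipped = self.last_seq is not None and seq > self.last_seq + 1
        self.last_seq = seq
        if skipped:
            # A skip was seen, but the neighbor is clearly alive: extend.
            self.extended = True
            self.good_run = 0
        elif self.extended:
            # Count consecutive IIHs without a skip, then restore.
            self.good_run += 1
            if self.good_run >= self.good_iihs_to_restore:
                self.extended = False
        factor = self.extend_factor if self.extended else 1.0
        return self.normal_holddown * factor


adj = HolddownAdjuster(normal_holddown=3.0)
print([adj.on_iih(s) for s in (1, 2, 4, 5, 6, 7)])
# [3.0, 3.0, 6.0, 6.0, 6.0, 3.0]
```

   Here the IIH with sequence number 3 is lost, so the holddown time
   is doubled when number 4 arrives and returns to normal after three
   further IIHs arrive without a skip.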