draft-kilsdonk-router-upgrade-01.txt
INTERNET-DRAFT                                      expires 9/8/2003

Network Working Group                                    D. Kilsdonk
Internet-Draft:                                  Lucent Technologies
                                                      August 9, 2002

                Minimally Disruptive Router Upgrades

Status of this Memo

   This document is an Internet-Draft and is subject to all provisions
   of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This document specifies a procedure for the Internet Community, and
   requests discussion and suggestions for improvements.  Distribution
   of this memo is unlimited.

Abstract

   This memo documents a process which can be used to avoid network
   outages during the upgrading of a router's software.  In this way,
   disruptions to the network community are avoided.  The upgrade
   process definition comprises requirements for inserting a
   replacement router or upgrading the software in a working router
   within an operational network.  To do this, it maximizes the use of
   accepted internet standards with minimal alteration.

RFC DONK       Minimally Disruptive Router Upgrades     November 2001

Table of Contents

   1.0 INTRODUCTION
   2.0 MINIMUM REQUIREMENTS
   2.1 REDUNDANT CONFIGURATION
   2.2 AWARENESS OF REDUNDANT CONFIGURATION
   2.3 AWARENESS OF STATE OF REDUNDANT COUNTERPART
   3.0 REDUNDANT BOOTUP
   4.0 TIME SYNCHRONIZED CASE
   4.1 REDUNDANT OPERATIONS
   4.1.1 OPERATOR INITIATED CONFIGURATION CHANGES
   4.1.2 BACKPLANE INITIATED CONFIGURATION CHANGES
   4.1.2.1 OPERATIONAL STATION DISTINCT PROCESSING
   4.1.2.2 STANDBY STATION DISTINCT PROCESSING
   4.1.2.2.1 STANDBY MAINTENANCE OF TCP REDUNDANCY
   5.0 UNSYNCHRONIZED CASE
   5.1 STANDBY DATA SYNCHRONIZATION
   5.2 STANDBY TIMING SYNCHRONIZATION
   6.0 ALTERNATIVES
   7.0 CRYPTOGRAPHY
   8.0 REFERENCES
   9.0 OTHER BENEFITS
   9.0 DEFINITIONS OF TERMS
   10.0 AUTHOR'S ADDRESS
   APPENDIX A: GLOSSARY OF ACRONYMS

1. INTRODUCTION

   Disruption in the network community arises whenever a router or
   endstation crashes or is brought down for maintenance, usually to
   perform a software upgrade.
   In many cases, this disruption propagates through the network and
   causes other disturbances.  To minimize these disruptions, this
   memo presents a practical operational framework built on a
   generally accepted model of a simple, generic router and on maximal
   use of existing protocol specifications with minimal change.

2. MINIMUM REQUIREMENTS

   This approach uses redundancy to support ongoing operations while
   critical equipment is upgraded or replaced.  To facilitate this,
   some minimum functionality must be present beyond having two
   like-designed pieces of equipment.  To the extent these
   requirements are met, the switchover will be seamless.  Where they
   are only partly met, the switchover disruption is reduced in
   proportion to the degree to which they have been met.

2.1 REDUNDANT CONFIGURATION

   To support switchover and redundancy in a meaningful sense, the two
   pieces of equipment must share physical connectivity with all
   neighbors and must have the same IP and machine addresses for all
   general traffic ports.  This is very much the definition of a
   redundant configuration.  Each station also has a distinct
   management IP address of which the other is made aware (in-band
   management).  One station is designated "operational," and its
   redundant counterpart is henceforth in "standby" mode.  (Where
   redundant addressing cannot be achieved at the machine layer, the
   standby station must be put in promiscuous mode and, after
   switchover, the standby must transmit an unsolicited ARP response
   correlating the extant IP address with the new machine address for
   the benefit of all listening peers.)

   The degree to which both stations exhibit truly redundant
   operations can be gauged by how well they share current
   information.  This approach exploits the information available on
   the network, in lieu of shared memory, to achieve that sharing.

2.2 AWARENESS OF REDUNDANT CONFIGURATION

   The operational and standby equipment must support two indicators.
   One indicates that a redundant configuration exists (i.e., a
   redundant counterpart is present), while the other indicates that
   the station is functioning properly.  For example, the existence of
   a backup station may be indicated through the human interface,
   whether by a command issued at the command-line interface (CLI) or
   by an SNMP setting, or more simply by a bit set to "one" by the
   station (I_AM_HERE).  In the latter case, the bit must be readable
   by the station's redundant counterpart.

2.3 AWARENESS OF STATE OF REDUNDANT COUNTERPART

   A generic router typically has an operational-signal (LED) which
   indicates that its software is actively performing the routing
   function.  The router's software turns this signal off during
   exception handling such as a software crash, thereby indicating a
   software malfunction (possibly made manifest by a hardware fault).
   An operator-initiated deactivation also resets this
   operational-signal.  This indicator must be readable by the
   station's redundant counterpart.

3. REDUNDANT BOOTUP

   As each station boots, if it detects the presence of a redundant
   counterpart, it immediately signals that it is operationally
   unready.  It then waits until it detects the same signal from its
   counterpart.  Once it does, it signals readiness and waits for its
   counterpart to signal readiness as well, until either readiness is
   detected or a timeout value expires.  This timeout value may be as
   great as the default TCP timeout for the station, but for practical
   purposes it should be set to two minutes.

   If this step times out, the station is marked as "standby" and
   "unsynchronized" but may otherwise proceed in its bootup.

   If this step does not time out, each station is time-synchronized
   with the other and may proceed in its bootup in much the same way
   it would without redundancy concerns.  The station with the lower
   management IP address is marked as "operational".
   The station with the higher management IP address is marked as
   "standby."

4. TIME SYNCHRONIZED CASE

4.1 REDUNDANT OPERATIONS

   Operations at the station generally consist of two parts.  The
   first is operator-initiated configuration changes, such as adding a
   static route.  To maintain pure redundancy, it is imperative that
   this knowledge be instantiated at both stations at the same time.
   The second part comprises configuration changes, such as routing
   table updates, which originate from external sources (peer routers)
   connected on the general switching ports (backplane redundancy).

4.1.1 OPERATOR INITIATED CONFIGURATION CHANGES

   Configuration changes and commands issued at the station must be
   conveyed to the other station.  For a command line interface, this
   is done simply by establishing a telnet session to the redundant
   counterpart's management IP address and echoing all commands as
   they are entered.  The responses to these commands may be ignored.

   Configuration changes and commands issued directly to the station
   via SNMP may be conveyed by copying each SNMP packet upon its
   reception by the station and sending the copy to the redundant
   counterpart (via its distinct management IP address).

4.1.2 BACKPLANE INITIATED CONFIGURATION CHANGES

   The processing of backplane-initiated configuration changes is
   predicated on the operational-signal of the redundancy counterpart.
   Each time a packet is sent, this bit is read to determine whether
   the counterpart is functioning as the operational station.  If this
   station's redundancy counterpart is not the "standby" station, it
   may be safely assumed that this station is the "standby".

4.1.2.1 OPERATIONAL-STATION DISTINCT PROCESSING

   The operational station operates in the same manner as in the
   non-redundant configuration case.
   In this way, the only modification to the operational software is
   to detect, with each packet to be transmitted, whether or not it is
   in the operational mode.  Its software is minimally altered, and
   therefore its inherent reliability is not diminished by the
   introduction of redundancy functionality.

4.1.2.2 STANDBY-STATION DISTINCT PROCESSING

   The standby station executes the same software as the operational
   station.  However, at layer one of network handling, it does not
   transmit its packet but does return a satisfactory status
   indicating that the packet has been transmitted.  The net effect is
   the same, inasmuch as the operational station DID send the packet.
   Both stations may now await an acknowledgment (in TCP parlance).

4.1.2.2.1 STANDBY MAINTENANCE OF TCP REDUNDANCY

   The station may be seen as an Input/Output (I/O) engine whose
   behavior is marked only by packet timing, so we may work at that
   timing granularity.  Initial timing synchronization was achieved in
   paragraph 3.  Packets sent from backplane stations will tend to
   re-calibrate the two stations to the same packet timing.  Finally,
   the TCP timeout window is usually very large, which further
   supports our ability to keep the stations' configurations redundant
   with a very high level of confidence.

   Since the timing granularity is held only to within one packet
   time, it is possible for an acknowledgment to be seen at the
   standby before the sequence to be acknowledged has been "sent."
   Therefore, the TCP-input algorithm (but not its interface) must be
   modified so that, if this condition is detected, the acknowledgment
   packet is buffered and recycled into the input stream after TCP
   outputs the sequence that the acknowledgment covers.

   Now, should the operational station become defective, its exception
   handler will mark it as "standby".  Since its redundant counterpart
   checks this indicator every time it sends a backplane packet,
   switchover of operations to what was the standby can proceed within
   one packet time.
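
   The per-packet check and the suppressed transmit described above
   can be sketched as follows.  This is a minimal illustration only:
   the Station class, the counterpart_is_operational hook, and the
   list standing in for the backplane link are all assumptions made
   for the example and are not defined by this memo.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Station:
    """Hypothetical model of one station in a redundant pair."""

    name: str
    # Hypothetical hook that reads the counterpart's operational-signal.
    counterpart_is_operational: Callable[[], bool]
    # Stands in for the physical backplane link.
    wire: List[bytes] = field(default_factory=list)

    def transmit(self, packet: bytes) -> bool:
        """Layer-one send: suppress output while the counterpart is operational."""
        if not self.counterpart_is_operational():
            # The counterpart has marked itself "standby" (or crashed), so
            # this station is the one that actually puts the packet on the link.
            self.wire.append(packet)
        # Upper layers see success either way, so protocol state advances
        # identically on both stations.
        return True


# Shared flag standing in for the operational station's indicator.
signal = {"a_operational": True}
standby = Station("B", lambda: signal["a_operational"])

standby.transmit(b"update-1")      # suppressed: station A is operational
signal["a_operational"] = False    # A's exception handler marks it "standby"
standby.transmit(b"update-2")      # station B now transmits
```

   Because the indicator is read on every send, the first packet after
   the flag changes is the first one the former standby places on the
   link, which is the "within one packet time" behavior.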
   Protocols that run on TCP, such as BGP, will continue to operate.

   The crashed or operator-terminated station may now have its
   software upgraded.  When it reboots, it will attempt the
   synchronization steps of paragraph 3 and will time out (since the
   other station is continuing in operational mode), causing the newly
   introduced station to be marked as "unsynchronized."  As an
   unsynchronized standby processor, its transmissions are never
   allowed onto the backplane connections.

5. UNSYNCHRONIZED CASE

   In the unsynchronized case, the station has failed to complete the
   steps of paragraph 3.  It can be assumed that the failure occurred
   because the redundancy counterpart of this station was already
   operational and therefore did not collaborate in the time
   synchronization and testing (toggling) of its
   operational-indicator.  This is the case of a station joining the
   redundancy configuration late and is therefore regarded as the
   "late-joiner" case.

   The same requirements apply to the late-joiner.  The operational
   station can detect the presence of the late-joiner through the
   means of paragraph 2.3.  What remains is for the standby station to
   achieve data and timing synchronization.  These steps may be
   processor intensive and should therefore be scheduled for a time at
   which critical network operations are expected to be at a minimum.
   In this way, configuration changes during the steps of time and
   data synchronization are also minimized.

5.1 STANDBY DATA SYNCHRONIZATION

   The standby continues its bootup and then issues an SNMP MIB WALK
   (GET), starting at object number one, to the operational station's
   management IP address.  This will return all objects comprising the
   operational station's MIB definition.  On reception, the standby
   must convert the GET_RESULT to a SET command and issue it to
   itself.

   Typically, routers do not issue SNMP-GETs to other routers.
   Nothing in the SNMP RFC precludes this, however.  The operational
   station may assume that an SNMP-MIB WALK ALL from the standby
   station indicates that a late-joiner synchronization is in process.

   Also, some SNMP MIB objects are read-only.  This prevents operators
   from setting data that reflect items not under operator control,
   such as the system-uptime of the router.  While this is a noble end
   in itself, the standby may trust that the operational station
   reports this data faithfully.  The data at the operational station
   therefore has the same integrity as if the standby had obtained it
   directly from its source.  Consequently, during data
   synchronization by the standby, SNMP MIB objects that are read-only
   may be written in the standby.  This is a manageable, very specific
   software change.  The result for system-uptime, for example, is
   that the uptime reflects the uptime of the system as opposed to
   that of the individual station.

   Finally, files used to contain configuration information may now be
   obtained by the standby station by issuing appropriate FTP commands
   targeted to the operational station's management IP address.  All
   that remains for complete redundancy at this point is achieving
   timing synchronization of the backplane.

5.2 STANDBY TIMING SYNCHRONIZATION

   Once data synchronization is achieved, the two stations
   re-establish telnet connectivity for in-band management redundancy.
   The first command sent initiates timing synchronization.  In this
   step, both stations state that timing synchronization does not
   exist.  Therefore, both are considered operationally unready.
   Since neither is operational during timing resynchronization,
   neither transmits packets to the backplane.  Furthermore, TCP
   sessions are brought down and re-established so that initial
   sequence numbers may be established consistently in both stations.
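
   As a minimal sketch of how both stations could arrive at the same
   initial sequence number without exchanging state, assume the number
   is derived purely from the peer's IPv4 address.  The function name
   static_isn and the choice of using the address's integer value
   directly are illustrative assumptions, not something this memo
   specifies; the address shown is a documentation address.

```python
import ipaddress


def static_isn(peer_ip: str) -> int:
    """Map a peer's IPv4 address to a fixed 32-bit initial sequence number.

    Hypothetical mapping: the integer value of the address is used
    directly, reduced to the 32-bit sequence-number space.
    """
    return int(ipaddress.IPv4Address(peer_ip)) % 2**32


# Both stations evaluate the same function on the same peer address,
# so they agree on the number independently of each other.
operational_isn = static_isn("192.0.2.1")
standby_isn = static_isn("192.0.2.1")
```

   Any deterministic function of the peer address would serve the same
   purpose, provided both stations run identical code.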

   Initial TCP sequence numbers can be unrandomized and statically
   based on the IP address of the router peer to which connectivity is
   sought.  This is because peer routers are typically in the same IP
   domain, usually point-to-point connected, and seldom more than one
   hop away.  Because no routing is needed between them, no routing
   loops and no transient packets wandering the network are expected,
   especially during a catastrophic failure, and thus the necessity
   for randomizing the initial sequence numbers is obviated.

   Alternatively, during the SNMP-GET-ALL step of data synchronization
   (paragraph 5.1), should the MIB for TCP contain current sequence
   numbers, backplane activity can be suspended until the operational
   station performs an SNMP-GET of the standby's TCP sequence numbers
   to ensure that both are coherent before allowing network activity
   on the backplane to proceed.

6. ALTERNATIVES TO THIS APPROACH

   To preclude outages due to a hardware-oriented catastrophic event,
   designers have produced dual-station systems that stay synchronized
   by using a common clock and shared memory.  This is a very
   expensive solution which does not address problems due to a
   hardware malfunction of the clock.  If the hardware problem
   manifests as a memory cell or semaphore "stuck-high" in the shared
   memory, it becomes a problem for the standby almost immediately.
   Finally, approaches of this kind require timing synchronization in
   the megahertz range, when the inputs and outputs of the system need
   only be correct at packet-time granularity.

6. OTHER BENEFITS

   Since the investment in redundancy is, one hopes, never needed
   owing to inherently reliable hardware and software, the standby
   station may never be used in a theoretical 100% reliability case.
   Therefore, other uses can be switched in, such as backplane
   debugging, wherein the standby is used as a sniffer to
   promiscuously record and display packets appearing on backplane
   links.
   This can be a valuable asset in tuning and debugging the network.
   If there is enough station throughput, the sniffing can be done
   simultaneously with the functionality already presented.

7. CRYPTOGRAPHY

   As cryptography is imposed by the station operator, so too will it
   be invoked on the redundant station by way of the telnet echo of
   the same commands.  Specialized cryptography may also be created to
   protect this echoing and the echoing of SNMP management requests to
   the management port.

8. REFERENCES

   o  Structure and Identification of Management Information for
      TCP/IP-based Internets, RFC 1155, M. Rose, K. McCloghrie

   o  Management Information Base for Network Management of
      TCP/IP-based Internets, RFC 1156, K. McCloghrie, M. Rose

   o  SNMPv2 Management Information, RFC 2013, K. McCloghrie

   o  Simple Network Management Protocol (SNMP), RFC 1157, J. Case,
      M. Fedor, M. Schoffstall, C. Davin

   o  Management Information Base for Network Management of
      TCP/IP-based Internets: MIB-II, RFC 1158, M. Rose

   o  Transmission Control Protocol, RFC 793, J. Postel

   o  Requirements for Internet Hosts--Communication Layers, RFC 1122,
      R. Braden, ed.

   o  TCP Extensions for High Performance, RFC 1323, V. Jacobson,
      R. Braden, D. Borman

   o  Internet Protocol, RFC 791, J. Postel

   o  Ethernet Address Resolution Protocol, RFC 826, D. Plummer

9. DEFINITIONS OF TERMS

   Internet address:  A 32-bit address assigned to hosts using TCP/IP.

   Operational station:  Of two completely redundant stations, the
      station actually transmitting packets.

   Protocol:  A formal description of messages to be exchanged and
      rules to be followed for two or more systems to exchange
      information.

   Router:  A system responsible for making decisions about which of
      several paths network (or Internet) traffic will follow.  To do
      this it uses a routing protocol to gain information about the
      network, and algorithms to choose the best route based on
      several criteria known as "routing metrics."
      In OSI terminology, a router is a Network Layer intermediate
      system.

   Standby station:  Of two completely redundant stations, the station
      which behaves as though it is transmitting packets but whose
      output is suppressed at the network driver layer.  It relies on
      the operational station's transmission.

   Telnet:  The virtual terminal protocol in the Internet suite of
      protocols.  Allows users of one host to log into a remote host
      and interact as normal terminal users of that host.

   Unsynchronized station:  A station unable to complete time
      synchronization with its redundant counterpart.

10. AUTHOR'S ADDRESS

   Daniel D. Kilsdonk
   65 Lake Shore Drive North
   Westford, MA 01886

   Phone: (978) 692-3383
   EMail: dan@prospeed.net

APPENDIX A: GLOSSARY OF ACRONYMS

   FTP:  File Transfer Protocol.  The Internet protocol (and program)
      used to transfer files between hosts.  See FTAM.

   IP:  Internet Protocol.  The network layer protocol for the
      Internet protocol suite.

   IS-IS:  Intermediate system to Intermediate system protocol.  The
      OSI protocol by which intermediate systems exchange routing
      information.

   MIB:  Management Information Base.  A collection of objects that
      can be accessed via a network management protocol.

   OSPF:  Open Shortest Path First.  A "Proposed Standard" IGP for the
      Internet.  See IGP.

   SNMP:  Simple Network Management Protocol.  The network management
      protocol of choice for TCP/IP-based internets.

   TCP:  Transmission Control Protocol.  The major transport protocol
      in the Internet suite of protocols providing reliable,
      connection-oriented, full-duplex streams.  Uses IP for delivery.
      See TP4.

INTERNET-DRAFT                                    expires 12/14/2002