INTERNET-DRAFT                                             A.Palanivelan
Category: Historic                                         Cisco Systems
Expires: April 20, 2011                                     Oct 22, 2010


     Bidirectional Forwarding Detection (BFD) with Graceful Restart
                   draft-palanivelan-bfd-v2-gr-08.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

A.Palanivelan             Expires Apr 20, 2011                  [Page 1]

INTERNET DRAFT     draft-palanivelan-bfd-v2-gr-08.txt       Oct 22, 2010

Abstract

   This document proposes an extension to Bidirectional Forwarding
   Detection (BFD) to support Graceful Restart, complementing the
   Graceful Restart support of the underlying protocol. The extension
   is intended to work consistently, irrespective of the BFD mode, the
   protocol, or the type of restart and, most importantly, the vendor's
   design and implementation.

   This document describes in detail the challenges to BFD in surviving
   a graceful restart, and a generic solution to such challenges.

Table of Contents

   1  Introduction
   2  Overview
   3  Motivations
      3.1  Planned Restarts with control protocols
      3.2  BFD Co-existing with Broadband configurations
   4. Extensions to BFD
      4.1  Version (Vers)
      4.2  Diagnostic (Diag)
      4.3  State (Sta)
      4.4  My Restart Interval
      4.5  Your Restart Interval
   5. State Machine for BFD with GR Support
   6. Theory of Operation
      6.1. Session Establishment and GR Timer exchange
      6.2. Remote Neighbor Restart and Recovery
   7  Security Considerations
   8  IANA Considerations
   9  References
      9.1  Normative References
      9.2  Informative References
   10 Author's Addresses

1 Introduction

   The Bidirectional Forwarding Detection protocol [BFD] provides a
   mechanism for liveness detection of arbitrary paths between systems.
   It is intended to provide low-overhead, short-duration detection of
   failures in the path between adjacent forwarding engines, including
   the interfaces, data link(s), and to the extent possible the
   forwarding engines themselves. It operates independently of media,
   data protocols, and routing protocols.

   An additional BFD goal is to provide a single mechanism that can be
   used for liveness detection over any media, at any protocol layer,
   with a wide range of detection times and overhead, to avoid a
   proliferation of different methods.

   Although Graceful Restart (GR) was considered for BFD, it was not
   included in the BFD standard, so as to keep the protocol simple.
   This document records the Graceful Restart discussions and presents
   them as an extension to BFD.

   The extensions to BFD introduced in this draft would aid BFD in
   complementing the GR capabilities of protocols such as OSPF, and
   also in providing consistent behavior for planned and unplanned
   restarts irrespective of the underlying protocols. Further, these
   extensions would provide a solution that works for all types of BFD
   implementation.

2 Overview

   The Bidirectional Forwarding Detection [BFD] specification defines a
   protocol with simple and specific semantics. Its sole purpose is to
   verify liveness of the connectivity between a pair of systems, for a
   particular data protocol across a path (which may be of any
   technology, length, or OSI layer). The promptness of the detection
   of a path failure can be controlled by trading off protocol overhead
   and system load with detection times.

   BFD is generally assumed to work well without any need for GR
   support.
   However, BFD deployments show that the different types of
   implementations in products, and their inherent mechanisms, lead to
   issues with BFD, especially in surviving GR. It is true that
   prioritizing BFD would make sure that other CPU-intensive processes
   do not cause BFD to fail, but this may not be possible, as there may
   be other higher-priority processing that cannot be ignored. For
   example, existing subscriber connections may not be able to take a
   lower priority.

   The extensions introduced in this draft would aid BFD in
   complementing the GR capabilities of protocols such as OSPF and also
   provide consistent behavior for planned and unplanned restarts for
   the underlying protocols.

3 Motivations

   Though the existing documents discuss BFD interactions with various
   applications that use Graceful Restart, and ways of implementing BFD
   so that GR succeeds, the RFCs apply some exceptions and caveats.
   This draft discusses the issues in the scenarios listed below and
   provides a generic solution that would scale to future applications
   and work for all types of BFD implementations.

   This section of the document discusses the following issues, which
   are commonly seen in BFD deployments:

      * Planned restart with control protocols

      * BFD co-existing with Broadband configurations

   This document provides a general Graceful Restart mechanism for BFD.

3.1 Planned Restarts with control protocols

   The BFD RFCs suggest administratively disabling BFD prior to the
   start of GR. However, this works only for planned restarts, not for
   unplanned restarts. It also does not work for a protocol such as
   IS-IS that cannot signal a planned restart.

   For a planned restart where a control protocol can signal before
   restarting, the existing document(s) recommend that, if a BFD
   session failure occurs during the restart, the planned restart
   should not be aborted and the session failure should not result in a
   topology change being signaled in the control protocol.

   Control protocols that cannot signal a planned restart depend on the
   recently restarted system to signal the Graceful Restart prior to
   the control protocol adjacency timeout. In most cases, whether the
   restart is planned or unplanned, it is likely that the BFD session
   will time out prior to the onset of Graceful Restart, causing a
   topology change to be signaled.

   This behavior could impact non-stop routing and non-stop forwarding
   support using GR-enabled protocols, and provides an opportunity to
   review the existing BFD implementations and propose improvements to
   them.

3.2 BFD Co-existing with Broadband configurations

   In a real-world deployment with Broadband configurations, it is
   highly likely that the BFD sessions do not survive a Graceful
   Restart.

   Assume a PE router that has active DHCP sessions with a large number
   of clients (say 16k). During a planned restart, it is likely that
   the DHCP clients ask the server (the restarting router) to renew
   their IP addresses. While the router is restarting, these requests
   do not reach it. But when these requests reach the router just after
   it has come up, it will treat them at a high priority and respond to
   them. With thousands of such requests arriving at the restarting
   router, the router will spend a major part of its first second of
   uptime addressing them.

   In this scenario, a control protocol like OSPFv2 that has GR enabled
   [OSPF-GRACE] could withstand the restart for the specified restart
   interval (which is measured in seconds) and is likely to survive the
   restart, maintaining its forwarding plane.

   In the same scenario, if BFD is enabled for OSPFv2 and the restart
   is unplanned, the (BFD) neighbor router will be expecting BFD
   control packets at millisecond intervals; during the restart process
   the session is likely to time out, also impacting the associated
   OSPFv2 adjacency and resulting in a significant loss of traffic.

   The scenario will be the same for BFD with a protocol such as IS-IS
   [IS-IS-GRACE], where the problem is likely to be seen even for a
   planned restart.

4. Extensions to BFD

   This document introduces a new diag value to indicate that the
   neighbor is restarting, and provisions to configure Graceful Restart
   timers.

   The Generic BFD Control Packet Format shown below introduces two
   additional fields, "My Restart Interval" and "Your Restart
   Interval".

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Vers |  Diag   |Sta|P|F|C|A|D|M|  Detect Mult  |    Length     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       My Discriminator                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Your Discriminator                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Desired Min TX Interval                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Required Min RX Interval                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Required Min Echo RX Interval                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      My Restart Interval                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Your Restart Interval                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.1 Version (Vers)

   The version of BFD defined by this draft, which supports GR
   configuration and a diag value for the neighbor restarting state,
   has a value of 2. The version number clearly indicates to the BFD
   peers with which this system intends to form a session that this
   system has BFD GR capability.

   The implementation as defined by this document is, by design,
   expected to be compatible with the existing BFD implementations
   (versions 0 and 1).

4.2 Diagnostic (Diag)

   A diagnostic code specifying the local system's reason for the last
   change in session state. A new diag value, 9, for "Neighbor
   Restarting" is introduced in this document.
   Values are:

      0 -- No Diagnostic
      1 -- Control Detection Time Expired
      2 -- Echo Function Failed
      3 -- Neighbor Signaled Session Down
      4 -- Forwarding Plane Reset
      5 -- Path Down
      6 -- Concatenated Path Down
      7 -- Administratively Down
      8 -- Reverse Concatenated Path Down
      9 -- Neighbor Restarting
      10-31 -- Reserved for future use

   This field allows remote systems to determine the reason that the
   previous session failed.

4.3 State (Sta)

   The BFD session state as seen by the transmitting system. This
   document introduces a new BFD session state, Neigh_Restart. Values
   are:

      0 -- AdminDown
      1 -- Down
      2 -- Init
      3 -- Up
      4 -- Neigh_Restart

4.4 My Restart Interval

   This is the restart interval, in microseconds, that the transmitting
   system advertises to the remote system. If the transmitting system
   restarts, the remote system is expected to keep the BFD session up
   for this duration. This field needs to have a value greater than the
   detection time. A value of 0 indicates to the remote system that
   this system has BFD GR disabled.

4.5 Your Restart Interval

   The restart interval, in microseconds, received from the
   corresponding remote system. If the remote system restarts, the
   transmitting system is expected to keep the BFD session up for this
   duration. This field needs to have a value greater than the
   detection time.

5. State Machine for BFD with GR Support

   The BFD state machine is quite straightforward and is explained in
   detail by [BFD]. The BFD RFC describes the following states for BFD:
   Down, Init, Up, and AdminDown. Each system communicates its session
   state in the State (Sta) field of the BFD Control packet, and that
   received state, in combination with the local session state, drives
   the state machine. Please refer to [BFD] for the state machine
   diagram and detailed explanations of the state transitions.
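
   The field layout of Section 4 can be sketched in code. The Python
   below is a minimal illustration, not a reference implementation. It
   assumes a packet without an authentication section and a total
   length of 32 bytes (the 24 bytes of [BFD] plus the two new 32-bit
   words), which this document does not state explicitly. The names
   pack_control, unpack_control, CONTROL_FORMAT and CONTROL_LENGTH are
   hypothetical. Because the packet diagram keeps the two-bit Sta
   field, the sketch refuses to encode a state value above 3; how
   Neigh_Restart (value 4) would be carried on the wire is left open
   here.

```python
import struct
from enum import IntEnum


class Diag(IntEnum):
    """A few of the diagnostic codes listed in Section 4.2."""
    NO_DIAGNOSTIC = 0
    CONTROL_DETECTION_TIME_EXPIRED = 1
    ADMINISTRATIVELY_DOWN = 7
    NEIGHBOR_RESTARTING = 9  # new value proposed by this document


# Four one-byte fields followed by seven 32-bit words, network byte order.
# Assumption: 32 bytes in total (24 as in [BFD] plus the two new words).
CONTROL_FORMAT = "!4B7I"
CONTROL_LENGTH = struct.calcsize(CONTROL_FORMAT)


def pack_control(diag, state, detect_mult, my_disc, your_disc,
                 min_tx_us, min_rx_us, min_echo_rx_us,
                 my_restart_us, your_restart_us, flags=0, version=2):
    """Encode a control packet without an authentication section."""
    if not 0 <= state <= 3:
        # Sta is two bits wide in the packet diagram, so 4 cannot be encoded.
        raise ValueError("state does not fit in the 2-bit Sta field")
    byte0 = ((version & 0x07) << 5) | (diag & 0x1F)  # Vers (3 bits), Diag (5 bits)
    byte1 = ((state & 0x03) << 6) | (flags & 0x3F)   # Sta (2 bits), P/F/C/A/D/M
    return struct.pack(CONTROL_FORMAT, byte0, byte1, detect_mult,
                       CONTROL_LENGTH, my_disc, your_disc, min_tx_us,
                       min_rx_us, min_echo_rx_us, my_restart_us,
                       your_restart_us)


def unpack_control(data):
    """Decode the fixed part of a control packet into a dict."""
    (byte0, byte1, detect_mult, length, my_disc, your_disc, min_tx_us,
     min_rx_us, min_echo_rx_us, my_restart_us,
     your_restart_us) = struct.unpack(CONTROL_FORMAT, data[:CONTROL_LENGTH])
    return {
        "version": byte0 >> 5,
        "diag": byte0 & 0x1F,
        "state": byte1 >> 6,
        "flags": byte1 & 0x3F,
        "detect_mult": detect_mult,
        "length": length,
        "my_disc": my_disc,
        "your_disc": your_disc,
        "min_tx_us": min_tx_us,
        "min_rx_us": min_rx_us,
        "min_echo_rx_us": min_echo_rx_us,
        "my_restart_us": my_restart_us,
        "your_restart_us": your_restart_us,
    }


# A restarting system in state Up (3) advertising Diag 9, a 120 s restart
# interval of its own, and a 90 s interval learned from its peer.
packet = pack_control(Diag.NEIGHBOR_RESTARTING, 3, 3, 1, 2,
                      50_000, 50_000, 0, 120_000_000, 90_000_000)
fields = unpack_control(packet)
```

   The two restart intervals occupy the last two words, so a receiver
   that only knows the 24-byte layout of [BFD] would see them as
   trailing data; how such a receiver reacts is outside this sketch.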

   The following diagram provides an overview of the state transitions
   for BFD with GR support (where "Your Restart Interval" has a
   non-zero value greater than the Detection Time). This document
   introduces a new state, NEIGH_RESTART, to the BFD state machine.

   The "Your Restart Interval" must have a value greater than the
   detection time value. If this value is zero or less than the
   detection time value, the state transitions should completely follow
   the BFD state machine as defined by [BFD].

   The notation on each arc represents the state of the remote system
   (as received in the State field in the BFD Control packet) or
   indicates the expiration of the Detection Timer.

                                +-----+
                                |     | INIT, UP
                                |     v
                    +-----------------------------+
  +---------------->|State = UP, Diag = 0,        |
  |                 |Timer = "Detect Interval"    |<------------+
  |                 +-----------------------------+             |
  |                     |                     |                 |
  |                     | {Neighbor Restart}  |                 |
  | {Neighbor Restart   |                     |        INIT, UP |
  |  complete}          |                     |                 |
  |                     v                     |             +-------+
+---------------------------------+           |       +-----|       |
| State = NEIGH_RESTART, Diag = 9,|           |  DOWN |     | INIT  |
| Timer = "Your Restart interval" |           |       +---->|       |
+---------------------------------+           |             +-------+
          |                                   |                 |
          |ADMIN DOWN,             ADMIN DOWN,|      ADMIN DOWN,|
          |DOWN,                         DOWN,|            TIMER|
          |TIMER                         TIMER|                 |
          |                                   v                 |
          |                               +-------+             |
          |                               |       |             |
          +------------------------------>| DOWN  |<------------+
                                          |       |
                                          +-------+
                                            |   ^
                     UP, ADMIN DOWN, TIMER  |   |
                                            +---+

   Note1: This state diagram holds for BFD with the GR extension as
   described in this document, which implies that "Your Restart
   Interval" has a value greater than the Detection Time of the
   established session.

   Note2: The labels in braces {} in the diagram indicate the
   GR-specific events on the remote neighbor (Restart/Restart
   complete).
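
   The transitions above can also be read as a small table-driven
   routine. The Python below is a minimal sketch under stated
   assumptions, not a definitive implementation. The class Session and
   its method names are hypothetical. The exits from Down (to Init on
   a received Down, to Up on a received Init) are taken from [BFD]. The
   sketch treats a received Diag of 9 as the {Neighbor Restart} event
   and a received Diag of 0 as {Neighbor Restart complete}, and it
   leaves the session in Neigh_Restart for any other Diag value, which
   this document does not specify.

```python
from enum import Enum

DIAG_NO_DIAGNOSTIC = 0
DIAG_NEIGHBOR_RESTARTING = 9


class State(Enum):
    ADMIN_DOWN = 0
    DOWN = 1
    INIT = 2
    UP = 3
    NEIGH_RESTART = 4


class Session:
    """Local view of one BFD session with the GR extension (illustrative)."""

    def __init__(self, detection_time_us, your_restart_interval_us):
        self.state = State.DOWN
        self.detection_time_us = detection_time_us
        self.your_restart_interval_us = your_restart_interval_us

    @property
    def gr_enabled(self):
        # GR applies only if the peer's interval exceeds the detection time;
        # a value of zero (GR disabled at the peer) fails this test as well.
        return self.your_restart_interval_us > self.detection_time_us

    def timeout_us(self):
        """Time to wait for the next control packet in the current state."""
        if self.state is State.NEIGH_RESTART:
            return self.your_restart_interval_us
        return self.detection_time_us

    def on_packet(self, remote_state, remote_diag):
        """Apply one received control packet and return the new state."""
        if self.state is State.ADMIN_DOWN:
            return self.state  # packets are ignored while AdminDown
        if remote_state is State.ADMIN_DOWN:
            self.state = State.DOWN
        elif self.state is State.DOWN:
            if remote_state is State.DOWN:
                self.state = State.INIT
            elif remote_state is State.INIT:
                self.state = State.UP
        elif self.state is State.INIT:
            if remote_state in (State.INIT, State.UP):
                self.state = State.UP
        elif remote_state is State.DOWN:
            self.state = State.DOWN  # from Up or Neigh_Restart
        elif self.state is State.UP:
            if self.gr_enabled and remote_diag == DIAG_NEIGHBOR_RESTARTING:
                self.state = State.NEIGH_RESTART
        elif self.state is State.NEIGH_RESTART:
            if remote_diag == DIAG_NO_DIAGNOSTIC:
                self.state = State.UP
        return self.state

    def on_timer_expiry(self):
        """The timer returned by timeout_us() ran out without a packet."""
        if self.state in (State.INIT, State.UP, State.NEIGH_RESTART):
            self.state = State.DOWN
        return self.state


# Detection time 150 ms, peer restart interval 120 s.
session = Session(detection_time_us=150_000,
                  your_restart_interval_us=120_000_000)
session.on_packet(State.DOWN, DIAG_NO_DIAGNOSTIC)     # Down -> Init
session.on_packet(State.UP, DIAG_NO_DIAGNOSTIC)       # Init -> Up
session.on_packet(State.UP, DIAG_NEIGHBOR_RESTARTING)  # Up -> Neigh_Restart
restart_timeout = session.timeout_us()
```

   Keeping the timer choice in timeout_us() means the caller re-arms a
   single timer after every packet, and the longer interval applies
   only while the session is in Neigh_Restart.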

   The state transitions involving the new state, Neigh_Restart, are
   explained in the next section of this document.

6. Theory of Operation

   High availability is a popular and much sought-after capability in
   network deployments. It is the ability of a system to switch over to
   the standby (control plane, data plane) during a planned or
   unplanned restart of the system, thus ensuring that traffic loss, if
   any, is minimal. The system architecture and the GR support of the
   routing protocols contribute to the success of such a design in live
   networks.

   When using a routing protocol that is GR enabled, a system that
   supports high availability will continue to forward traffic (with
   minimal loss) during a neighbor restart. When BFD is enabled on a
   routing protocol in such a setup, it is expected to assist the
   process rather than disturb it.

   With the current BFD implementations, the BFD sessions do not
   survive a restart under several conditions. An unplanned restart, or
   a planned restart with a protocol such as IS-IS that cannot signal
   the restart, are some of the situations where the BFD configuration
   is likely to impact high availability.

   Though various companies have adopted implementations that let BFD
   survive restarts, there is no uniform method of achieving this, and
   such implementations are likely to fail when interoperating with
   routers from other companies. This document proposes a standard way
   of achieving this objective.

   This document recommends the introduction of a new diag value (9 for
   "Neighbor Restarting"), a new version (2 for GR-supported BFD), and
   two additional fields in the BFD packet format.

   This design is expected to give BFD the capability to withstand
   restart scenarios, complementing the associated protocol. This
   should work consistently, irrespective of the BFD mode, the
   protocol, or the type of restart.

6.1. Session Establishment and GR Timer exchange

   The BFD session establishment follows the procedures as described in
   [BFD].
   If the technology described by this document were to be implemented,
   the BFD control packets should have the following field(s) with the
   values given below:

      The Version field (Section 4.1) should have a value of 2,
      indicating the support for GR.

      A new field in the BFD control packet format, "My Restart
      Interval" (Section 4.4), needs to have a non-zero value that is
      greater than the detection time.

      A new field in the BFD control packet format, "Your Restart
      Interval" (Section 4.5), needs to have a non-zero value that is
      greater than the detection time.

   The "My Restart Interval" and "Your Restart Interval" fields are
   used to exchange GR timer information between the systems.

   "My Restart Interval" is the time interval, in microseconds, that
   this system expects its remote system to wait before bringing down
   its BFD session with this system.

   "Your Restart Interval" is the time interval, in microseconds,
   specified by the remote system, that it expects this system to wait
   before bringing down its BFD session with the remote system.

   Once the packet exchanges are complete and the BFD sessions are up,
   every BFD session will have information about the time interval its
   remote system will wait during a restart, and also the time interval
   this system has to wait when the remote system restarts.

   The "My Restart Interval" and "Your Restart Interval" values can be
   modified after the session is up, just like the other BFD
   parameters; in this case the packet exchanges will synchronize the
   restart intervals (My and Your) on both sides.

   The exchange of GR-specific parameters during BFD session
   establishment is indicated in the diagram below. The diagram shows
   only part of the control packets, for the purpose of clarity.

         System A                             System B
            |                                    |
            |----------------------------------->|
            | {BFD.version = 2                   |
            |  BFD.MyRestartInterval = AAAA      |
            |  BFD.YourRestartInterval = 0 }     |
            |                                    |
            |<-----------------------------------|
            | {BFD.version = 2                   |
            |  BFD.MyRestartInterval = BBBB      |
            |  BFD.YourRestartInterval = AAAA }  |
            |                                    |
            |----------------------------------->|
            | {BFD.version = 2                   |
            |  BFD.MyRestartInterval = AAAA      |
            |  BFD.YourRestartInterval = BBBB }  |
            |                                    |

   In the initial BFD packet exchange between the local system and the
   remote system, "My Restart Interval" carries the sender's configured
   restart interval, or 0. "Your Restart Interval" reflects the value
   received in "My Restart Interval" from the corresponding remote
   system, or is zero if that value is not set (value of 0).

   A value of zero for "Your Restart Interval" means that BFD GR is
   disabled at the remote end, and similarly a value of zero for "My
   Restart Interval" means that BFD GR is disabled at the transmitting
   system.

6.2. Remote Neighbor Restart and Recovery

   When the BFD neighbors have established their BFD sessions (with
   their BFD GR timer values exchanged as described above), the
   following set of operations takes place when the remote neighbor
   attempts a graceful restart (for example, with a GR-enabled routing
   protocol like OSPFv2 or IS-IS tied to BFD).

   For clarity, let us revisit the BFD timers and the BFD detection
   time as described in [BFD].

   The Detection Time (the period of time without receiving BFD packets
   after which the session is determined to have failed) is not carried
   explicitly in the protocol.
   Rather, it is calculated independently in each direction by the
   receiving system based on the negotiated transmit interval and the
   detection multiplier.

   This means that a BFD control packet should be received from the
   remote neighbor within the detection time. When a BFD control packet
   is not received from the remote neighbor within this time, the timer
   expiry should bring the BFD session state to Down.

   In a Graceful Restart scenario, we may end up in a situation where
   the routing protocol (like OSPFv2) is in graceful restart mode with
   the remote neighbor restarting, and the system does not receive BFD
   control packets within the detection time, due to other
   CPU-intensive processes in the system. The technology described by
   this document addresses this issue.

   Once a pair of systems has established BFD sessions with GR support
   as described in this document, a remote neighbor that restarts will
   set the BFD Diag field to a value of 9 (Neighbor Restarting) in the
   control packets it sends to its neighbor (the local system).

   When the local system receives a BFD control packet with the diag
   field set to 9, it will update its timer to the previously exchanged
   value of "Your Restart Interval". This effectively means that the
   local system should wait for a BFD control packet for "Your Restart
   Interval" instead of the Detection Time. This will be the case as
   long as the diag field from the remote neighbor is 9. The BFD
   session moves from the "Up" state to the "Neigh_Restart" state.

   When the restart is complete and the remote neighbor recovers, the
   remote neighbor should set the Diagnostics field to a value of 0.
   The local system, on receiving BFD control packets with the diag
   field set to 0, understands that the restart process for the remote
   neighbor is complete; it will therefore revert the timer to the
   detection time (by calculation) and then expect control packets from
   the neighbor within this detection time. The BFD session moves from
   the "Neigh_Restart" state to the "Up" state.

   If the remote neighbor does not recover in time to send a BFD
   control packet within the previously communicated "Your Restart
   Interval", the timer expiry should bring the session down. The BFD
   session moves from the "Neigh_Restart" state to the "Down" state.

   It is important to have meaningful values for "Your Restart
   Interval" and "My Restart Interval" to complement the GR timers in
   the associated protocol.

7 Security Considerations

   Security considerations discussed in [BFD] and [BFD-1HOP] apply to
   this document.

8 IANA Considerations

   At this time, no IANA action is requested. However, if this
   technology were to be implemented, it would need two fields added to
   the BFD generic packet format, namely "My Restart Interval" and
   "Your Restart Interval", as described in Section 4 of this document.

   This document also defines a Diag value of 9 to be used to specify
   "Neighbor Restarting", in addition to the "BFD Diagnostic Codes"
   defined by [BFD] and referred to in Section 4.2 of this document. If
   this technology were to be implemented, the "BFD Diagnostic Codes"
   would need to be updated as:

      Value     BFD Diagnostic Code Name
      -----     ------------------------
      0      -- No Diagnostic
      1      -- Control Detection Time Expired
      2      -- Echo Function Failed
      3      -- Neighbor Signaled Session Down
      4      -- Forwarding Plane Reset
      5      -- Path Down
      6      -- Concatenated Path Down
      7      -- Administratively Down
      8      -- Reverse Concatenated Path Down
      9      -- Neighbor Restarting
      10-31  -- Reserved for future use

   This document also defines a new BFD state, "Neigh_Restart", with a
   value of 4, in addition to the "BFD States" defined by [BFD] and
   referred to in Section 4.3 of this document. If this technology were
   to be implemented, the "BFD States" would need to be updated as:

      0 -- AdminDown
      1 -- Down
      2 -- Init
      3 -- Up
      4 -- Neigh_Restart

9 References

9.1 Normative References

   [BFD]          Katz, D., and Ward, D., "Bidirectional Forwarding
                  Detection", RFC 5880, June 2010.

   [BFD-1HOP]     Katz, D., and Ward, D., "BFD for IPv4 and IPv6
                  (Single Hop)", RFC 5881, June 2010.

9.2 Informative References

   [IS-IS-GRACE]  Shand, M., and Ginsberg, L., "Restart signaling for
                  IS-IS", RFC 5306, October 2008.

   [OSPF-GRACE]   Moy, J., et al, "Graceful OSPF Restart", RFC 3623,
                  November 2003.

10 Author's Addresses

   Palanivelan A
   Cisco Systems,
   Bangalore, India.

   EMail: apvelan@cisco.com