INTERNET-DRAFT                                             A.Palanivelan
Intended Status: Best Current Practice                   EMC Corporation
Expires: Oct 12, 2013                                       Apr 10, 2013

Design Recommendations for BFD to survive Planned Graceful Restart
draft-palanivelan-bfd-gr-rec-00

Abstract

Bidirectional Forwarding Detection (BFD), defined in RFC 5880, is comprehensive and has sufficient mechanisms to overcome the challenges of surviving a planned Graceful Restart. However, implementations with architectural limitations were found to contribute to BFD failures in a process-intensive environment. This memo outlines design options, within the existing BFD framework, to overcome these challenges and survive a planned Graceful Restart.

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

Copyright and License Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1 Introduction
   1.1 Terminology
2. Applicability Statement
3 Problem Statement
4 Requirements for BFD to survive Planned Graceful Restart
5 Design Recommendations for BFD to survive Planned Graceful Restart
   5.1 Recommendation 1: No BFD Down during GracePeriod
   5.2 Recommendation 2: BFD Detect timer value == Restart interval/timer value
   5.3 Recommendation 3: BFD Detect timer value == 3 x Restart interval/timer value
   5.4 Recommendation 4: BFD Detect timer value == predefined (Fixed High) value
6 Security Considerations
7 IANA Considerations
8 References
   8.1 Normative References
   8.2 Informative References
Authors' Addresses

1 Introduction

The Bidirectional Forwarding Detection protocol [BFD] provides a mechanism for liveness detection of arbitrary paths between systems.
It is intended to provide low-overhead, short-duration detection of failures in the path between adjacent forwarding engines, including the interfaces, data links, and to the extent possible, the forwarding engines themselves. It operates independently of media, data protocols, and routing protocols.

BFD has sufficient mechanisms to address the challenges in surviving a planned graceful restart. In a highly process-intensive environment, however, architectural or design limitations are found to affect BFD and hence negatively impact the underlying protocol session during a planned graceful restart.

This document presents a summary of the challenges to BFD in surviving a planned graceful restart, and outlines possible design solutions within the existing framework. Its purpose is to suggest design options to implementers that have known architectural limitations but need BFD to survive a planned restart in a process-intensive environment.

1.1 Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. Applicability Statement

This document applies to BFD implementations in products with architectural limitations in handling multiple process-intensive features. The recommendations in this document focus on enabling BFD to survive a planned Graceful Restart of routing protocols.

3 Problem Statement

The need for BFD and its usefulness are well appreciated, as reflected by the increasing number of implementations with multiple protocol applications. However, due to architectural limitations in handling simultaneous process-intensive features, BFD MAY negatively impact the underlying protocol sessions.
When BFD-enabled protocols co-exist with scaled-up broadband configurations, servicing the established client connections cannot be compromised. Whether BFD survives a planned Graceful Restart depends largely on the ability of the devices involved to leave enough room for BFD to exchange its control packets.

One example is surviving a planned Graceful Restart on a router that has a BFD-enabled routing protocol co-existing with a process-intensive feature such as deep packet inspection. Another example is a planned restart on a PE router (with BFD-enabled OSPFv2 adjacencies) servicing a large number of active broadband client connections.

In both cases, if the restarting router has an architectural limitation in handling multiple process-intensive applications, it cannot guarantee enough cycles or space for its BFD application to send the periodic control packets to its peer during a planned restart. The BFD peer would then time out and cause its underlying protocol to destroy its session with the restarting router.

When BFD is not enabled with a control protocol (e.g., OSPFv2) that has GR enabled [OSPF-GRACE], the router would withstand the restart for the specified restart interval/grace period (as this will be in the order of seconds). It is more likely to survive the restart, maintaining its forwarding plane and hence traffic.

4 Requirements for BFD to survive Planned Graceful Restart

Bidirectional Forwarding Detection [BFD] clearly defines the requirements for maintaining an established BFD session in tandem with the underlying application (any layer/any protocol).

* BFD is time intensive, and it is important that the negotiated timer values sufficiently address network conditions.
* BFD has a higher priority over other processes.
* Architectural leverage in ensuring that the BFD control packets are not dropped when simultaneously handling multiple process-intensive applications.

In addition to the general requirements given above, for surviving a planned Graceful Restart of protocols, the system-level requirement is to ensure that BFD does not negatively impact the neighborship of the local system with the restarting peer, and hence the underlying protocol session.

5 Design Recommendations for BFD to survive Planned Graceful Restart

The following design options can help ensure that BFD survives a planned Graceful Restart and hence does not negatively influence the underlying protocol sessions and traffic. The recommendations SHALL suit implementations whose architectural limitations prevent them from ensuring a periodic exchange of BFD control packets between a local system and a restarting peer. OSPFv2 with BFD is used as the example in explaining the recommendations below.

5.1 Recommendation 1: No BFD Down during GracePeriod

The local (helper) router, on receiving the restart notification from the restarting peer router, SHALL set a global flag and not bring down its BFD session while this flag is set. When the restart is complete, the flag is reset and BFD operation returns to normal.

On receipt of the restart notification from the remote peer (Grace LSA as in OSPFv2), set the global GR flag.

On successful recovery after graceful restart or on expiry of the grace period, reset the global GR flag.

If the global GR flag is not set, BFD Detect timer value == negotiated timer value. On timer expiry, initiate session destroy to the peer (restarting router).

If the global GR flag is set, ignore timer expiry and restart the timer to the peer (restarting router).
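The flag handling of Recommendation 1 can be sketched as a small state model. This is only an illustrative sketch, not part of BFD or of any implementation: the class name BfdHelper, its method names, the result strings, and the 900000 microsecond detect time are all assumptions made for the example, and a single peer session is modelled for brevity.

```python
from dataclasses import dataclass


@dataclass
class BfdHelper:
    """Hypothetical model of the helper router's BFD process (one peer)."""

    detect_time_us: int        # negotiated detect timer value, in microseconds
    gr_flag: bool = False      # the global GR flag of Recommendation 1
    session_state: str = "Up"  # state of the session to the restarting peer
    ignored_expiries: int = 0  # expiries ignored while the flag was set

    def on_restart_notification(self) -> None:
        # Restart notification received (e.g. a Grace LSA in OSPFv2).
        self.gr_flag = True

    def on_restart_complete(self) -> None:
        # Successful recovery, or expiry of the grace period.
        self.gr_flag = False

    def on_detect_timer_expiry(self) -> str:
        if self.gr_flag:
            # Flag set: ignore the expiry and restart the timer.
            self.ignored_expiries += 1
            return "timer-restarted"
        # Flag not set: normal BFD behaviour, destroy the session.
        self.session_state = "Down"
        return "session-destroyed"


helper = BfdHelper(detect_time_us=900_000)  # illustrative value only
helper.on_restart_notification()
during_gr = helper.on_detect_timer_expiry()
helper.on_restart_complete()
after_gr = helper.on_detect_timer_expiry()
print(during_gr, after_gr)
```

In this run the first expiry happens while the flag is set and is ignored, and the second happens after the flag is reset and brings the session down, so the script prints "timer-restarted session-destroyed".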
5.2 Recommendation 2: BFD Detect timer value == Restart interval/timer value

The local (helper) router, on receiving the restart notification from the restarting peer router, sends an interrupt to the BFD process, which sets a global flag and also sets the timer to the highest restart timer value of its underlying protocol(s). When the restart is complete, the timer is set back to the previously negotiated value and BFD operation returns to normal.

On receipt of the restart notification from the remote peer (Grace LSA as in OSPFv2), notify BFD, which sets the global GR flag.

On successful recovery after graceful restart or on expiry of the grace period, notify BFD, which resets the global GR flag.

If the global GR flag is not set, BFD Detect timer value == negotiated timer value. On timer expiry, initiate session destroy to the peer (restarting router).

If the global GR flag is set, BFD Detect timer value == highest of [the restart interval value (of the underlying protocol) or 360000000 microseconds (360 seconds)]. On timer expiry, initiate session destroy to the peer (restarting router).

When multiple GR-enabled protocols (e.g., OSPFv2, BGP, LDP/RSVP, OSPFv3, ...) are configured in the system (helper router) and tied to BFD, the highest of the restart interval/timer values SHALL be considered.

5.3 Recommendation 3: BFD Detect timer value == 3 x Restart interval/timer value

The local (helper) router, on receiving the restart notification from the restarting peer router, sends an interrupt to the BFD process, which sets a global flag and also sets the timer to 3x, where x is the restart interval of the underlying protocol. When the restart is complete, the timer is set back to the previously negotiated value and BFD operation returns to normal.

On receipt of the restart notification from the remote peer (Grace LSA as in OSPFv2), notify BFD, which sets the global GR flag.
On successful recovery after graceful restart or on expiry of the grace period, notify BFD, which resets the global GR flag.

If the global GR flag is not set, BFD Detect timer value == negotiated timer value. On timer expiry, initiate session destroy to the peer (restarting router).

If the global GR flag is set, BFD Detect timer value == highest of [3 x the restart interval value (of the underlying protocol) or 360000000 microseconds (360 seconds)]. On timer expiry, initiate session destroy to the peer (restarting router).

When multiple GR-enabled protocols (e.g., OSPFv2, BGP, LDP/RSVP, OSPFv3, ...) are configured in the system (helper router) and tied to BFD, the highest of the restart interval/timer values SHALL be considered.

5.4 Recommendation 4: BFD Detect timer value == predefined (Fixed High) value

The local (helper) router, on receiving the restart notification from the restarting peer router, sends an interrupt to the BFD process, which sets a global flag and also sets the timer to a predefined static value (such as 360 seconds). When the restart is complete, the timer is set back to the previously negotiated value and BFD operation returns to normal.

On receipt of the restart notification from the remote peer (Grace LSA as in OSPFv2), notify BFD, which sets the global GR flag.

On successful recovery after graceful restart or on expiry of the grace period, notify BFD, which resets the global GR flag.

If the global GR flag is not set, BFD Detect timer value == negotiated timer value. On timer expiry, initiate session destroy to the peer (restarting router).

If the global GR flag is set, BFD Detect timer value == predefined static value of 360000000 microseconds (360 seconds). On timer expiry, initiate session destroy to the peer (restarting router).

When multiple GR-enabled protocols (e.g., OSPFv2, BGP, LDP/RSVP, OSPFv3, ...)
are configured in the system (helper router) and tied to BFD, the highest of the restart interval/timer values SHALL be considered.

6 Security Considerations

This memo does not raise any specific security issues.

7 IANA Considerations

No IANA action is requested.

8 References

8.1 Normative References

[BFD] Katz, D. and D. Ward, "Bidirectional Forwarding Detection (BFD)", RFC 5880, June 2010.

[BFD-1HOP] Katz, D. and D. Ward, "Bidirectional Forwarding Detection (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881, June 2010.

[RFC5882] Katz, D. and D. Ward, "Generic Application of Bidirectional Forwarding Detection (BFD)", RFC 5882, June 2010.

8.2 Informative References

[OSPF-GRACE] Moy, J., et al, "Graceful OSPF Restart", RFC 3623, November 2003.

Authors' Addresses

Palanivelan Appanasamy
Principal Software Engineer,
Networking & Storage Virtualization,
EMC Corporation,
Bangalore-560048, India.

EMail: Palanivelan.Appanasamy@emc.com