Internet Engineering Task Force M. Binderberger Internet-Draft N. Akiya Intended status: Standards Track Cisco Systems Expires: May 31, 2013 November 27, 2012 Redundant BFD sessions draft-mbind-bfd-redundancy-00 Abstract This document defines a second or "shadow" BFD session to an existing "primary" BFD session, providing resilience against BFD failures that are not legitimate. Scenarios will be discussed on how presence of a shadow BFD session will be beneficial in the context of high availability. Requirements Language The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. Status of this Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on May 31, 2013. Copyright Notice Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents Binderberger & Akiya Expires May 31, 2013 [Page 1] Internet-Draft Redundant BFD sessions November 2012 (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Failure scenarios . . . . . . . . . . . . . . . . . . . . . . . 3 3. Differentiating primary and shadow sessions . . . . . . . . . . 4 4. BFD version 2 packets . . . . . . . . . . . . . . . . . . . . . 5 5. BFD discriminators . . . . . . . . . . . . . . . . . . . . . . 5 6. Using primary and shadow BFD sessions . . . . . . . . . . . . . 5 7. Scale aspect . . . . . . . . . . . . . . . . . . . . . . . . . 6 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 6 9. Security Considerations . . . . . . . . . . . . . . . . . . . . 6 10. Normative References . . . . . . . . . . . . . . . . . . . . . 7 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 7 Binderberger & Akiya Expires May 31, 2013 [Page 2] Internet-Draft Redundant BFD sessions November 2012 1. Introduction Bidirectional Forwarding Detection [RFC5880] is used to detect network failures. Link failures and peer system outages are some examples of failures which can be detected with BFD technology. Although undesirable, the BFD technology may falsely declare failure in some scenarios: BFD process crash, FPGA reset on hardware based BFD, or a card running the BFD functionality fails or gets removed accidentally. In all these cases, the forwarding being monitored by BFD may remain functional. Unnecessary rerouting of traffic, while not a problem per-se, can be a problem at a large scale of false BFD triggers, e.g. tens of thousands of traffic path. A serious outcome may be seen if a network outage occurs in a time window in which BFD is not detecting failures. For example, during software updates an extended timer value may be used, leaving the system and it's peer "blind" for any real liveliness problem until the BFD functionality is restored. This draft proposes to run a second "shadow" BFD session, in parallel to the existing "primary" BFD session. This additional session will have it's own unique discriminator value(s). The method used to differentiate discriminator zero primary and shadow sessions is discussed in the following sections. 2. Failure scenarios BFD technology requires continuous transmission of control packets in both directions. The rate at which both systems are required to transmit these packets will vary depending on operational requirements and configurations: BFD mode and interval. If a BFD module on one system is unable to transmit BFD control packets for amount of time greater than the negotiated failure detection time, then the BFD module on the other system will declare a session failure. Sometimes the cause of such a session failure is not related to the functionality of the path being monitored by BFD. Some failure scenarios which can exhibit such behaviors are described in this section. 1. Software based BFD: BFD process crash, CPU starvation. 2. Hardware based BFD: FPGA reset. 3. System using centralized BFD architecture: Route processor card fault. Binderberger & Akiya Expires May 31, 2013 [Page 3] Internet-Draft Redundant BFD sessions November 2012 4. System using distributed BFD architecture: Linecard fault. Failure scenarios are not limited to the ones described above. In all cases, the reliability of BFD sessions will increase significantly if a second BFD instance exists. 3. Differentiating primary and shadow sessions For a single target monitored by BFD, a system needs to run two instances of the BFD sessions: a primary session and a shadow session. This requires BFD control packets to have an indication on which role they belong. In other words, every control packet needs to have an indication on whether it belongs to the primary or the shadow session. When looking at the BFD version 1 packet in [RFC5880], there are no unused bits left to store a shadow flag to distinguish the primary from the shadow session. One could take away a bit from e.g. the Diag, the Multiplier or the Length field, even claiming the least significant bit from one of the interval fields. But none of these proposals would be safe against interoperability problems with BFD speakers not supporting this draft. That leaves three possible options. a. Use of existing BFD version 1 control packet definition will indicate a primary BFD session. Shadow BFD sessions will use version 2 in the BFD packets. Besides usage of different version number, all operation will conform to the behaviors described in BFD RFCs. Shadow BFD sessions only handle version 2 BFD packets. Primary BFD sessions only handle version 1 BFD packets as specified in section 6.8.6 of [RFC5880]. b. Define a new BFD packet header for version 2. This new version is to include bits to indicate the session type: primary session or shadow session. Shadow BFD sessions only handle version 2 BFD packets with shadow bits set. Primary BFD sessions handle version 1 BFD packets or version 2 BFD packets with primary bits set. c. Use information outside the BFD packet. For IP/UDP encapsulated BFD packets this could be a UDP destination port different from the well-known ports defined in [RFC5881] and [RFC5883]. For BFD over Pseudo Wires [RFC5885] or BFD for MPLS-TP OAM [RFC6428] new type values could be used in the PW-ACH and G-ACH to differentiate shadow BFD packets from the primary BFD session packets. Binderberger & Akiya Expires May 31, 2013 [Page 4] Internet-Draft Redundant BFD sessions November 2012 Option b redefines the BFD packet contents. Although it is a clean solution, this approach can have a significant impact to existing BFD implementations. Introduction of BFD redundancy capability at significant costs is thought to be undesirable, thus this option is not recommended. However, when there is a discussion on defining new version of BFD packet contents, addition of redundancy capability would be recommended. Option c will create dependencies with current and future BFD RFCs since each will need to define a way shadow session can be specified. Therefore, this option is also not recommended. That leaves option a as the recommended choice. 4. BFD version 2 packets BFD version 2 packets follow exactly the definition given in [RFC5880] and other BFD-related RFCs, with one difference that the version field contains the value "2". The packet format is the same as described in section 4.1 of [RFC5880]. Implementations following this draft MUST be able to receive BFD packets with the version field values "1" and "2" and MUST drop BFD packets with any other version value. BFD packets with a version value of "1" are named "primary" packets while BFD packets with a version value of "2" are named "shadow" packets within this document. The primary session MUST only transmit and receive primary packets. The shadow session MUST only transmit and receive shadow packets. 5. BFD discriminators As primary sessions and shadow sessions are operating independently, they have different my discriminator values. My discriminator values assigned to BFD sessions are unique per system, across the combined set of primary and shadow sessions. In other words, a system will have one discriminator pool to be used for both primary and shadow sessions, not a pool per session type. 6. Using primary and shadow BFD sessions A shadow BFD session is associated to exactly one primary BFD session. The parameters used by shadow sessions SHOULD be the same as the parameters of associated primary session. Purpose for such is to ensure that two sessions operate using the same mode, interval and failure detection time. This allows for the two sessions to behave as similar as possible to reduce the chance of them concluding deviating state in valid failure scenarios. Binderberger & Akiya Expires May 31, 2013 [Page 5] Internet-Draft Redundant BFD sessions November 2012 When the BFD shadow capability is enabled to a target, two session instances to that target are created: primary and shadow. A logic SHOULD be applied to identify where in the system to host the two sessions. The logic should maximize the failure detection validity by minimizing the chances of both sessions being impacted by a single local failure. Details of this logic is outside the scope of this document. Both the primary and the shadow session are to operate as per specified in other BFD RFCs. A differentiator comes into play between state changes of the two sessions and the action taken when reachability of the BFD enabled target changes. This differentiator will be referred as the state consolidation module from here onward. The purpose of the state consolidation module is to consolidate the state of the primary and the shadow session, and to produce a final state to be used by the system to take action on. The logic of the state consolidation module is as follows: Final state is UP when the state of the primary session is UP or the state of the shadow session is UP. Final state is DOWN when both the state primary session is DOWN and the state of the shadow sessions is DOWN. 7. Scale aspect The BFD module becomes more resilient by enabling the shadow BFD capability. However, when the shadow BFD capability is enabled on a system, the total number of BFD sessions hosted on a system will be increased by the number of shadow BFD sessions. For the same number of BFD monitored targets, more system resources will be used. Solving a scale issue is outside the scope of this document. However, below lists some techniques which can be considered: 1. Reduce the configured BFD intervals of some or all BFD sessions. 2. Allow an implementation to run shadow sessions at a slower rate. 8. IANA Considerations No work requested from IANA. 9. Security Considerations This document does not introduce any additional security issues and Binderberger & Akiya Expires May 31, 2013 [Page 6] Internet-Draft Redundant BFD sessions November 2012 the security mechanisms defined in [RFC5880] apply in this document. 10. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC5880] Katz, D. and D. Ward, "Bidirectional Forwarding Detection (BFD)", RFC 5880, June 2010. [RFC5881] Katz, D. and D. Ward, "Bidirectional Forwarding Detection (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881, June 2010. [RFC5883] Katz, D. and D. Ward, "Bidirectional Forwarding Detection (BFD) for Multihop Paths", RFC 5883, June 2010. [RFC5885] Nadeau, T. and C. Pignataro, "Bidirectional Forwarding Detection (BFD) for the Pseudowire Virtual Circuit Connectivity Verification (VCCV)", RFC 5885, June 2010. [RFC6428] Allan, D., Swallow Ed. , G., and J. Drake Ed. , "Proactive Connectivity Verification, Continuity Check, and Remote Defect Indication for the MPLS Transport Profile", RFC 6428, November 2011. Authors' Addresses Marc Binderberger Cisco Systems Email: mbinderb@cisco.com Nobo Akiya Cisco Systems Email: nobo@cisco.com Binderberger & Akiya Expires May 31, 2013 [Page 7]