Internet Engineering Task Force                          M. Binderberger
Internet-Draft                                                  N. Akiya
Intended status: Standards Track                           Cisco Systems
Expires: November 08, 2013                                  May 07, 2013


                         Redundant BFD sessions
                     draft-mbind-bfd-redundancy-01

Abstract

   This document defines a second, "shadow" BFD session that runs in
   parallel with an existing "primary" BFD session, providing resiliency
   against BFD session failures that are not caused by a failure of the
   monitored path.

   It also discusses scenarios in which the presence of a shadow BFD
   session is beneficial in the context of high availability.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 08, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Failure scenarios
   3.  Differentiating primary and shadow sessions
   4.  BFD version 2 packets
   5.  BFD discriminators
   6.  Using primary and shadow BFD sessions
   7.  LSP ping bootstrapped BFD sessions
   8.  Scale aspect
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. Normative References
   Authors' Addresses

1.  Introduction

   Bidirectional Forwarding Detection (BFD) [RFC5880] is used to detect
   network failures.  Link failures and peer system outages are examples
   of failures that can be detected with BFD.  BFD may, however, falsely
   declare a failure in some scenarios: a BFD process crash, an FPGA
   reset on hardware-based BFD, or the failure or accidental removal of
   a card running the BFD functionality.  In all these cases, the
   forwarding path being monitored by BFD may remain functional.
   Unnecessary rerouting of traffic, while not a problem per se, can
   become a problem when false BFD triggers occur at large scale, e.g.,
   across tens of thousands of traffic paths.  A serious outcome may
   also result if a network outage occurs in a time window in which BFD
   is not detecting failures.  For example, during software updates an
   extended timer value may be used, leaving the system and its peer
   "blind" to any real liveness problem until the BFD functionality is
   restored.

   This draft proposes to run a second "shadow" BFD session, in parallel
   to the existing "primary" BFD session.  This additional session will
   have its own unique discriminator value(s).  The method used to
   differentiate discriminator zero primary and shadow sessions is
   discussed in the following sections.

2.  Failure scenarios

   BFD requires continuous transmission of control packets in both
   directions.  The rate at which the two systems are required to
   transmit these packets varies with operational requirements and
   configuration: BFD mode and interval.  If the BFD module on one
   system is unable to transmit BFD control packets for an amount of
   time greater than the negotiated failure detection time, then the BFD
   module on the other system will declare a session failure.  Sometimes
   the cause of such a session failure is unrelated to the functionality
   of the path being monitored by BFD.
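
   As a rough illustration of the timing involved, the sketch below
   computes the Asynchronous-mode detection time in the way section
   6.8.4 of [RFC5880] describes it (the Detect Mult received from the
   remote system multiplied by the agreed remote transmit interval) and
   checks whether a period of BFD silence would exceed it.  The function
   and parameter names are illustrative assumptions, and jitter is
   ignored for simplicity.

```python
def detection_time_us(remote_detect_mult: int,
                      local_required_min_rx_us: int,
                      remote_desired_min_tx_us: int) -> int:
    """Detection time, in microseconds, calculated by the local system."""
    # The remote system transmits no faster than the slower of what it
    # wants to send and what the local system is willing to receive.
    agreed_remote_tx_us = max(local_required_min_rx_us,
                              remote_desired_min_tx_us)
    return remote_detect_mult * agreed_remote_tx_us


def outage_causes_session_failure(outage_us: int, detect_time_us: int) -> bool:
    """True if the peer stays silent for longer than the detection time."""
    return outage_us > detect_time_us


# Example: 50 ms intervals with a multiplier of 3 give a 150 ms budget,
# so a 2 second BFD process restart on the peer takes the session down
# even though the monitored path itself never failed.
budget_us = detection_time_us(3, 50_000, 50_000)
restart_fails_session = outage_causes_session_failure(2_000_000, budget_us)
```

   With these example numbers any interruption of the BFD module longer
   than 150 ms is indistinguishable, from the peer's point of view, from
   a real path failure.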

   Some failure scenarios which can exhibit such behaviors are described
   in this section.

   1.  Software based BFD: BFD process crash - The software entity
       handling BFD packets may crash unexpectedly.  The time it takes
       for the same, or possibly an alternative, software entity to
       become functional is a time window in which BFD packets will not
       be handled.  If this time window is larger than the negotiated
       failure detection time, sessions will be declared failed even
       though the monitored paths may still be valid.  If another
       software entity, running on the same CPU or a different CPU, is
       validating the same paths, a false failure can be avoided as long
       as the two software entities do not crash at around the same
       time.

   2.  Software based BFD: CPU starvation - CPU starvation may prevent
       BFD packets from being handled in a timely manner.  During this
       period, packets may not get transmitted or received packets may
       not get processed.  If the length of time for which CPU
       starvation affects the BFD software entity is larger than the
       negotiated failure detection time, sessions will be declared
       failed even though the monitored paths may still be valid.  If
       another software entity, running on a different CPU, is
       validating the same paths, a false failure in this scenario can
       be avoided as long as the two software entities do not become CPU
       starved at around the same time.

   3.  Hardware based BFD: FPGA reset - In a scenario where hardware BFD
       and actual forwarding are performed on separate chips, it may be
       desirable to reset just the FPGA which runs BFD.  A planned FPGA
       reset of this kind can be handled locally: sessions can be
       migrated to another chipset, failure detection times can be
       extended during the absence of local BFD functionality, or a
       combination of both or some other means can be used.  However,
       any such solution requires additional proprietary logic to be
       implemented.  Users operating multiple products may need to
       understand the expected behavior of each.  In addition, extending
       failure detection times means that the system can no longer
       detect a true failure within the desired failure detection time.
       A consistent solution which does not compromise the configured
       failure detection time is desired.

   4.  System using centralized BFD architecture: Route processor card
       fault - A product with redundant route processor cards could
       implement a standby BFD entity that runs on the other route
       processor card.  An implementation may set the BFD entity on the
       standby route processor to be partially active, or dormant until
       it is determined to be active.  In both cases, data
       synchronization between the two entities is essential to ensure
       that the standby "take over" happens seamlessly.  Additionally,
       "take over" detection and the "take over" procedures themselves
       become essential, as any slowness in them may cause remote peers
       to take down sessions.  If there are two fully active BFD
       entities, one on the active route processor and another on the
       standby route processor, validating the same paths, potentially
       complex "take over" logic can be avoided.

   5.  System using distributed BFD architecture: Linecard fault - BFD
       may run on logical interfaces which are comprised of physical
       interfaces spanning multiple linecards.  BFD may run on paths
       which are comprised of nexthops hosted on multiple linecards.
       BFD may run on logical interfaces or paths whose nexthops change
       dynamically, jumping from one linecard to another.  In all these
       cases, the linecard hosting a certain BFD session may not be
       hosting the actual outgoing interface corresponding to that BFD
       session at any given time.  In such cases, the failure of a
       linecard may not have any impact on the paths being monitored by
       some or all of the BFD sessions it hosts.  One implementation may
       attempt to solve this problem by moving BFD sessions to a
       linecard where the nexthops reside.  Unfortunately, this only
       solves a subset of the problem, since it does not cover the
       scenario where multiple valid nexthops are hosted on multiple
       linecards (e.g., LAG, ECMP).  Another implementation may attempt
       to solve this problem by running a standby BFD entity on another
       linecard.  However, this solution has the same issues as
       described for the centralized BFD architecture above.  Again, if
       there are two fully active BFD entities, running on different
       linecards, validating the same paths, potentially complex
       synchronization, "take over" or "migration" logic can be avoided.

   Failure scenarios are not limited to the ones described above.  In
   all cases, the reliability of BFD sessions increases significantly if
   a second fully active BFD instance exists.  It is possible to address
   some, or potentially all, failure scenarios locally.  However,
   multiple proprietary solutions are likely to be required to cover the
   wide range of problem areas.  The result may not be desirable from an
   operator perspective, as the expected behavior will vary from failure
   to failure and from device to device.  Therefore, this specification
   defines a simple and consistent redundancy mechanism which can be
   used with a wide range of local failure scenarios.

3.  Differentiating primary and shadow sessions

   For a single target monitored by BFD, a system needs to run two BFD
   session instances: a primary session and a shadow session.  This
   requires every BFD control packet to carry an indication of whether
   it belongs to the primary or the shadow session.

   When looking at the BFD version 1 packet in [RFC5880], there are no
   unused bits left to store a shadow flag to distinguish the primary
   from the shadow session.  One could take away a bit from e.g.  the
   Diag, the Multiplier or the Length field, even claiming the least
   significant bit from one of the interval fields.  But none of these
   proposals would be safe against interoperability problems with BFD
   speakers not supporting this draft.

   That leaves three possible options.

   a.  Use the existing BFD version 1 control packet definition to
       indicate a primary BFD session.  Shadow BFD sessions use version
       2 in the BFD packets.  Besides the use of a different version
       number, all operation conforms to the behaviors described in the
       BFD RFCs.  Shadow BFD sessions only handle version 2 BFD packets.
       Primary BFD sessions only handle version 1 BFD packets as
       specified in section 6.8.6 of [RFC5880].

   b.  Define a new BFD packet header for version 2.  This new version
       is to include bits to indicate the session type: primary session
       or shadow session.  Shadow BFD sessions only handle version 2 BFD
       packets with shadow bits set.  Primary BFD sessions handle
       version 1 BFD packets or version 2 BFD packets with primary bits
       set.

   c.  Use information outside the BFD packet.  For IP/UDP encapsulated
       BFD packets this could be a UDP destination port different from
       the well-known ports defined in [RFC5881] and [RFC5883].  For BFD
       over Pseudo Wires [RFC5885] or BFD for MPLS-TP OAM [RFC6428] new
       type values could be used in the PW-ACH and G-ACH to
       differentiate shadow BFD packets from the primary BFD session
       packets.

   Option b redefines the BFD packet contents.  Although it is a clean
   solution, this approach can have a significant impact on existing BFD
   implementations.  Introducing BFD redundancy capability at
   significant cost is thought to be undesirable, thus this option is
   not recommended.  However, if a new version of the BFD packet
   contents is defined in the future, adding redundancy capability to it
   would be recommended.  Option c creates dependencies on current and
   future BFD RFCs, since each would need to define a way to identify a
   shadow session.  Therefore, this option is also not recommended.
   That leaves option a as the recommended choice.

4.  BFD version 2 packets

   BFD version 2 packets follow exactly the definition given in
   [RFC5880] and other BFD-related RFCs, with one difference that the
   version field contains the value "2".  The packet format is the same
   as described in section 4.1 of [RFC5880].  Implementations following
   this draft MUST be able to receive BFD packets with the version field
   values "1" and "2" and MUST drop BFD packets with any other version
   value.

   BFD packets with a version value of "1" are named "primary" packets
   while BFD packets with a version value of "2" are named "shadow"
   packets within this document.  The primary session MUST only transmit
   and receive primary packets.  The shadow session MUST only transmit
   and receive shadow packets.
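
   The receive-side rule above can be sketched as follows.  The sketch
   relies on the packet layout in section 4.1 of [RFC5880], where the
   version occupies the three most significant bits of the first octet
   and the mandatory section is 24 octets long.  The function name and
   the string return values are illustrative assumptions.

```python
from typing import Optional

PRIMARY_VERSION = 1   # existing BFD version, used by primary sessions
SHADOW_VERSION = 2    # version value proposed here for shadow sessions
MANDATORY_SECTION_LEN = 24


def classify_bfd_packet(packet: bytes) -> Optional[str]:
    """Return "primary", "shadow", or None if the packet must be dropped."""
    if len(packet) < MANDATORY_SECTION_LEN:
        return None
    # First octet: 3 bits of version followed by 5 bits of diagnostic.
    version = packet[0] >> 5
    if version == PRIMARY_VERSION:
        return "primary"
    if version == SHADOW_VERSION:
        return "shadow"
    return None  # any other version value is dropped


# Minimal 24-octet packets that differ only in the version bits.
primary_pkt = bytes([PRIMARY_VERSION << 5]) + bytes(23)
shadow_pkt = bytes([SHADOW_VERSION << 5]) + bytes(23)
unknown_pkt = bytes([3 << 5]) + bytes(23)
```

   A real implementation would go on to apply the remaining reception
   checks of [RFC5880] to the session type selected here.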

5.  BFD discriminators

   As primary sessions and shadow sessions operate independently, they
   have different My Discriminator values.  My Discriminator values
   assigned to BFD sessions are unique per system, across the combined
   set of primary and shadow sessions.  In other words, a system has one
   discriminator pool that is used for both primary and shadow sessions,
   not one pool per session type.
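
   A minimal sketch of such a shared pool is shown below.  The class
   and method names are hypothetical, and random selection is only one
   possible allocation strategy.  The only properties taken from the
   text and from [RFC5880] are that values are nonzero, 32 bits wide,
   and unique across all sessions of both types.

```python
import random


class DiscriminatorPool:
    """One system-wide pool of My Discriminator values (hypothetical)."""

    def __init__(self) -> None:
        self._in_use = set()

    def allocate(self) -> int:
        """Return a nonzero 32-bit value not used by any other session."""
        while True:
            candidate = random.randint(1, 0xFFFFFFFF)
            if candidate not in self._in_use:
                self._in_use.add(candidate)
                return candidate

    def release(self, discriminator: int) -> None:
        """Give a value back when its session is deleted."""
        self._in_use.discard(discriminator)


# Primary and shadow sessions draw from the same pool, so their
# discriminators can never collide.
pool = DiscriminatorPool()
primary_discr = pool.allocate()
shadow_discr = pool.allocate()
```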

6.  Using primary and shadow BFD sessions

   A shadow BFD session is associated with exactly one primary BFD
   session.  The parameters used by a shadow session SHOULD be the same
   as the parameters of the associated primary session.  The purpose is
   to ensure that the two sessions operate using the same mode, interval
   and failure detection time.  This allows the two sessions to behave
   as similarly as possible, reducing the chance of them reaching
   different states in valid failure scenarios.

   When the BFD shadow capability is enabled for a target, two session
   instances to that target are created: primary and shadow.  Logic
   SHOULD be applied to decide where in the system to host the two
   sessions.  This logic should maximize the failure detection validity
   by minimizing the chances of both sessions being impacted by a single
   local failure.  For example, if there are multiple CPU instances, it
   is more beneficial to run the two sessions on different CPU
   instances.  The details of this logic, however, are outside the scope
   of this document.

   Both the primary and the shadow session operate as specified in the
   other BFD RFCs.  A differentiator comes into play between the state
   changes of the two sessions and the action taken when reachability of
   the BFD enabled target changes.  This differentiator is referred to
   as the state consolidation module from here onward.  The purpose of
   the state consolidation module is to consolidate the states of the
   primary and the shadow session, and to produce a final state on which
   the system acts.  The logic of the state consolidation module is as
   follows:

   Final state is UP when the state of the primary session is UP or the
   state of the shadow session is UP.

   Final state is DOWN when both the state of the primary session is
   DOWN and the state of the shadow session is DOWN.
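
   The consolidation rule can be sketched as below.  The enum mirrors
   the session states of [RFC5880]; the function name is hypothetical.
   This document only defines the outcome for UP and DOWN, so treating
   every other state (Init, AdminDown) as "not UP" is an assumption of
   this sketch.

```python
from enum import Enum


class SessionState(Enum):
    ADMIN_DOWN = 0
    DOWN = 1
    INIT = 2
    UP = 3


def consolidate(primary: SessionState, shadow: SessionState) -> SessionState:
    """Final state reported to the clients of BFD for one target."""
    # UP as soon as either session is UP; otherwise report DOWN.
    if primary is SessionState.UP or shadow is SessionState.UP:
        return SessionState.UP
    return SessionState.DOWN


# A crash of the primary BFD entity alone does not change the final state.
after_primary_crash = consolidate(SessionState.DOWN, SessionState.UP)
# A real path failure takes both sessions down, and so the final state.
after_path_failure = consolidate(SessionState.DOWN, SessionState.DOWN)
```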

7.  LSP ping bootstrapped BFD sessions

   This specification aims to introduce the BFD redundancy concept to
   the various flavors of BFD while minimizing disruption to existing
   implementations.  There is, however, one additional change required
   in order to support the LSP ping bootstrapped BFD sessions described
   in [RFC5884].

   This specification defines a new optional TLV to be carried in an LSP
   ping packet.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                        Discriminator                          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


   This TLV has a length of 4.  The value contains the 4-byte local
   discriminator that the LSR, sending the LSP ping message, associates
   with the shadow BFD session.  TBD: IANA to assign optional type.
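
   A sketch of how this TLV could be encoded and decoded is shown
   below.  It assumes the usual LSP ping TLV layout of a 2-octet Type
   and a 2-octet Length in front of the Value, all in network byte
   order.  Because the type is still to be assigned, the constant below
   is a purely hypothetical placeholder, and the function names are
   illustrative.

```python
import struct

# Hypothetical placeholder only: the real type value is TBD (IANA).
SHADOW_DISCR_TLV_TYPE = 0xFFFF
SHADOW_DISCR_TLV_LEN = 4


def encode_shadow_discr_tlv(discriminator: int,
                            tlv_type: int = SHADOW_DISCR_TLV_TYPE) -> bytes:
    """Build Type (2 octets), Length (2 octets), Discriminator (4 octets)."""
    return struct.pack("!HHI", tlv_type, SHADOW_DISCR_TLV_LEN, discriminator)


def decode_shadow_discr_tlv(data: bytes) -> int:
    """Return the shadow session discriminator carried in the TLV."""
    if len(data) < 8:
        raise ValueError("TLV is shorter than 8 octets")
    _tlv_type, length, discriminator = struct.unpack("!HHI", data[:8])
    if length != SHADOW_DISCR_TLV_LEN:
        raise ValueError("shadow discriminator TLV must have length 4")
    return discriminator


tlv = encode_shadow_discr_tlv(0x01020304)
```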

   Upon reception of this optional TLV, the egress LSR is to create a
   shadow session for the specified FEC, if local constraints allow,
   with Your Discriminator set to the value specified in the TLV.  This
   TLV MAY be included in the LSP ping which carries the BFD
   discriminator TLV of the corresponding primary session, or it MAY be
   carried in a separate LSP ping packet which does not carry the BFD
   discriminator TLV of the corresponding primary session.  In both
   cases, the egress LSR MUST associate the primary and shadow sessions
   with each other in the state consolidation module.

8.  Scale aspect

   The BFD module becomes more resilient when the shadow BFD capability
   is enabled.  However, when the shadow BFD capability is enabled on a
   system, the total number of BFD sessions hosted on the system
   increases by the number of shadow BFD sessions.  For the same number
   of BFD monitored targets, more system resources are used.  Solving
   this scale issue is outside the scope of this document.  However,
   some techniques which can be considered are listed below:

   1.  Relax (i.e., lengthen) the configured BFD intervals of some or
       all BFD sessions.

   2.  Allow an implementation to run shadow sessions at a slower rate.
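
   The effect on transmit load can be estimated with simple arithmetic,
   as in the sketch below.  The target count and the interval values are
   made-up example numbers, and the function name is hypothetical.

```python
def tx_packets_per_second(sessions: int, tx_interval_ms: float) -> float:
    """Control packets per second sent for a set of BFD sessions."""
    return sessions * 1000.0 / tx_interval_ms


targets = 1000

# Primary sessions only, each transmitting every 50 ms.
primary_only = tx_packets_per_second(targets, 50)               # 20000.0

# Shadow sessions added at the same 50 ms interval: load doubles.
same_rate = primary_only + tx_packets_per_second(targets, 50)   # 40000.0

# Shadow sessions added at a slower 200 ms interval: 25% more load.
slower_shadow = primary_only + tx_packets_per_second(targets, 200)  # 25000.0
```

   Note that running the shadow session at a slower rate also gives it
   a longer failure detection time than the primary session.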

9.  IANA Considerations

   IANA is requested to assign an optional type for the new LSP ping
   TLV.

10.  Security Considerations

   This document does not introduce any additional security issues.  The
   security mechanisms defined in [RFC5880] apply to this document.

11.  Acknowledgements

   The authors would like to thank Aswatnarayan Raghuram from AT&T for
   providing requirements and helpful comments.

   The authors would like to thank Gregory Mirsky and Alexander
   Vainshtein for providing insightful comments.

   The authors would like to thank Srihari Raghavan and Mallik Mudigonda
   from Cisco Systems for providing valuable comments regarding LSP ping
   bootstrapped sessions.

12.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, June 2010.

   [RFC5881]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881, June
              2010.

   [RFC5883]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD) for Multihop Paths", RFC 5883, June 2010.

   [RFC5884]  Aggarwal, R., Kompella, K., Nadeau, T., and G. Swallow,
              "Bidirectional Forwarding Detection (BFD) for MPLS Label
              Switched Paths (LSPs)", RFC 5884, June 2010.

   [RFC5885]  Nadeau, T. and C. Pignataro, "Bidirectional Forwarding
              Detection (BFD) for the Pseudowire Virtual Circuit
              Connectivity Verification (VCCV)", RFC 5885, June 2010.

   [RFC6428]  Allan, D., Swallow, G., Ed., and J. Drake, Ed., "Proactive
              Connectivity Verification, Continuity Check, and Remote
              Defect Indication for the MPLS Transport Profile", RFC
              6428, November 2011.

Authors' Addresses

   Marc Binderberger
   Cisco Systems

   Email: mbinderb@cisco.com


   Nobo Akiya
   Cisco Systems

   Email: nobo@cisco.com