Internet DRAFT - draft-chroboczek-babel-doesnt-care

draft-chroboczek-babel-doesnt-care







Network Working Group                                      J. Chroboczek
Internet-Draft                          PPS, University of Paris-Diderot
Intended status: Informational                             April 4, 2015
Expires: October 6, 2015


                           Babel doesn't care
                 draft-chroboczek-babel-doesnt-care-00

Abstract

   Where we claim that, unlike typical IGPs, the Babel routing protocol
   just doesn't care.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 6, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.






Chroboczek               Expires October 6, 2015                [Page 1]

Internet-Draft             Babel doesn't care                 April 2015


Table of Contents

   1.  Introduction: why Babel doesn't care  . . . . . . . . . . . .   2
     1.1.  Consequences and applicability of the protocol  . . . . .   3
   2.  Structure of the Babel routing protocol . . . . . . . . . . .   3
     2.1.  Distance-vector protocol  . . . . . . . . . . . . . . . .   4
     2.2.  Link sensing  . . . . . . . . . . . . . . . . . . . . . .   4
     2.3.  Metric agnosticity  . . . . . . . . . . . . . . . . . . .   5
     2.4.  Flexible route selection policies . . . . . . . . . . . .   5
     2.5.  Loop avoidance  . . . . . . . . . . . . . . . . . . . . .   5
     2.6.  Blackhole elimination . . . . . . . . . . . . . . . . . .   6
   3.  Examples of how Babel doesn't care  . . . . . . . . . . . . .   6
     3.1.  Lossy networks  . . . . . . . . . . . . . . . . . . . . .   6
     3.2.  Non-transitive (mesh) networks  . . . . . . . . . . . . .   7
     3.3.  Non-default route selection policies  . . . . . . . . . .   7
     3.4.  Non-distributive metrics  . . . . . . . . . . . . . . . .   7
     3.5.  Delay-based routing . . . . . . . . . . . . . . . . . . .   8
     3.6.  Source-specific routing . . . . . . . . . . . . . . . . .   9
     3.7.  Interoperability  . . . . . . . . . . . . . . . . . . . .   9
   4.  Conclusion  . . . . . . . . . . . . . . . . . . . . . . . . .   9
   5.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  10
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  11

1.  Introduction: why Babel doesn't care

   The core of the Babel routing protocol [RFC6126] consists of a
   distance vector protocol based on a distributed variant of the
   Bellman-Ford algorithm combined with a loop avoidance and a blackhole
   elimination mechanism.  As long as these mechanisms are left intact,
   Babel will push packets in roughly the right direction following a
   loop-free path.

   Babel doesn't care what happens in the network, in the metric, or in
   the route selection policy, or whether routing information is
   propagated in a timely manner through the network.  In particular,
   Babel has been shown to work on unstable networks, on wireless
   networks, on mesh networks, on hybrid networks (networks composed of
   an unstable meshy bit combined with a stable wired infrastructure),
   on overlay networks, and has been extended with source-specific
   routing.  Some of these extensions involved fundamental changes to
   the structure of metrics or even the forwarding semantics, but, to a
   great extent, Babel just didn't care.

   This is in contrast with familiar link-state IGPs, such as OSPF and
   IS-IS.  These link-state protocols deeply care that all nodes in a
   given area have their link state databases synchronised in a timely
   manner, lest routing loops and other pathologies occur.  OSPF and IS-
   IS therefore require careful (see?) implementation of timely and



Chroboczek               Expires October 6, 2015                [Page 2]

Internet-Draft             Babel doesn't care                 April 2015


   reliable flooding, do not support dynamically computed metrics
   without careful attention to stability issues, do not allow any form
   of filtering or non-default route selection except at area
   boundaries, and require a distributive metric.

1.1.  Consequences and applicability of the protocol

   Babel doesn't care where the routing information comes from --
   whatever information it is given, it maintains a set of loop-free
   paths.  As we show in Section 3, this has allowed Babel to either
   work out of the box or be easily adapted to a wide range of
   situations that are difficult or outright impossible to solve with
   the familiar link-state routing protocols.

   This enormous flexibility of the protocol makes Babel particularly
   adapted to tricky networks where the features of familiar routing
   protocols (shortest-path next-hop routing over transitive links with
   statically defined metrics) are not good enough.  This can include
   mesh networks, overlay networks, radio networks, etc.  Since all of
   these kinds of networks are handled by a single protocol, Babel is
   particularly applicable to hybrid networks, whose different parts
   need different algorithms, for example networks composed of a meshy
   part and a stable, wired part.

   Babel is not necessarily the best choice in large, stable, carefully
   administered networks, where a more rigid protocol such as OSPF or
   IS-IS will have lower overhead and be easier to debug.

2.  Structure of the Babel routing protocol

   The Babel routing protocol [RFC6126] is a loop-avoiding distance
   vector protocol.  The core Babel protocol is reasonably simple (it is
   described in 30 pages in RFC 6126 and has been reimplemented from
   scratch in slightly over 20 hours by Markus Stenberg), and consists
   of the following components:

   o  a distance vector routing protocol that is metric-agnostic and
      supports flexible routing policies;

   o  explicit mechanisms (Hello/IHU) for link sensing and bidirectional
      reachability detection;

   o  a loop-avoidance mechanism that makes strong guarantees even in
      transient state;

   o  a provably complete blackhole avoidance mechanism.





Chroboczek               Expires October 6, 2015                [Page 3]

Internet-Draft             Babel doesn't care                 April 2015


   In the rest of this section, we describe these mechanisms in more
   detail.

2.1.  Distance-vector protocol

   At the core of Babel lies a distance-vector routing protocol, based
   on a distributed variant of the Bellman-Ford algorithm, similiar to
   the core of BGP or EIGRP.

   Distance-vector IGPs have become unfashionable sometime in the 1990s.
   However, distance vector has a number of very pleasant properties: it
   is simple to implement, reasonably robust in the presence of
   implementation mistakes, and allows the use of redundant routing
   tables (routing tables with multiple routes to a single destination),
   which makes reconvergence after a link failure almost instantaneous
   (no packet exchanges) when the alternate routes are already
   available.

   Another important property is that, unlike link-state, distance
   vector builds its routes consistently with the direction of
   forwarding in next-hop routing.  This makes it robust in the presence
   of routing inconsistencies, a property on which Babel relies when it
   delays sending updates or uses a non-distributive metric (see
   Section 3.4 below).

2.2.  Link sensing

   Like most modern routing protocols, Babel has a simple and
   lightweight mechanism for detecting neighbours and eliminating
   assymetric neighbours.  Every node sends periodic Hello TLVs;
   somewhat less often, every node sends IHU ("I Heard You") TLVs that
   contain the list of all the nodes from which it has heard a Hello.

   Hello TLVs carry a sequence number (not to be confused with the
   sequence number used for loop avoidance).  Additionally, both Hello
   and IHU TLVs carry an explicit interval -- an upper bound on the time
   after which a new Hello or IHU can be expected.  This allows nodes
   with different parameters to reliably establish neighbour
   relationships, since the hold time for a neighbour can be determined
   dynamically from the explicit interval.  Additionally, the Hello
   seqno allows a node to vary its Hello to send unscheduled Hellos
   without confusing its neigbours, for example after detecting a
   mobility event.

   Both Hello and IHU TLVs are extensible.  For example, the delay-based
   routing extension [BABEL-RTT] uses Hellos and IHUs to carry
   timestamps and timestamp echoes respectively.




Chroboczek               Expires October 6, 2015                [Page 4]

Internet-Draft             Babel doesn't care                 April 2015


2.3.  Metric agnosticity

   Distance vector's robustness and the extensibility of Babel's packet
   format conspire to make Babel metric-agnostic: a metric in Babel
   doesn't need to be an integer, it can be any sort of algebra that is
   equipped with a strictly-monotonic ordering (Section 3.5.2 of
   [RFC6126]).  In particular, a Babel metric need not be distributive
   -- a property that is used by Babel's radio interference extension
   [BABEL-Z].  More about this in Section 3.4 below.

   What is more, the loop-avoidance features of Babel ensure that a
   packet is pushed in roughly the right direction, following a loop-
   free path, even when the metric is extremely unstable.  This property
   is what makes delay-based routing work reliably even in the presence
   of congestion.

2.4.  Flexible route selection policies

   Distance vector is flexible, in the sense that routes can be dropped
   ("filtered") out at any point in the network, and that a node can
   that is unwilling to forward packets may advertise an artificially
   inflated metric ("prepending").  This is very different from link
   state, where filtering can only be done at area boundaries.  The
   reference implementation of Babel includes a powerful language for
   filtering and prepending, and this feature has proven to be extremely
   useful to work around flaws in the data plane (e.g. for discarding
   NATed or firewalled routes).

   What is more, route selection need not choose the route with smallest
   metric -- any route with finite metric will do, as long as the choice
   is consistent with the data plane.  This property is used by Babel to
   improve stability by applying hysteresis during route selection.

2.5.  Loop avoidance

   Like EIGRP and BGP, Babel is a loop-avoiding protocol: only those
   routes are accepted that can be proven to be loop-free.  A route that
   has this property is said to be "feasible".

   Babel's feasibility condition is based on the one used in EIGRP
   [DUAL], but is structured similarly to DSDV [DSDV].  Thus, Babel
   avoids EIGRP's active phase (not needed with DSDV-style feasibility)
   while also avoiding DSDV's issues with retractions (the even-odd
   mechanism of DSDV is not necessary with EIGRP-style feasibility) and
   minor metric fluctuations (EIGRP-style feasibility can cushion up to
   one hop of fluctuation).





Chroboczek               Expires October 6, 2015                [Page 5]

Internet-Draft             Babel doesn't care                 April 2015


   Babel's loop-freedom properties are well understood (Section 1.1 of
   [RFC6126]) and have been proven (the proof has not been published
   yet).

2.6.  Blackhole elimination

   It is well known that DSDV-style loop avoidance, as used in Babel,
   has an issue: it may sometimes cause spurious blackholes.  This
   situation is known as "starvation" in Babel (Section 2.5 of
   [RFC6126]).

   DSDV attempted to work around the issue by updating a route's seqno
   periodically, which made DSDV unsuitable for high-mobility scenarios:
   after moving a DSDV laptop to a new place, the user would need to
   wait for up to 30 seconds for a new seqno to be generated and
   propagate across the network.  What is more, the need to propagate
   new seqnos in a timely manner put an upper bound on DSDV's update
   interval.

   Babel uses a different mechanism for clearing spurious blackholes,
   which is based on the observation that it should be the starving node
   that triggers a seqno update.  When a node detects that it is
   suffering from starvation (all the routes to a given destination are
   unfeasible), it sends an end-to-end "seqno request" that requests
   that the source increase its seqno number.  Since seqno requests
   follow routes that have not been proven to be loop-free, they are
   equipped with a hop count; combined with duplicate request detection,
   this mechanism is complete and reasonably efficient.

3.  Examples of how Babel doesn't care

   In the rest of this document, we describe a number of situations in
   which Babel just doesn't care.

3.1.  Lossy networks

   Babel was designed to support lossy networks well, and one of the
   metrics suggested in Appendix A of [RFC6126] depends on packet loss.
   This metric tends to fluctuate in normal usage, but two features of
   Babel conspire to make it work well: Babel's feasibility function is
   able to keep a route feasible when the metric doesn't fluctuate by
   more than the cost of one hop, and Babel's loop avoidance ensures
   that packets are pushed in the right direction even when the metric
   fluctuates.

   The metric may fluctuate -- but Babel doesn't care.





Chroboczek               Expires October 6, 2015                [Page 6]

Internet-Draft             Babel doesn't care                 April 2015


3.2.  Non-transitive (mesh) networks

   A non-transitive (or mesh) network is one in which there is no well-
   defined notion of link -- where the neighbour relationship between
   interfaces is not transitive.  Routing protocols such as OSPF and IS-
   IS require a transitive network.  Babel just doesn't care.

   (In fact, it does care a little -- the split horizon optimisation can
   be enabled on a transitive, lossless link.)

3.3.  Non-default route selection policies

   Babel doesn't require that a router always pick the route with
   smallest metric.  The route selection procedure uses this property to
   prefer stable routes (routes that have been around for a while) to
   potentially unstable ones (routes that have popped up recently), and
   could potentially take into account information that has been
   obtained from other sources than the protocol (say, battery charge
   information).

   Choosing a path other than the shortest one would break a link-state
   protocol.  Babel doesn't care.

3.4.  Non-distributive metrics

   In most routing protocols, metrics consist of some set of bounded
   integers equipped with the usual addition.  But there is no
   requirement that the metric have such a simple structure.  Babel will
   work with any metric that is strictly monotonic on the left:

   m < n.m

   Link-state protocols make one further requirement on the metric,
   called left distributivity or isotonicity, which is not required by
   distance vector protocols:

   m <= m' ---> n.m <= n.m'

   To see how a non-distributive metric can arise, consider Babel's
   radio interference extension [BABEL-Z], which aims to minimise the
   number of times a route crosses a given frequency.  Suppose that A is
   a single radio router, while routers B and C have two radios; suppose
   further that the dashed frequency has less packet loss than the
   dotted frequency:







Chroboczek               Expires October 6, 2015                [Page 7]

Internet-Draft             Babel doesn't care                 April 2015


               q
       p     -----
   A ----- B ..... C
               q'

   Then the preferred route from A to C is p.q', while the preferred
   route from B to C is q.  The metric is non-distributive:

   q <= q' but p.q > p.q'

   (A note for BGP specialists: this is analogous to what happens in BGP
   when q is a customer route, and is therefore preferred by B, while q'
   is a shorter peer route, and would therefore be preferable for A.
   Many familiar BGP policies can be modelled by non-distributive
   metrics.)

   Using a non-distributive metric is impossible in link-state
   protocols, where it may lead to persistent routing loops.  Babel
   doesn't care, and converges to a Nash equilibrium.  (This assertion
   is currently unproven, but similar properties are known to hold for
   Bellman-Ford.)

3.5.  Delay-based routing

   Overlay networks -- networks built over tunnels or VPNs -- present a
   complex network topology as a single hop.  Efficient routing in such
   networks requires distinguishing between local and remote hops, which
   is usually done by a human administrator manually tweaking the
   metric.

   The delay-based extension to Babel [BABEL-RTT] [DELAY-BASED] uses RTT
   measurements to automatically prefer local routes.  Such an approach
   leads to a negative feedback loop between the control plane and the
   data plane, which naturally leads to routing oscillations (a route is
   preferred when it has low RTT, which causes traffic to flow through
   it, which in turn causes its RTT to increase, which causes a
   different route to be preferred).  In a link-state protocol, such
   oscillations would lead to repeated refloodings and SPF
   recomputation.  Babel, on the other hand, doesn't care: even in the
   presence of oscillations, it is pushing packets in the right
   direction following loop-free paths.

   (Oscillations cause packet reordering, which is something the
   transport layer cares about, so while Babel itself might not care, we
   have equipped our implementation with a number of mechanisms to limit
   the amount of oscillation.)





Chroboczek               Expires October 6, 2015                [Page 8]

Internet-Draft             Babel doesn't care                 April 2015


3.6.  Source-specific routing

   On advice from the Homement working group, we have extended Babel
   with support for source-specific routing [BABEL-SS] [SS-ROUTING].
   This is a sizeable extension: it changes the behaviour of the data
   plane (packets are forwarded according to both source and
   destination), it changes the notion of most-specific route, and it
   requires no less than three new TLVs to be carried by the protocol.

   In this particular case, Babel does care, to a certain extent.  The
   new TLVs needed to be very carefully encoded so that even in the
   presence of non-source-specific routers, the data plane remains
   consistent with the control plane, and we needed to make some efforts
   to ensure that the blackhole elimination mechanism wouldn't be broken
   by source-specific routing.  From a different point of view, however,
   Babel still doesn't care: the loop avoidance mechanism applied with
   no changes to the new kind of routes.

3.7.  Interoperability

   All of the extensions to Babel described above interoperate, to the
   extent possible.  For example, a delay-based router will reliably
   exchange routes with a non-delay-based one, but will not perform any
   RTT estimation.  Similarly, a source-specific router will reliably
   exchange non-specific routes with a non-source-specific router.

   While achieving this requires careful encoding of the extension data
   so that just the right bits are discarded by routers that don't
   understand the extension (Section 4 of [BABEL-EXT]) and checking that
   the forwarding behaviour remains consistent with the routing data,
   Babel itself doesn't care -- it just computes paths and applies the
   loop avoidance mechanism, whatever the source of the routing
   information.

4.  Conclusion

   We have solved within the Babel routing protocol a number of
   interesting routing problems that are difficult or outright
   impossible to solve in other routing protocols, and we have done that
   without breaking interoperability between the various variants of the
   protocol.  We claim that this makes Babel particularly suitable for
   the kind of messy networks where such problems arise, and in
   particular to hybrid networks, where different parts of the network
   require different mechanisms to work well but don't necessarily wish
   to run multiple routing protocols.

   But that's us.  Babel doesn't care.




Chroboczek               Expires October 6, 2015                [Page 9]

Internet-Draft             Babel doesn't care                 April 2015


5.  References

   [BABEL-EXT]
              Chroboczek, J., "Extension Mechanism for the Babel Routing
              Protocol", draft-chroboczek-babel-extension-mechanism-04
              (work in progress), March 2015.

   [BABEL-RTT]
              Jonglez, B. and J. Chroboczek, "Delay-based Metric
              Extension for the Babel Routing Protocol", Internet Draft
              draft-jonglez-babel-rtt-extension-00, July 2014.

   [BABEL-SS]
              Boutier, M. and J. Chroboczek, "Source-Specific Routing in
              Babel", draft-boutier-babel-source-specific-00 (work in
              progress), November 2014.

   [BABEL-Z]  Chroboczek, J., "Diversity Routing for the Babel Routing
              Protocol", draft-chroboczek-babel-diversity-routing-00
              (work in progress), July 2014.

   [DELAY-BASED]
              Jonglez, B., Boutier, M., and J. Chroboczek, "A delay-
              based routing metric", March 2015,
              <http://arxiv.org/abs/1403.3488>.

              Submitted for publication.

   [DSDV]     Perkins, C. and P. Bhagwat, "Highly Dynamic Destination-
              Sequenced Distance-Vector Routing (DSDV) for Mobile
              Computers", ACM SIGCOMM'94 Conference on Communications
              Architectures, Protocols and Applications 234-244, 1994.

   [DUAL]     Garcia Luna Aceves, J., "Loop-Free Routing Using Diffusing
              Computations", IEEE/ACM Transactions on Networking 1:1,
              February 1993.

   [RFC6126]  Chroboczek, J., "The Babel Routing Protocol", RFC 6126,
              February 2011.

   [SS-ROUTING]
              Boutier, M. and J. Chroboczek, "Source-sensitive routing",
              March 2015, <http://arxiv.org/abs/1403.0445>.

              To appear in Proc. IFIP Networking 2015.






Chroboczek               Expires October 6, 2015               [Page 10]

Internet-Draft             Babel doesn't care                 April 2015


Author's Address

   Juliusz Chroboczek
   PPS, University of Paris-Diderot
   Case 7014
   75205 Paris Cedex 13
   France

   Email: jch@pps.univ-paris-diderot.fr










































Chroboczek               Expires October 6, 2015               [Page 11]