Network Working Group J. Chroboczek Internet-Draft PPS, University of Paris-Diderot Intended status: Informational April 4, 2015 Expires: October 6, 2015 Babel doesn't care draft-chroboczek-babel-doesnt-care-00 Abstract Where we claim that, unlike typical IGPs, the Babel routing protocol just doesn't care. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on October 6, 2015. Copyright Notice Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Chroboczek Expires October 6, 2015 [Page 1] Internet-Draft Babel doesn't care April 2015 Table of Contents 1. Introduction: why Babel doesn't care . . . . . . . . . . . . 2 1.1. Consequences and applicability of the protocol . . . . . 3 2. Structure of the Babel routing protocol . . . . . . . . . . . 3 2.1. Distance-vector protocol . . . . . . . . . . . . . . . . 4 2.2. Link sensing . . . . . . . . . . . . . . . . . . . . . . 4 2.3. Metric agnosticity . . . . . . . . . . . . . . . . . . . 5 2.4. Flexible route selection policies . . . . . . . . . . . . 5 2.5. Loop avoidance . . . . . . . . . . . . . . . . . . . . . 5 2.6. Blackhole elimination . . . . . . . . . . . . . . . . . . 6 3. Examples of how Babel doesn't care . . . . . . . . . . . . . 6 3.1. Lossy networks . . . . . . . . . . . . . . . . . . . . . 6 3.2. Non-transitive (mesh) networks . . . . . . . . . . . . . 7 3.3. Non-default route selection policies . . . . . . . . . . 7 3.4. Non-distributive metrics . . . . . . . . . . . . . . . . 7 3.5. Delay-based routing . . . . . . . . . . . . . . . . . . . 8 3.6. Source-specific routing . . . . . . . . . . . . . . . . . 9 3.7. Interoperability . . . . . . . . . . . . . . . . . . . . 9 4. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . 9 5. References . . . . . . . . . . . . . . . . . . . . . . . . . 10 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 11 1. Introduction: why Babel doesn't care The core of the Babel routing protocol [RFC6126] consists of a distance vector protocol based on a distributed variant of the Bellman-Ford algorithm combined with a loop avoidance and a blackhole elimination mechanism. As long as these mechanisms are left intact, Babel will push packets in roughly the right direction following a loop-free path. Babel doesn't care what happens in the network, in the metric, or in the route selection policy, or whether routing information is propagated in a timely manner through the network. In particular, Babel has been shown to work on unstable networks, on wireless networks, on mesh networks, on hybrid networks (networks composed of an unstable meshy bit combined with a stable wired infrastructure), on overlay networks, and has been extended with source-specific routing. Some of these extensions involved fundamental changes to the structure of metrics or even the forwarding semantics, but, to a great extent, Babel just didn't care. This is in contrast with familiar link-state IGPs, such as OSPF and IS-IS. These link-state protocols deeply care that all nodes in a given area have their link state databases synchronised in a timely manner, lest routing loops and other pathologies occur. OSPF and IS- IS therefore require careful (see?) implementation of timely and Chroboczek Expires October 6, 2015 [Page 2] Internet-Draft Babel doesn't care April 2015 reliable flooding, do not support dynamically computed metrics without careful attention to stability issues, do not allow any form of filtering or non-default route selection except at area boundaries, and require a distributive metric. 1.1. Consequences and applicability of the protocol Babel doesn't care where the routing information comes from -- whatever information it is given, it maintains a set of loop-free paths. As we show in Section 3, this has allowed Babel to either work out of the box or be easily adapted to a wide range of situations that are difficult or outright impossible to solve with the familiar link-state routing protocols. This enormous flexibility of the protocol makes Babel particularly adapted to tricky networks where the features of familiar routing protocols (shortest-path next-hop routing over transitive links with statically defined metrics) are not good enough. This can include mesh networks, overlay networks, radio networks, etc. Since all of these kinds of networks are handled by a single protocol, Babel is particularly applicable to hybrid networks, whose different parts need different algorithms, for example networks composed of a meshy part and a stable, wired part. Babel is not necessarily the best choice in large, stable, carefully administered networks, where a more rigid protocol such as OSPF or IS-IS will have lower overhead and be easier to debug. 2. Structure of the Babel routing protocol The Babel routing protocol [RFC6126] is a loop-avoiding distance vector protocol. The core Babel protocol is reasonably simple (it is described in 30 pages in RFC 6126 and has been reimplemented from scratch in slightly over 20 hours by Markus Stenberg), and consists of the following components: o a distance vector routing protocol that is metric-agnostic and supports flexible routing policies; o explicit mechanisms (Hello/IHU) for link sensing and bidirectional reachability detection; o a loop-avoidance mechanism that makes strong guarantees even in transient state; o a provably complete blackhole avoidance mechanism. Chroboczek Expires October 6, 2015 [Page 3] Internet-Draft Babel doesn't care April 2015 In the rest of this section, we describe these mechanisms in more detail. 2.1. Distance-vector protocol At the core of Babel lies a distance-vector routing protocol, based on a distributed variant of the Bellman-Ford algorithm, similiar to the core of BGP or EIGRP. Distance-vector IGPs have become unfashionable sometime in the 1990s. However, distance vector has a number of very pleasant properties: it is simple to implement, reasonably robust in the presence of implementation mistakes, and allows the use of redundant routing tables (routing tables with multiple routes to a single destination), which makes reconvergence after a link failure almost instantaneous (no packet exchanges) when the alternate routes are already available. Another important property is that, unlike link-state, distance vector builds its routes consistently with the direction of forwarding in next-hop routing. This makes it robust in the presence of routing inconsistencies, a property on which Babel relies when it delays sending updates or uses a non-distributive metric (see Section 3.4 below). 2.2. Link sensing Like most modern routing protocols, Babel has a simple and lightweight mechanism for detecting neighbours and eliminating assymetric neighbours. Every node sends periodic Hello TLVs; somewhat less often, every node sends IHU ("I Heard You") TLVs that contain the list of all the nodes from which it has heard a Hello. Hello TLVs carry a sequence number (not to be confused with the sequence number used for loop avoidance). Additionally, both Hello and IHU TLVs carry an explicit interval -- an upper bound on the time after which a new Hello or IHU can be expected. This allows nodes with different parameters to reliably establish neighbour relationships, since the hold time for a neighbour can be determined dynamically from the explicit interval. Additionally, the Hello seqno allows a node to vary its Hello to send unscheduled Hellos without confusing its neigbours, for example after detecting a mobility event. Both Hello and IHU TLVs are extensible. For example, the delay-based routing extension [BABEL-RTT] uses Hellos and IHUs to carry timestamps and timestamp echoes respectively. Chroboczek Expires October 6, 2015 [Page 4] Internet-Draft Babel doesn't care April 2015 2.3. Metric agnosticity Distance vector's robustness and the extensibility of Babel's packet format conspire to make Babel metric-agnostic: a metric in Babel doesn't need to be an integer, it can be any sort of algebra that is equipped with a strictly-monotonic ordering (Section 3.5.2 of [RFC6126]). In particular, a Babel metric need not be distributive -- a property that is used by Babel's radio interference extension [BABEL-Z]. More about this in Section 3.4 below. What is more, the loop-avoidance features of Babel ensure that a packet is pushed in roughly the right direction, following a loop- free path, even when the metric is extremely unstable. This property is what makes delay-based routing work reliably even in the presence of congestion. 2.4. Flexible route selection policies Distance vector is flexible, in the sense that routes can be dropped ("filtered") out at any point in the network, and that a node can that is unwilling to forward packets may advertise an artificially inflated metric ("prepending"). This is very different from link state, where filtering can only be done at area boundaries. The reference implementation of Babel includes a powerful language for filtering and prepending, and this feature has proven to be extremely useful to work around flaws in the data plane (e.g. for discarding NATed or firewalled routes). What is more, route selection need not choose the route with smallest metric -- any route with finite metric will do, as long as the choice is consistent with the data plane. This property is used by Babel to improve stability by applying hysteresis during route selection. 2.5. Loop avoidance Like EIGRP and BGP, Babel is a loop-avoiding protocol: only those routes are accepted that can be proven to be loop-free. A route that has this property is said to be "feasible". Babel's feasibility condition is based on the one used in EIGRP [DUAL], but is structured similarly to DSDV [DSDV]. Thus, Babel avoids EIGRP's active phase (not needed with DSDV-style feasibility) while also avoiding DSDV's issues with retractions (the even-odd mechanism of DSDV is not necessary with EIGRP-style feasibility) and minor metric fluctuations (EIGRP-style feasibility can cushion up to one hop of fluctuation). Chroboczek Expires October 6, 2015 [Page 5] Internet-Draft Babel doesn't care April 2015 Babel's loop-freedom properties are well understood (Section 1.1 of [RFC6126]) and have been proven (the proof has not been published yet). 2.6. Blackhole elimination It is well known that DSDV-style loop avoidance, as used in Babel, has an issue: it may sometimes cause spurious blackholes. This situation is known as "starvation" in Babel (Section 2.5 of [RFC6126]). DSDV attempted to work around the issue by updating a route's seqno periodically, which made DSDV unsuitable for high-mobility scenarios: after moving a DSDV laptop to a new place, the user would need to wait for up to 30 seconds for a new seqno to be generated and propagate across the network. What is more, the need to propagate new seqnos in a timely manner put an upper bound on DSDV's update interval. Babel uses a different mechanism for clearing spurious blackholes, which is based on the observation that it should be the starving node that triggers a seqno update. When a node detects that it is suffering from starvation (all the routes to a given destination are unfeasible), it sends an end-to-end "seqno request" that requests that the source increase its seqno number. Since seqno requests follow routes that have not been proven to be loop-free, they are equipped with a hop count; combined with duplicate request detection, this mechanism is complete and reasonably efficient. 3. Examples of how Babel doesn't care In the rest of this document, we describe a number of situations in which Babel just doesn't care. 3.1. Lossy networks Babel was designed to support lossy networks well, and one of the metrics suggested in Appendix A of [RFC6126] depends on packet loss. This metric tends to fluctuate in normal usage, but two features of Babel conspire to make it work well: Babel's feasibility function is able to keep a route feasible when the metric doesn't fluctuate by more than the cost of one hop, and Babel's loop avoidance ensures that packets are pushed in the right direction even when the metric fluctuates. The metric may fluctuate -- but Babel doesn't care. Chroboczek Expires October 6, 2015 [Page 6] Internet-Draft Babel doesn't care April 2015 3.2. Non-transitive (mesh) networks A non-transitive (or mesh) network is one in which there is no well- defined notion of link -- where the neighbour relationship between interfaces is not transitive. Routing protocols such as OSPF and IS- IS require a transitive network. Babel just doesn't care. (In fact, it does care a little -- the split horizon optimisation can be enabled on a transitive, lossless link.) 3.3. Non-default route selection policies Babel doesn't require that a router always pick the route with smallest metric. The route selection procedure uses this property to prefer stable routes (routes that have been around for a while) to potentially unstable ones (routes that have popped up recently), and could potentially take into account information that has been obtained from other sources than the protocol (say, battery charge information). Choosing a path other than the shortest one would break a link-state protocol. Babel doesn't care. 3.4. Non-distributive metrics In most routing protocols, metrics consist of some set of bounded integers equipped with the usual addition. But there is no requirement that the metric have such a simple structure. Babel will work with any metric that is strictly monotonic on the left: m < n.m Link-state protocols make one further requirement on the metric, called left distributivity or isotonicity, which is not required by distance vector protocols: m <= m' ---> n.m <= n.m' To see how a non-distributive metric can arise, consider Babel's radio interference extension [BABEL-Z], which aims to minimise the number of times a route crosses a given frequency. Suppose that A is a single radio router, while routers B and C have two radios; suppose further that the dashed frequency has less packet loss than the dotted frequency: Chroboczek Expires October 6, 2015 [Page 7] Internet-Draft Babel doesn't care April 2015 q p ----- A ----- B ..... C q' Then the preferred route from A to C is p.q', while the preferred route from B to C is q. The metric is non-distributive: q <= q' but p.q > p.q' (A note for BGP specialists: this is analogous to what happens in BGP when q is a customer route, and is therefore preferred by B, while q' is a shorter peer route, and would therefore be preferable for A. Many familiar BGP policies can be modelled by non-distributive metrics.) Using a non-distributive metric is impossible in link-state protocols, where it may lead to persistent routing loops. Babel doesn't care, and converges to a Nash equilibrium. (This assertion is currently unproven, but similar properties are known to hold for Bellman-Ford.) 3.5. Delay-based routing Overlay networks -- networks built over tunnels or VPNs -- present a complex network topology as a single hop. Efficient routing in such networks requires distinguishing between local and remote hops, which is usually done by a human administrator manually tweaking the metric. The delay-based extension to Babel [BABEL-RTT] [DELAY-BASED] uses RTT measurements to automatically prefer local routes. Such an approach leads to a negative feedback loop between the control plane and the data plane, which naturally leads to routing oscillations (a route is preferred when it has low RTT, which causes traffic to flow through it, which in turn causes its RTT to increase, which causes a different route to be preferred). In a link-state protocol, such oscillations would lead to repeated refloodings and SPF recomputation. Babel, on the other hand, doesn't care: even in the presence of oscillations, it is pushing packets in the right direction following loop-free paths. (Oscillations cause packet reordering, which is something the transport layer cares about, so while Babel itself might not care, we have equipped our implementation with a number of mechanisms to limit the amount of oscillation.) Chroboczek Expires October 6, 2015 [Page 8] Internet-Draft Babel doesn't care April 2015 3.6. Source-specific routing On advice from the Homement working group, we have extended Babel with support for source-specific routing [BABEL-SS] [SS-ROUTING]. This is a sizeable extension: it changes the behaviour of the data plane (packets are forwarded according to both source and destination), it changes the notion of most-specific route, and it requires no less than three new TLVs to be carried by the protocol. In this particular case, Babel does care, to a certain extent. The new TLVs needed to be very carefully encoded so that even in the presence of non-source-specific routers, the data plane remains consistent with the control plane, and we needed to make some efforts to ensure that the blackhole elimination mechanism wouldn't be broken by source-specific routing. From a different point of view, however, Babel still doesn't care: the loop avoidance mechanism applied with no changes to the new kind of routes. 3.7. Interoperability All of the extensions to Babel described above interoperate, to the extent possible. For example, a delay-based router will reliably exchange routes with a non-delay-based one, but will not perform any RTT estimation. Similarly, a source-specific router will reliably exchange non-specific routes with a non-source-specific router. While achieving this requires careful encoding of the extension data so that just the right bits are discarded by routers that don't understand the extension (Section 4 of [BABEL-EXT]) and checking that the forwarding behaviour remains consistent with the routing data, Babel itself doesn't care -- it just computes paths and applies the loop avoidance mechanism, whatever the source of the routing information. 4. Conclusion We have solved within the Babel routing protocol a number of interesting routing problems that are difficult or outright impossible to solve in other routing protocols, and we have done that without breaking interoperability between the various variants of the protocol. We claim that this makes Babel particularly suitable for the kind of messy networks where such problems arise, and in particular to hybrid networks, where different parts of the network require different mechanisms to work well but don't necessarily wish to run multiple routing protocols. But that's us. Babel doesn't care. Chroboczek Expires October 6, 2015 [Page 9] Internet-Draft Babel doesn't care April 2015 5. References [BABEL-EXT] Chroboczek, J., "Extension Mechanism for the Babel Routing Protocol", draft-chroboczek-babel-extension-mechanism-04 (work in progress), March 2015. [BABEL-RTT] Jonglez, B. and J. Chroboczek, "Delay-based Metric Extension for the Babel Routing Protocol", Internet Draft draft-jonglez-babel-rtt-extension-00, July 2014. [BABEL-SS] Boutier, M. and J. Chroboczek, "Source-Specific Routing in Babel", draft-boutier-babel-source-specific-00 (work in progress), November 2014. [BABEL-Z] Chroboczek, J., "Diversity Routing for the Babel Routing Protocol", draft-chroboczek-babel-diversity-routing-00 (work in progress), July 2014. [DELAY-BASED] Jonglez, B., Boutier, M., and J. Chroboczek, "A delay- based routing metric", March 2015, . Submitted for publication. [DSDV] Perkins, C. and P. Bhagwat, "Highly Dynamic Destination- Sequenced Distance-Vector Routing (DSDV) for Mobile Computers", ACM SIGCOMM'94 Conference on Communications Architectures, Protocols and Applications 234-244, 1994. [DUAL] Garcia Luna Aceves, J., "Loop-Free Routing Using Diffusing Computations", IEEE/ACM Transactions on Networking 1:1, February 1993. [RFC6126] Chroboczek, J., "The Babel Routing Protocol", RFC 6126, February 2011. [SS-ROUTING] Boutier, M. and J. Chroboczek, "Source-sensitive routing", March 2015, . To appear in Proc. IFIP Networking 2015. Chroboczek Expires October 6, 2015 [Page 10] Internet-Draft Babel doesn't care April 2015 Author's Address Juliusz Chroboczek PPS, University of Paris-Diderot Case 7014 75205 Paris Cedex 13 France Email: jch@pps.univ-paris-diderot.fr Chroboczek Expires October 6, 2015 [Page 11]