The ISP Column 
A monthly column on things Internet
 
BGP in 2008

March 2009
Geoff Huston

        Here in my part of the world the season has well and truly
        turned from summer to autumn, which means that another year
        has come and gone. I thought that it might be time to give MTU
        examination a rest for a month or more and instead review the
        last 12 months in BGP-land and see what's been happening
        there.

  BGP has been toiling away, literally holding the Internet together,
  for close to two decades now, and nothing seems to be falling off
  the edge of the Internet. So why should we be interested in the
  growth trends for BGP? Here are some possible reasons why this data
  can be useful for folk in Internet business.

  For the ISP network operator, this information may help in
  figuring out how big a router you should buy today if you want it to
  still cope with the full BGP routing load in 3 - 5 years'
  time. Perhaps you might want to work out what FIB size is necessary
  in that time, and what TCAM size is necessary, in which case you may
  want to have a conservative estimate of the anticipated number of
  entries in the routing table over that period. If this applies to
  customers of routing equipment, the same applies to a vendor of such
  equipment: How big a router should a vendor build to cope with the
  BGP load over the next 3 - 5 years? What are the Internet's scaling
  factors at play here?

  Underlying these questions is a more basic set of questions about
  BGP itself. Is BGP scaling or is it failing? Do we need to develop a
  new Inter-Domain Routing protocol to take over from BGP? If so, how
  much time do we have before a new approach is needed? And if we are
  going to head down this path is the problem simply one of routing
  over an ever larger and more diverse population, or is this an
  expression of a more fundamental scaling limitation of the
  Internet's current concepts of names and addresses? In other words,
  if we are facing a major problem with routing scalability do we need
  now to examine alternate models of identity and location separation
  in order to build truly massive and highly diverse networks? Or, is
  routing scaling an intractable problem within the confines of the
  current architecture and we need to shift around the basic building
  blocks of the Internet architecture in order to allow a different
  routing architecture that has radically different scaling
  properties?

  These questions were studied by the Internet Architecture Board at
  its workshop in October 2006, which was written up as RFC4984, and
  projections of routing table inflation in the coming years were a
  source of considerable concern:

   The workshop participants believe that routing scalability is the
   most important problem facing the Internet today and must be solved,
   although the time frame in which these problems need solutions was
   not directly specified.  The routing scalability problem includes the
   size of the DFZ RIB and FIB, the implications of the growth of the
   RIB and FIB on routing convergence times, and the cost, power (and
   hence, heat dissipation) and ASIC real estate requirements of core
   router hardware.

   It is commonly believed that the IPv4 RIB growth has been constrained
   by the limited IPv4 address space.  However, even under this
   constraint, the DFZ IPv4 RIB has been growing at what appears to be
   an accelerating rate [DFZ].  Given that the IPv6 routing architecture
   is the same as the IPv4 architecture (with substantially larger
   address space), if/when IPv6 becomes widely deployed, it is natural
   to predict that routing table growth for IPv6 will only exacerbate
   the situation.

   RFC4984: Report from the IAB Workshop on Routing and Addressing, September 2007

  
  At the time the picture was not looking overly optimistic for the
  longer term prospects of BGP, and the workshop prompted further
  studies of routing techniques and architectures that were capable of
  sustaining a greater level of information aggregation.

   First of all, the workshop participants would like to reiterate the
   importance of solving the routing scalability problem.  They noted
   that the concern over the scalability and flexibility of the routing
   and addressing system has been with us for a very long time, and the
   current growth rate of the DFZ RIB is exceeding our ability to
   engineer the routing infrastructure in an economically feasible way.
   We need to start developing a long-term solution that can last for
   the foreseeable future.

   RFC4984: Report from the IAB Workshop on Routing and Addressing, September 2007

  But I am not going to describe the possible directions for future
  routing systems in this article. The question I'd like to ask here
  is somewhat more pragmatic in nature: has anything changed in this
  perspective on BGP? Is the medium term collapse of BGP through
  scaling overload still a realistic prospect for the routing
  environment? Should we still be concerned about routing scaling? Is
  the BGP sky about to fall on our heads?


The BGP Measurement Environment

  In trying to analyse long baseline data series, the ideal approach
  is to keep as much of the local data gathering environment as stable
  as possible so that the changes that occur in the collected data
  reflect the larger environment and not the local configuration of
  the data collection equipment. In this case the measurement point
  being used is a BGP router configured as AS2.0 (or AS131072 if you
  prefer!). This AS generates no traffic and originates no routes in
  BGP. It's a passive measurement point that has been logging all
  received BGP updates since 1 July 2007, and is the successor to an
  earlier setup located in AS1221. The router is fed with a
  default-free eBGP feed from AS 4608, which is the APNIC network
  located in Australia, AS 4777, which is the APNIC network
  located in Japan, and AS1280, a RIPE RIS Route Collector.  For IPv6
  routes the measurement system is being fed with complete route sets
  from AS1221 (Telstra), AS1280 (RIPE NCC), and AS5539 (ISC). My
  thanks to these folk for their willingness to feed me routing data
  for this work.
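
  The two spellings of the measurement AS number are the same 32-bit
  value written in "high.low" and in plain decimal form. A minimal
  sketch of the conversion follows; the function names are
  illustrative only and are not part of the measurement setup.

```python
def asdot_to_asplain(asdot: str) -> int:
    """Convert an AS number in 'high.low' notation to a plain integer."""
    if "." not in asdot:
        return int(asdot)
    high, low = (int(part) for part in asdot.split("."))
    if not (0 <= high <= 0xFFFF and 0 <= low <= 0xFFFF):
        raise ValueError(f"each half of {asdot!r} must fit in 16 bits")
    # The high half counts blocks of 65,536 AS numbers.
    return high * 65536 + low


def asplain_to_asdot(asn: int) -> str:
    """Convert a plain integer AS number back to 'high.low' notation."""
    high, low = divmod(asn, 65536)
    # 16-bit AS numbers are conventionally written without a dot.
    return f"{high}.{low}" if high else str(low)


print(asdot_to_asplain("2.0"))  # 131072
```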

  What is being used here is a single view of the "edge" of the
  network, looking at an eBGP perspective, as distinct from a mixed
  eBGP / iBGP environment. This AS is not an upstream for anyone else,
  so it has no transit role, and does not have a large set of BGP
  peers.

  There is also no iBGP in this setup. While it has been asserted at
  various times that iBGP is a major contributor to BGP scalability
  concerns, the consideration here in trying to place some data
  against this assertion is that there is no "standard" iBGP
  configuration, and each network has its own rather unique
  configuration of Route Reflectors and iBGP peers.  This makes it
  hard to generate a "typical" iBGP load profile, let alone analyse
  the general trends in iBGP update loads. In this study the scope of
  attention is limited to a simple eBGP configuration that is likely to
  be found at a "stub" AS at the edge of the Internet, and the effects
  of iBGP are not included in these measurements.

  The measurement system took a snapshot of the BGP RIB every hour, as
  well as logging all received BGP updates.

IPv4 BGP Table Data

  The following figures show some of the vital statistics for IPv4 in
  BGP over the past 12 months. In and of themselves the graphs are not
  that informative. The graphs show a relatively stable increase in most
  of the routing metrics. The discontinuities in March, April and
  December were caused by the measurement environment adding and
  dropping a BGP peering session with AS1280, rather than any shift in
  the characteristics of the network itself.

   Figure 1 - IPv4 BGP Routing Table Size (RIB)
   Figure 2 - IPv4 BGP Routing Table - More Specific Entries
   Figure 3 - IPv4 AS Count
   Figure 4 - IPv4 Transit AS Count
   Figure 5 - IPv4 Advertised Address Space
   Figure 6 - IPv4 Average AS Path Length

  The summary of the IPv4 BGP network for 2008 is as follows, with the
  final two columns showing the annual growth rates for 2008 and 2005:

                       Jan-08     Dec-08   2008   2005
   Prefix Count       245,000    286,000    17%    18%
     Root Prefixes    118,000    133,000    13%    18%
     More Specifics   127,000    152,000    20%    18%
   Address Span (/8s)  106.39     118.44    11%    10%
   AS Count            27,000     30,200    11%    14%
     Transit AS Count   3,600      4,100    14%    14%
     Stub AS Count     23,400     26,200    11%    14%

  What this table indicates is that for the IPv4 Internet the use of
  aggregates in the routing system has not improved over 2008, nor has
  it become significantly worse. The average size of advertisements is
  getting smaller in terms of address span per routing table entry,
  and the span of originated addresses per AS is getting smaller. The
  average AS path length is constant at around 5 AS hops (which would
  translate to 4 AS hops if the measurement setup overhead was
  removed), while the number of ASs is increasing and the
  interconnection degree of ASs is getting higher. The implication is
  that the granularity of the inter-domain routing system continues to
  get finer and the density of interconnection is getting greater. In
  other words the growth of the Internet is not "growth at the edge"
  and the network is not getting any larger in terms of average AS
  path length. Instead, the growth is happening by increasing the
  density of the network by attaching new networks into the existing
  transit structure and peering at established exchange points. This
  makes for a network whose diameter, measured in AS hops, is
  essentially static, yet whose density, measured in terms of prefix
  count, AS interconnectivity and AS Path diversity, continues to
  increase.

  The growth metrics of the routing system in 2008 are not overly
  different from those of 2005 in terms of the growth of the routing
  table and the span of announced addresses. The growth rate of the
  transit ASs is slightly lower than in 2005, but not significantly
  so.

IPv6 BGP Table Data

  A similar exercise has been undertaken for IPv6 routing data, and
  the comparable figures are shown below.

  
    Figure 7 - IPv6 BGP Routing Table Size (RIB)
    Figure 8 - IPv6 BGP Routing Table - More Specific Entries
    Figure 9 - IPv6 AS Count
    Figure 10 - IPv6 Transit AS Count
    Figure 11 - IPv6 Advertised Address Space
    Figure 12 - IPv6 Average AS Path Length

  The summary of the IPv6 Internet for 2008 is as follows:

  
                        Jan-08    Dec-08  2008  2005
    Prefix Count         1,050     1,600   52%   21%
      Root Prefixes        840     1,300   55%   15%
      More Specifics       210       300   43%   51%
    Address Span        /16.67    /16.65    1%   50%
    AS Count               860     1,230   43%   20%
      Transit AS Count     240       310   29%   21%
      Stub AS Count        620       920   48%   18%
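
  The 2008 growth column can be reproduced from the Jan-08 and Dec-08
  values in the table. The sketch below is a worked check only. It
  assumes that the address span is expressed as an equivalent prefix
  length, so that a shorter length means a larger span; that reading
  is an interpretation of the table, not something stated in the
  data.

```python
# Jan-08 and Dec-08 values copied from the IPv6 summary table.
ipv6_counts = {
    "Prefix Count": (1050, 1600),
    "Root Prefixes": (840, 1300),
    "More Specifics": (210, 300),
    "AS Count": (860, 1230),
    "Transit AS Count": (240, 310),
    "Stub AS Count": (620, 920),
}


def growth_percent(start, end):
    """Relative change from start to end, as a percentage."""
    return 100.0 * (end - start) / start


def span_growth_percent(start_len, end_len):
    """Growth of an address span given as an equivalent prefix length.

    Assumption: span is proportional to 2 ** (-length), so reducing the
    length by d multiplies the span by 2 ** d.
    """
    return 100.0 * (2.0 ** (start_len - end_len) - 1.0)


for name, (jan, dec) in ipv6_counts.items():
    print(f"{name:18s} {growth_percent(jan, dec):5.1f}%")
print(f"{'Address Span':18s} {span_growth_percent(16.67, 16.65):5.1f}%")
```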

  It is harder to make generalizations about the trends in the IPv6
  network over 2008, as the IPv6 network is simply not large enough to
  show any overall trend behaviour as yet. Certainly the rate of pick
  up is higher than the comparable statistics in the IPv4 network, and
  the annual rate of increase is higher than was seen in 2005. This is
  encouraging news if you are looking for positive signs of IPv6
  uptake in the Internet, but in absolute terms the metrics still fall
  far short of the comparable metrics of the IPv4 Internet.

Projecting the BGP Size

  What can this data tell us in terms of projections of the future of
  BGP in terms of BGP table size?

  The technique used here is to take the hourly snapshots of the BGP
  table size and first filter the data to remove some anomalous
  entries related to additional routes visible from AS12054 but not
  globally visible, then generate a daily average table size, and
  then apply a smoothing function across the data, using a 60 day
  value as the parameter to the multi-day smoothing function. This
  has been done using an extended data set that covers the past 60
  months. The result of this function applied to the IPv4 BGP table
  is shown in the following figure.
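
  The averaging and smoothing steps can be sketched as follows. This
  is an illustration under assumptions, not the code used for the
  figures: the exact smoothing function is not specified here, so a
  trailing moving average stands in for it, the anomalous-route
  filter is omitted, and the sample data is synthetic.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def daily_averages(samples):
    """samples: iterable of (datetime, table_size) hourly snapshots.

    Returns a date-sorted list of (date, mean table size for that day).
    """
    buckets = defaultdict(list)
    for stamp, size in samples:
        buckets[stamp.date()].append(size)
    return [(day, sum(v) / len(v)) for day, v in sorted(buckets.items())]


def moving_average(values, window=60):
    """Trailing moving average (an assumed stand-in for the smoother).

    The first window-1 points average over the data available so far.
    """
    smoothed, running = [], 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed


# Synthetic hourly snapshots: 90 days of steady growth (made-up numbers).
start = datetime(2008, 1, 1)
hourly = [(start + timedelta(hours=h), 245_000 + 5 * h) for h in range(24 * 90)]
days = daily_averages(hourly)
smooth = moving_average([size for _, size in days], window=60)
```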

    Figure 13 - Smoothed IPv4 BGP Table Size

  The first order differential of the smoothed data is then taken, as shown by the red line in the following figure.

    Figure 14 - First Order Differential of Smoothed IPv4 BGP Table Size

  The longer term trend of this first order differential is a linear
  function, shown in green in the above figure. A linear first order
  differential (dy/dx = ax+b) implies a fit of a quadratic function to
  the data (y = a/2 x**2 + bx + c).
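
  As a sketch of this step, a least squares line can be fitted to the
  day-to-day differences of the smoothed series and then integrated
  to give the quadratic. The code below is one possible way of doing
  this and makes two assumptions of its own: x is counted in days
  from the first sample, and the constant c is chosen so that the
  curve passes through the first sample.

```python
def least_squares_line(xs, ys):
    """Return (a, b) minimising the sum of (a*x + b - y)**2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def quadratic_from_differential(sizes):
    """Fit dy/dx = a*x + b to the first differences, then integrate.

    Returns a function y(x) = a/2 * x**2 + b*x + c, with x in days.
    """
    diffs = [later - earlier for earlier, later in zip(sizes, sizes[1:])]
    # Each difference approximates the slope at the middle of its interval.
    mids = [i + 0.5 for i in range(len(diffs))]
    a, b = least_squares_line(mids, diffs)
    c = sizes[0]  # assumption: anchor the curve at the first sample
    return lambda x: 0.5 * a * x * x + b * x + c


# Synthetic check: a series that is exactly quadratic is recovered.
series = [0.01 * x * x + 30 * x + 100_000 for x in range(200)]
model = quadratic_from_differential(series)
print(round(model(365)))
```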

  This quadratic function can then be used to create a forward projection of the table size, shown as the blue line in the following figure.

    Figure 15 - Prediction of IPv4 BGP Table Size

  This same predictive exercise was undertaken in January 2006, and
  the following table shows the predictions generated from the current
  data and those generated using the same approach three years earlier:

                  Jan 2009 prediction  Jan 2006 prediction
    Jan 2009      285,000              275,000
    Jan 2010      335,000              322,000
    Jan 2011      388,000              370,000
    Jan 2012      447,000                 *
    Jan 2013      512,000                 *

  There is a relatively good correlation between the numbers predicted
  by a quadratic growth model of BGP table size in 2006 with the data
  in the period 2006 - 2009, and a reasonably good correlation of
  predictions for the next two years.

  With the caveat that this prediction is based on the assumption that
  tomorrow is a lot like today and that the influences that shape
  tomorrow have already shaped today, then it's reasonable to predict
  that the routing table in two years' time, at the start of 2011, will
  contain an additional 100,000 entries, making a total for IPv4 of
  some 388,000 entries.

  However I'm not anywhere near as confident in making predictions
  beyond 2011, and certainly not all that confident in the predictions
  generated by this model for January 2012 and January 2013. The
  problem is that another predictive model, that of the consumption of
  as-yet unallocated IPv4 addresses, predicts the effective exhaustion
  of the unallocated IPv4 number pool in 2011 / 2012. It is not
  possible to use the current models of BGP growth to peer into this
  post-exhaustion IPv4 routing environment, so the numbers given in
  the table above for January 2012 and 2013 are extremely uncertain.

  Perhaps there is another way of looking at this. If one assumes that
  the major objective here is to ensure that the "unit cost" of
  routing continues to decline over time, or at least remain constant,
  what benchmark could be used to compare the BGP prediction against
  in terms of a constant unit cost curve?

  One possible model that could be used as a benchmark of a prediction
  of constant unit cost in terms of this form of routing and packet
  forwarding hardware in packet networks is Moore's Law. Here the
  general assumption is that as long as the growth parameters of the
  routing table sit within the parameters of Moore's law then the
  expectation is that the unit cost of routing and switching hardware
  should not escalate to any appreciable extent. The following figure
  compares the quadratic projection model of the size of the BGP
  default free zone with an exponential model of doubling every two
  years, as used in Moore's Law. As can be seen in the figure below
  there is no real cause for alarm at this stage, and the BGP table
  size appears to fit comfortably within these parameters within the
  current projection model.
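
  The comparison can be illustrated numerically using the projected
  table sizes given earlier. In the sketch below the doubling curve
  is anchored at the January 2009 table size; that anchor point is an
  assumption made for illustration and may differ from the one used
  in the figure.

```python
# Quadratic-model projections (entries) from the Jan 2009 prediction column.
projection = {
    2009: 285_000,
    2010: 335_000,
    2011: 388_000,
    2012: 447_000,
    2013: 512_000,
}


def doubling_curve(base_year, base_size, year, doubling_years=2.0):
    """Size of something that doubles every doubling_years years."""
    return base_size * 2.0 ** ((year - base_year) / doubling_years)


for year, projected in projection.items():
    # Assumption: the benchmark starts from the Jan 2009 table size.
    benchmark = doubling_curve(2009, projection[2009], year)
    print(f"Jan {year}: projected {projected:>9,}  "
          f"benchmark {benchmark:>11,.0f}  ratio {projected / benchmark:.2f}")
```

  With this anchor the projected table size stays below the doubling
  curve in every year after 2009, and the gap widens each year.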

    Figure 16 - Comparison of BGP RIB prediction to Moore's Law Growth

  Of course, if address exhaustion causes a rapid doubling of the
  routing table across 2011, inflating the routing table by early 2012
  to a size of 1 million entries or more, then this would represent a
  somewhat different scenario.  What could potentially drive this
  rapid inflation scenario is some form of IPv4 address redistribution
  function that is focussed solely on the public addressing
  requirements in IPv4 NAT scenarios, probably increasing the
  prevalence of global routes at the /24 level, or potentially at even
  smaller sizes, coupled with a very rapid level of uptake
  across the global IPv4 BGP routing table. Such scenarios are related
  to the levels of speculation concerning the industry reaction to the
  exhaustion of the existing mechanism of IPv4 address distribution,
  and at this point in time the level of speculation about the nature
  of the redistribution function and the pressures placed on the
  routing space in consequence is extremely high indeed.

Measuring BGP Updates

  Whenever this discussion about routing scalability takes place,
  there is a related discussion about what aspect of scaling is being
  discussed. Is it really the size of the routing space that is the
  topic of deep concern, or is it the dynamic properties of the
  routing system? Should we be looking at the average time to reach
  convergence? Or the volume of BGP update messages per unit time?

  Part of this measurement exercise has been to collect every BGP
  update. The figure below shows the number of BGP updates per day, or
  to be more precise, the number of prefix updates per day, over
  2008.

    Figure 17 - Daily BGP Updated Prefix Counts for 2008

  The first view is the number of updated prefixes per day in BGP
  (Figure 17). At this scale, the daily withdrawal rate is relatively
  constant, while the number of updates per day shows a number of
  extreme outliers.

  On investigation, these outliers are attributable to session resets
  in the local measurement setup, where the local BGP system performs
  a reset and is re-fed the complete route set. On some occasions
  there were multiple resets in the day, including one day where the
  BGP table was reloaded 9 times. These local session reset updates
  can be filtered out from the data set, to produce the following view
  of the number of updated prefixes per day in BGP for 2008, and a
  best fit can be applied to the data, using a least squares best fit.

     Figure 18 - Daily BGP Updated Prefix Counts for 2008

  This data shows an average of 89,000 updated prefixes per day, or
  an average prefix update rate at a level of slightly over 1 a
  second. Obviously this has very little relationship to the actual
  update rate that a BGP speaker is likely to see, but it is a useful
  metric in looking at the order of scale of the processing load
  imposed by the flow of BGP updates. This update data can be extended
  back in time using data collected from previous years, again using
  the same techniques of filtering out local BGP reset traffic and
  applying a least squares best fit to the data.
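
  The conversion from a daily count to a per-second rate is simple
  arithmetic, shown here as a short worked check. The function name
  is illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(daily_count):
    """Mean rate per second for a count accumulated over one day."""
    return daily_count / SECONDS_PER_DAY


# 89,000 updated prefixes per day is a little over one per second.
print(f"{per_second(89_000):.2f} prefix updates per second")  # 1.03
```

  This is a mean over the whole day, so it says nothing about how
  bursty the update stream is within the day.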

     Figure 19 - Daily BGP Updated Prefix Counts for 2005 - 2008

  The forward projection of the number of BGP prefix updates is shown
  in the following figure, this time using a linear function derived
  from a least squares best fit to the daily data.

     Figure 20 - Daily BGP Updated Prefix projection

  The update data shows a surprisingly consistent view of BGP updates
  with very slight growth projected in the coming years, based on the
  data from previous years. If there is a looming issue with BGP
  update processing loads in the coming years, the rate of eBGP
  updates that are unrelated to local BGP session resets does not
  appear to be a strong contributor to any such issue.

  A similar story relates to withdrawals. The daily count of
  withdrawals over 2008 is shown below, and the projection into the
  coming years is also shown in the following figures.

    Figure 21 - Daily BGP Prefix Withdrawal Counts for 2008

    Figure 22 - Projected Daily BGP Prefix Withdrawal Counts

  Here a growth trend is visible, to some extent, and while the 2008
  data may suggest a 10% annual growth rate, an 18 month window of the
  data suggests a more conservative view of growth in the number of
  withdrawals at less than 5%, with the number of withdrawals per day
  rising from an average of 8,500 a day to 9,800 in the next four
  years.
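
  As a rough check on these figures, the annual growth implied by a
  rise from 8,500 to 9,800 withdrawals a day over four years can be
  computed in two ways. Whether the projection should be read as
  compound or as straight-line growth is not stated, so the sketch
  below shows both; the function names are illustrative.

```python
def compound_annual_growth(start, end, years):
    """Annual rate r such that start * (1 + r) ** years == end."""
    return (end / start) ** (1.0 / years) - 1.0


def linear_annual_growth(start, end, years):
    """Average yearly increase, expressed as a fraction of start."""
    return (end - start) / years / start


compound = compound_annual_growth(8_500, 9_800, 4)
linear = linear_annual_growth(8_500, 9_800, 4)
print(f"compound: {compound:.1%} per year")  # 3.6%
print(f"linear:   {linear:.1%} per year")    # 3.8%
```

  Either reading gives an annual figure below 5%.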

  Why is this projected growth rate so much smaller than the projections
  for the growth in the BGP table size?

  Surely a more richly connected, larger routing space would generate
  more routing protocol update traffic. Wouldn't there be more
  prefixes? And wouldn't each prefix generate more updates as a result
  of BGP's distance vector algorithm attempting to reach convergence? 
  Wouldn't the interaction between a larger routing space and the
  Minimum Route Advertisement Interval (MRAI) default timer settings
  on commonly deployed routing equipment delay convergence as the
  network itself grew?

  One way of looking at this is to look at the average number of BGP
  updates required to reach a converged, or stable, routing state, and
  the average amount of time taken for routing to reach
  convergence. Here "stability" is defined as no further updates for
  130 seconds or longer. If these suspicions about the behaviour of
  BGP have any substance, then some form of inflation of these two
  metrics should be visible in the 2008 data.

  The following figures show the base data, which is the daily number
  of 'instability events' where a prefix took two or more updates
  before reaching a converged state, and the daily average of the
  number of updates seen before a prefix is considered stable, and the
  average amount of time taken. Here the single update events where a
  prefix moves from a stable state to a new stable state are not
  included in the data set. This data looks at update sequences of
  two or more updates (including withdrawals), using this 130 second
  definition of 'stability'.
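
  The grouping of updates into convergence sequences can be sketched
  as follows. This is a minimal illustration rather than the analysis
  code behind the figures. It assumes each update is reduced to a
  (timestamp in seconds, prefix) pair, with announcements and
  withdrawals treated alike, and it measures the duration of a
  sequence from its first update to its last, which is one possible
  reading of "time taken to reach convergence".

```python
from collections import defaultdict

QUIET_SECONDS = 130  # the stability threshold used in the text


def convergence_sequences(updates, quiet=QUIET_SECONDS):
    """Group (timestamp_seconds, prefix) updates into per-prefix sequences.

    A sequence ends once the prefix sees no further update for `quiet`
    seconds or longer. Returns (prefix, update_count, duration_seconds)
    tuples, keeping only sequences of two or more updates.
    """
    by_prefix = defaultdict(list)
    for stamp, prefix in updates:
        by_prefix[prefix].append(stamp)

    sequences = []
    for prefix, stamps in by_prefix.items():
        stamps.sort()
        first = last = stamps[0]
        count = 1
        for stamp in stamps[1:]:
            if stamp - last >= quiet:
                # The quiet period closes the current sequence.
                if count >= 2:
                    sequences.append((prefix, count, last - first))
                first, count = stamp, 0
            last = stamp
            count += 1
        if count >= 2:
            sequences.append((prefix, count, last - first))
    return sequences


# Made-up example using documentation prefixes.
sample = [
    (0, "192.0.2.0/24"), (25, "192.0.2.0/24"), (60, "192.0.2.0/24"),
    (500, "192.0.2.0/24"),     # isolated single update, not counted
    (10, "198.51.100.0/24"),   # isolated single update, not counted
]
events = convergence_sequences(sample)
```

  The daily averages of update count and elapsed time are then the
  means of the second and third fields over each day's sequences.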

  The number of these 'instability' events appears to be relatively
  constant on a daily basis. In other words the network as a whole
  appears to be no more or less unstable at the end of 2008 than it was
  at the start of 2008, with around 23,000 to 35,000 such events per
  day.
  
     Figure 23 - Number of discrete BGP Update sequences per day

     Figure 24 - Daily Average of BGP Updates to reach convergence

     Figure 25 - Daily Average of elapsed seconds to reach convergence

  There is a marked anomaly in the data on 1 April, and the network
  convergence times were significantly improved after that date. On
  that date the BGP peering with AS1280 was shut down. It was restored
  a month later, and shut down again in mid December 2008. It appears
  that this peering session was adding some additional instability
  into the measurement environment in the first four months of the
  year.

  The next three figures show a least squares best fit across the
  data. In the case of the number of instability events this is drawn
  across the full year, and in the case of the number of updates and
  the average convergence time it is drawn across the period from
  April to December.

     Figure 26 - Trend of number of BGP convergence sequences

     Figure 27 - Daily Average of BGP Updates to reach convergence - fit to linear model

     Figure 28 - Daily Average of elapsed seconds to reach convergence - fit to linear model

  This shows an increase of 6 seconds, from 66 seconds to 72 seconds
  over a 9 month period for this convergence metric, and a comparable
  increase in the average update count from 2.46 to 2.59 updates.

  I suspect that the underlying relationship here is between routing
  convergence and average AS Path length. As long as the AS Path
  length remains constant, the dynamic behaviour of BGP in propagating
  information remains bounded in some sense.

    Figure 29 - Average AS Path Length, as seen by each Route Views
    BGP Peer

  As shown in the figure above, the Internet has been remarkably
  steady in terms of the average AS Path Length metric for more than
  10 years. Over the same period the number of routing entries has
  grown from 50,000 to 300,000 entries, yet the "diameter" of the
  Internet has remained relatively constant, and all the routing
  domain growth has increased its "density" of interconnection rather
  than its radial length. In such an environment BGP has been able to
  scale very effectively, as the limits to the amount of update
  traffic required for BGP to reach convergence appear to be more
  strongly related to the "radial length" of the Internet, in terms of
  AS hop count, than it is related to the "density" of the Internet,
  in terms of AS interconnectivity metrics.

  As long as this network characteristic is preserved, it appears that
  BGP can continue to function very effectively.

BGP in 2008

  I'm not sure I could say that BGP is on a sure path to perdition,
  based on the data collected in 2008 relating to the growth in the
  routing system and the dynamic behaviour of BGP. None of the metrics
  indicate that we are seeing such an explosive level of growth in the
  routing system that it will fundamentally alter the viability of
  carrying a complete eBGP routing table in the near future, nor do
  the characteristics of convergence behaviour show any sign of the
  Internet entering into a phase of uncontrollable route instability.

  At least for 2008 the BGP sky did not fall on our heads, and the
  signs for 2009 are looking good, so far!

  
  Disclaimer

    The above views do not necessarily represent the views of the Asia
    Pacific Network Information Centre.

   
  About the Author

    GEOFF HUSTON holds a B.Sc. and a M.Sc. from the Australian
    National University. He has been closely involved with the
    development of the Internet for many years, particularly within
    Australia, where he was responsible for the initial build of the
    Internet within the Australian academic and research sector. He is
    author of a number of Internet-related books, and is currently the
    Chief Scientist at APNIC, the Regional Internet Registry serving
    the Asia Pacific region. He was a member of the Internet
    Architecture Board from 1999 until 2005, and served on the Board
    of the Internet Society from 1992 until 2001.  

    www.potaroo.net