The ISP Column
A column on things Internet
|
|
|
Why is Securing BGP just so Damn Hard?
August 2019
Stories of BGP routing mishaps span the entire thirty-year period that we’ve been using BGP to glue the Internet together. We’ve experienced all kinds of route leaks from a few routes to a few thousand or more. We’ve seen route hijacks that pass by essentially unnoticed, and we’ve seen others that get quoted for the ensuing decade or longer! There are ghost routes and gratuitous withdrawals. From time to time we see efforts to craft BGP packets of death and efforts to disrupt BGP sessions through the injection of spoofed TCP resets. After some 30 years of running BGP it would be good to believe that we’ve learned from this rich set of accumulated experience, and we now understand how to manage the operation of BGP to keep it secure, stable and accurate. But no. That's is not where we are today. Why is the task to secure this protocol just so hard?
Are we missing the silver bullet that would magically solve all these BGP issues? If we looked harder, if we spent more money on research and tried new approaches, then would we find the solution to our problems? I doubt it. It’s often the case that those problems that remain unsolved for such a long time are unsolved because that are extremely hard problems and they may not even have a solution. I suspect securing BGP falls into this “extremely hard problem” category. Let’s look at this in a bit more detail to explain why I’m so pessimistic about the prospects for securing BGP.
However, perhaps we might start with a more general question: Why are some Internet issues so challenging to solve, while others seem to be effortless and appear to solve themselves? For example, why was the IPv4 Internet an unintended runaway success in the 90’s, yet IPv6 has been a protracted exercise in industry-wide indecision?
Some technologies have enjoyed success from the outset in the Internet. IPv4, of course, would be clearly placed in ther runaway success category, but perversely enough IPv6 would not. NATs have been outstanding successful, and the TCP transport protocol is still with us and it still drives the Internet. The DNS is still largely unchanged after some 30 years. More recently, content distribution systems and streaming protocols have been extremely successful, and most of today’s public Internet service could be characterized as a gigantic video content streaming network.
Why did these technologies succeed? Every case is different, of course, but there are some common success factors in all these technologies.
These success factors relate to success in a diverse, widely distributed and loosely coupled environment.
But the Internet has left behind a trail of failures every bit as voluminous, if not more so, than its history of successes. For example, spam in the email space is a massive failure for the Internet, as is our vulnerability to many forms of DDOS attacks. In a similar vein, after more than 20 years of exhortations to network operators, I think we can call spoofed source address filtering (or BCP 38) a failure. It’s very sensible advice and every network operator should do it. But they don’t. Which makes it a failure.
Secure end systems and secure networks are both failures, and the Internet of Trash looks like amplifying these systemic failures by many orders of magnitude by introduc. The broader topic of securing our transactions across the Internet also has its elements of failure, particularly in the failure of the public key certification framework to achieve comprehensive robustness. IPv6 adoption is not exactly a runaway success so far. The prospects of the Internet of Things amplifying our common vulnerability to poorly crafted, poorly secured and un-maintained endpoints should create a chilling prospect of truly massive cascading failure.
Again, there appear to be common factors for failure which are the opposite of the previous attributes. These include technologies where there is dependence on orchestration across the entire Internet, and technologies that require universal or near universal adoption. The case where there are common benefits but not necessarily individual benefits, and where there is no clear early adopter advantage lies behind the issues relating to the protracted transition to an IPv6-only Internet.
What makes a technical problem hard in this context?
So now let’s look at BGP routing security in this light. After 30 years why are we still talking about securing BGP?
Here is my top ten of the reasons why securing BGP represents such a challenging problem for us.
The task of trying to build a secure BGP system is a bit like trying to stop houses from burning. We could try to enforce behaviours of both the building industry, of our furniture and fittings and of our own behaviours that make it impossible for a house to catch fire. Or we could have a fire brigade to put out the fire as quickly as possible. For many years, we’ve opted for the latter option as being an acceptable compromise between cost and safety.
There are parallel here with BGP security. It would be an ideal situation where it would be impossible to lie in BGP. Where any attempt to synthesis BGP information could be readily identified and discarded as being bogus. But this is a very high bar to meet, and some thirty years of effort are showing just how hard this task really is.
It’s hard because no one is in charge. It’s hard because we can’t audit BGP, as we have no standard reference data set to compare it with. It’s hard because we can’t arbitrate between conflicting BGP information, because there is no standard reference point. It’s hard because there are no credentials that allow a BGP update to be compared against the original route injection, because BGP is a hop- by-hop protocol. And it’s hard because BGP is the aggregate outcome of a multiplicity of opaque local decisions.
There is also the problem that it is just too easy to be bad in BGP. Accidental misconfiguration in BGP appears to be a consistent problem, and it’s impossible to determine the difference between a mishap and a deliberate attempt to inject false information into the routing system.
We’ve become accustomed to ignoring an inter-domain routing system that can be easily compromised, as acknowledging the issue and attempting to fix it is just too hard. But maybe this passive acquiescence to BGP abuse is in fact a poor response in a broader context. If the only response that we can muster is hoping that individually our routes will not be hijacked, then we are obviously failing here.
What are the consequences of routing mishaps and malfeasance? If this is an ever-present threat, then how have we coped with it in today’s Internet?
There are three major risk factors in route hijacks: disruption, inspection and interception.
Disruption involves injecting a false route that makes the intended destination unreachable or injecting a withdrawal that also generates a similar outcome. It could be that the radius of disruption is highly localised, or it could be intended to be Internet-wide. In either case the result is that communications are disrupted, and the service is rendered unavailable.
Inspection involves an exercise of redirecting the traffic flow to a destination to pass through a network that performs traffic inspection in some manner. Depending on the form of transport level encryption that is being performed such forms of traffic inspection can be of limited value, but even the knowledge of communicating pairs as endpoints can in and of itself be a valuable source of information to the eavesdropper. Such inspection is not necessarily detectable by the endpoints, given that the packets are not altered in any manner, such their route through the network.
Interception is perhaps the more insidious threat. The threat involves the same technique of redirection of a traffic flow to a point where the traffic is intercepted and altered. Prior to the widespread use of end-to-end transport security, it could be argued that this was a thoroughly pernicious form of attack, where user credentials could be stolen, and the integrity of network transactions could be compromised. It has been argued that the widespread use of TLS negates much of this threat from interception. An interceptor would need to have knowledge of the private key of the site being attacked in order to break into a TLS handshake and inject themselves into the session in a seamless manner. But perhaps this is too glib a dismissal of this threat. Firstly, as has been seen in a number of recent attacks, many users are too quick to dismiss a certificate warning and persist when the wiser course of action would be to refrain from proceeding with the connection. Secondly, as also has been seen numerous times, not all trusted CAs are worthy of the implicit trust we all place in them. If a trusted CA can be coerced into issuing a false certificate where the private key is known to the interceptor, then the interception attack is effective even where the session is supposedly ‘protected’ by TLS.
Let’s put this together in a hypothetical attack scenario. Let’s say you find an online trusted CA that uses a DNS query as a proof-of-possession of a DNS name. This is then the criteria used by the CA to issue a domain name certificate. Let’s find a target domain name that is not DNSSEC-signed. This is of course a not uncommon criteria given the relative paucity of DNSSEC-signing in today’s DNS. A fake certificate can be generated by using a routing interception attack on the name servers of the target domain name and providing a crated response for the CA’s DNS challenge. The attacker now has a fake Certificate for the target name. Now the CA will enter this fake certificate into the certificate transparency logs, but the attacker still has enough time to launch the second part of the attack, which is an interception attack using this fake, but still trusted, certificate to intercept TLS sessions directed to the target name server.
BGP security is a very tough problem. The combination of the loosely coupled decentralized nature of the Internet and a hop-by-hop routing protocol that has limited hooks on which to hang credentials relating to the veracity of the routing information being circulated unite to form a space that resists most conventional forms of security.
It’s a problem that has its consequences, in that all forms of Internet services can be disrupted, and users and their applications can be deceived in various ways where they are totally oblivious of the deception.
It would be tempting to throw up our hands and observe that as we’ve been unable to come up with an effective response in thirty years we perhaps should just give up with the effort and concede that we just have to continue to live with a vulnerable and abused routing system.
But I’m unwilling to make that concession. Yes, this is a hard and longstanding problem, but it’s a very important problem. We will probably spend far more time and effort in trying to prop up the applications and services environment when the underlying routing infrastructure is assumed to always be unreliable and prone to various forms of abuse.
I’ll look at what we have done so far in this space and try and place current efforts into this broader context in a followup article.
The above views do not necessarily represent the views of the Asia Pacific Network Information Centre.
GEOFF HUSTON B.Sc., M.Sc., is the Chief Scientist at APNIC, the Regional Internet Registry serving the Asia Pacific region.