The ISP Column
An occasional column on things Internet
________________________________________

Addressing and the Future Internet
February 2007
Geoff Huston

The National Science Foundation of the United States and the Organisation for Economic Co-operation and Development held a joint workshop on January 31 to consider the social and economic factors shaping the future of the Internet. I was asked to prepare a paper addressing three questions concerning addressing and the Internet, which I would like to share with you in this month’s column.

Question 1: Addressing (as reflected in routing protocols, etc) is a fundamental part of the Internet. How can addressing be improved so as to improve efficiency of inter-networking and so as to face the challenges of addressing new devices and objects?

Addresses in the context of the Internet’s architecture fulfill a number of roles. Addresses uniquely identify network "endpoints", providing a means of identifying the parties to a communication transaction ("who" you are, in a sense). As well as this end-to-end identification role, addresses are also used by the network itself to undertake the transfer of data, where the address is used as a means of identifying the location of the identified endpoint relative to the topology of the network ("where" you are). Addresses are also used within the switching elements of the network as a lookup key to perform a switching decision ("how" a packet is directed to you through the network). In other words, addresses in the IP architecture simultaneously undertake the combination of "who", "where" and "how" roles in the network’s architecture.

Addresses have a number of properties that are all essential in the context of the integrity of the network:

Uniqueness: Addresses should be uniquely deployed (considerations of anycast deployments notwithstanding).
Uniqueness is not an intrinsic property of an addressing scheme per se, but is a derived property of the associated distribution and deployment framework. An addressing scheme in this context should preserve this attribute.

Consistency: Addresses should be drawn from a consistent identifier space. Using multiple identifier spaces creates the potential for misinterpretation of the address.

Persistence: The address value should remain constant, and gratuitous changes in the mapping from the identifier to the referenced object should be avoided. Constantly changing address-derived identities are, at the very least, very difficult to track. How long addresses should remain persistent is something that has changed over the lifetime of the Internet. While the initial concept was that addresses should be highly persistent, current semantics indicate that addresses should remain persistent for at least the duration of a communication session.

Trust: Can an address object withstand a challenge as to the validity of the address? Other parties who would like to use this address in the context of an identified endpoint would like to be reassured that they are not being deceived. 'Use' in this context is a generic term that includes actions such as resolution to the object identified by the address value, storage of the address for subsequent 'use', and referral, where the address token is passed to another party for their 'use'.

Robustness: The deployed address infrastructure should be capable of withstanding deliberate or unintentional attempts to corrupt it in various ways. A robust address system should be able to withstand third party efforts to subvert the integrity of the address framework as a means of undertaking identity theft or fraud.

The issues, or perhaps shortfalls, with the Internet’s addressing architecture start with the collection of basic roles that are undertaken by a single IP address.
This combination of "who", "where" and "how" makes for highly efficient network functions that are essential in any high speed connectionless packetized data communications system, but at the same time this semantic overload of the address in assuming the roles of "who", "where" and "how" is also the cause of considerable added complexity in today’s Internet:

- Mobility remains a significant challenge in this environment, where the major attribute of any form of mobility is to preserve the notion of endpoint identity of the mobile endpoint, while allowing the network location (and related network switching decisions) to change, reflecting the changing relative location of the mobile endpoint within the network. If the endpoint location changes, then so do the "where" and "how" components of its "address". But how do you keep active sessions open, or establish new sessions with this endpoint? How can you keep the "who" component of an address constant, while at the same time changing the "where" and the "how" components?

- The granularity of the addressing system represents an uncomfortable compromise. An IP address is intended to identify a device’s network interface, as distinct from the device itself or the device’s user. A device with multiple active interfaces has multiple IP addresses, and while it is obvious to the device itself that it has multiple identities, no one else can tell that the multiple identities are in fact pseudonyms, and that the multiple addresses simply reflect the potential for multiple paths to reach the same endpoint. In terms of the identity ("who") role of an address, the protocol stack within the endpoint should remain a constant, while allowing for multiple "where" locations that reflect the connectivity of each of the device’s connected interfaces.
- Also, the address does not identify a particular path, or set of paths through a network, or possibly even a sequence of forwarding landmarks, but simply the desired endpoint for the packet’s delivery. This has implications in terms of application performance and robustness, and also has quite fundamental implications in terms of the design of the associated routing system.

The Internet’s address architecture represents a collection of design decisions, or trade-offs, between various forms of apparently conflicting requirements. For example, with respect to the routing system, the desire for extremely high speed and low cost switching implementations has been expressed as a strong preference for fixed size and relatively compact address formats. With respect to the role of addresses as identity tokens, the desire for low cost deployment and a high level of address permanence implies a strong desire for long term stable address deployments in production networks, which, in turn, is expressed as a strong desire for low levels of address utilization efficiency in deployed systems, which for large systems implies extended address formats, potentially of variable length.

With respect to the IP architecture, these trade-offs in addressing design are now relatively long-standing aspects of the address, representing decisions that were made some time ago in an entirely different context to that of today’s Internet. Are these design decisions still relevant today, or are there other potential ways of undertaking these design trade-offs that would represent a more effective outcome? Indeed, if we look at future forms of network evolution, are these aspects of an address invariant, or should we contemplate other address structures that represent different trade-offs in design?

The changing nature of an "address"

A significant issue with addressing is the address "span".
While the 32 bits of the IPv4 address space represents a considerable span, encompassing some 4.3 billion unique addresses, there is an inevitable level of wastage in deployment, and a completely exhausted 32 bit address space may encompass at best some 200 to 400 million uniquely addressed IP devices. Given that the population of deployed IP devices already exceeds this number by a considerable margin, and when looking forward to a world of potentially billions of embedded IP devices in all kinds of industrial and consumer applications, this 32 bit address space is simply inadequate.

In response, we've seen the deployment of a number of technologies that deliberately set out to break any strong binding of IP address with persistent endpoint identity, and treat the IP address purely as a convenient routing and forwarding token without any of the other attributes of identity, including long term persistence. The Dynamic Host Configuration Protocol (DHCP) is a commonly used method of extending a fixed pool of IP addresses over a domain where not every device is connected to the network at any one time, or where devices enter and leave a local network over time and need addresses only for the period in which they are within the local network's domain. This has been used in LANs, ADSL, WiFi service networks and a wide variety of applications. In this form of identity, the association of the device to a particular IP address is temporary, and hence there is some weakening of the identity concept, and the dynamically-assigned IP address is being used primarily for routing and forwarding.

This approach of dynamic addressing was taken a further step with the use of Network Address Translation (NAT) approaches, where an "edge" network gateway device has a pool of public addresses to use, and maps the privately used address of a device that is on the "inside" of the gateway to one of its public addresses when the private device initiates a session with a remote public device.
The private-side device has no idea of the address that the NAT edge will use for a session, nor does the corresponding public-side device know that it is using a temporary identity association to address the private device. This approach has been further refined with the NAT Port Address translators that also use the port address field in the TCP and UDP packet headers to achieve an even higher level of effective address compression.

NATs, particularly port translating NATs, are very effective in a client-server network environment, where clients lie on the "internal" side of a NAT and all the well known servers lie on the "external" side. But in an environment of peer-to-peer applications, including VOIP, using addresses in this way raises a number of challenging questions. Each unique session is mapped to a unique port and IP address, and sessions from multiple private sources may share a common "public" IP address, but differentiate themselves by having the NAT-PT unit assign port addresses such that the extended IP + port address is unique. How do you know if you are talking directly to a remote device, or talking through a NAT filter, or multiple NAT filters, or NAT-PT filters? And if you are talking through a NAT, how do you know if you are on the 'outside' or the 'inside'? What’s your "address" if you want others to be able to initiate a session with you and you are on the "inside" of a NAT? What if you have cascading NATs?

These changes to the original semantics of an IP address are uncomfortable changes to the concept of identity in IP, particularly in the area of NAT deployment. Their widespread adoption continues to underline that an address used as an identity token lacks persistence, and that the various forms of aliasing and dynamic translation weaken its utility as an identity system.
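The port-translating behavior described above can be sketched as a small lookup table. This is a toy model under stated assumptions, not a description of any real NAT: the class name `PortTranslatingNat`, the starting port 40000, and the documentation-range addresses are all hypothetical, and real devices typically key their bindings on more than the private source endpoint and also expire them after a period of inactivity.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass
class PortTranslatingNat:
    """Toy model of a port-translating NAT that owns one public address."""

    public_ip: str
    next_port: int = 40000  # hypothetical start of the public port pool
    outbound: Dict[Endpoint, int] = field(default_factory=dict)
    inbound: Dict[int, Endpoint] = field(default_factory=dict)

    def translate_outbound(self, private_src: Endpoint) -> Endpoint:
        """Map a private (IP, port) to a unique public (IP, port)."""
        if private_src not in self.outbound:
            port = self.next_port
            self.next_port += 1
            self.outbound[private_src] = port
            self.inbound[port] = private_src
        return (self.public_ip, self.outbound[private_src])

    def translate_inbound(self, public_port: int) -> Optional[Endpoint]:
        """Return the private endpoint bound to a public port, or None
        when no inside device has opened a session on that port."""
        return self.inbound.get(public_port)


nat = PortTranslatingNat(public_ip="203.0.113.1")

# Two inside hosts use the same source port but get distinct public ports.
a = nat.translate_outbound(("192.168.0.10", 5000))
b = nat.translate_outbound(("192.168.0.11", 5000))
print(a, b)

# A reply to an existing binding can be delivered to the inside host.
print(nat.translate_inbound(a[1]))

# An unsolicited packet has no binding, so the NAT cannot tell who it is for.
print(nat.translate_inbound(50000))
```

The last call illustrates the peer-to-peer problem raised above: until an inside device initiates a session, it has no externally usable "address" at all.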
Increasingly an IP address, in the world of IPv4, is being seen as a locality token with a very weak association with some form of identity. Of course that doesn't stop undue assumptions being made about the uniform equivalence of identity and IP address, however specious it may be in particular situations. The various forms of IP filter lists, whether abuse black lists or security permission lists, are all evidence of this contradictory behavior of assuming that persistent identity and IP address are uniformly equivalent.

Version 6 of IP is an attempt to restructure the address field using a larger span, and the 128 bits of address space represent a very large space in which to attempt to place structure. However, in and of itself, IPv6 still has not been able to make any significant changes to the address role within the Internet architecture. IPv6 addresses still contain the same overloaded semantics of "who", "where" and "how", and IPv6 also admits the same capability of dynamic address assignment. We are even witnessing the use of IPv6 NATs, so whatever benefits IPv6 may represent, in and of itself it still does not represent any major shift in the role of an address in the IP architecture.

How could we change "addresses"?

If we want to consider changes to the address semantics in a future Internet’s architecture then it appears that simply increasing the span of the address value range presents a weak value proposition in terms of remedies to the shortfalls of the overloaded semantics of an IP address. None of the deeper and more persistent issues relating to the overloaded address semantics are reduced through this measure, and the issues relating to the scalability of routing, mobility, application level complexity, and robustness persist.
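To put numbers on the address "span" discussed above, the following short calculation compares the 32 bit and 128 bit spaces and works out what fraction of the IPv4 space is in productive use if, as estimated earlier in this column, a fully exhausted space serves some 200 to 400 million devices. The helper name `utilization` is an illustrative assumption.

```python
IPV4_BITS = 32
IPV6_BITS = 128


def span(bits: int) -> int:
    """Number of distinct address values in a field of the given width."""
    return 2 ** bits


def utilization(devices: int, bits: int = IPV4_BITS) -> float:
    """Fraction of the address span occupied by uniquely addressed devices."""
    return devices / span(bits)


print(f"IPv4 span: {span(IPV4_BITS):,} addresses")
print(f"IPv6 span: {span(IPV6_BITS):,} addresses")

# The column's estimate of devices served by a fully exhausted IPv4 space.
for devices in (200_000_000, 400_000_000):
    print(f"{devices:,} devices use {utilization(devices):.1%} of the IPv4 span")
```

The arithmetic shows how large the wastage in deployment is, and also why a larger span alone changes the scale of the address space without changing what an address means.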
An area of investigation that presents greater levels of potential may lie in cleaving the concept of an address into distinct realms, and minimally that structural separation should reflect a distinction between endpoint identity and network location. Such an approach could embrace a relatively unstructured identity space, whose major attribute would be persistent uniqueness, and where the identity value of an object, or part thereof, could be embedded at the time of manufacture. It would also allow the deployment of a structured location space that had the capability to describe the topology of the network in a manner that was able to guide efficient local switching decisions. The challenge here is not necessarily in devising the characteristics of these identity spaces, but is more likely to be in the definition of mapping capabilities between the two distinct identification realms. In other words, how to map, in a highly efficient and robust manner, from an identity value to a current or useable location value, and, potentially, how to perform a reverse mapping from a location to the identity of the object that is located at that position in the network. There is a considerable range of design choices that are exposed when the address-based binding of identity with location is removed.

The most salient observation here is that if we want to consider some form of “improvement” to the current role of addresses in the Internet’s architecture, then there is little, if any, practical leverage to be obtained by simply increasing the size of the address field within the protocol’s data structures, or altering the internal structure of the address, or even altering the address distribution framework. Such measures are essentially meaningless in terms of making any significant impact on the semantics of the address, or on its flexibility of utility within the IP architecture.
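The forward and reverse mappings between an identity realm and a location realm described above can be illustrated with a minimal in-memory sketch. Everything here is hypothetical: the `MappingService` class, its method names, the "node-a" identity label, and the documentation-range IPv6 locators are assumptions made for illustration, and the sketch deliberately ignores the hard parts of the problem (distribution, binding security, and update latency).

```python
from typing import Dict, Optional, Set


class MappingService:
    """Hypothetical service binding persistent identities to locators."""

    def __init__(self) -> None:
        self._locators: Dict[str, Set[str]] = {}  # identity -> current locators
        self._identity_at: Dict[str, str] = {}    # locator -> identity

    def attach(self, identity: str, locator: str) -> None:
        """Record that an identity is now reachable at a locator."""
        previous = self._identity_at.get(locator)
        if previous is not None and previous != identity:
            # A locator refers to one identity at a time.
            self._locators[previous].discard(locator)
        self._locators.setdefault(identity, set()).add(locator)
        self._identity_at[locator] = identity

    def detach(self, identity: str, locator: str) -> None:
        """Remove a binding, for example when a mobile endpoint moves on."""
        if self._identity_at.get(locator) == identity:
            del self._identity_at[locator]
            self._locators[identity].discard(locator)

    def locate(self, identity: str) -> Set[str]:
        """Forward mapping: identity ("who") to locations ("where")."""
        return set(self._locators.get(identity, set()))

    def identify(self, locator: str) -> Optional[str]:
        """Reverse mapping: location back to the identity found there."""
        return self._identity_at.get(locator)


service = MappingService()

# A multi-homed endpoint: one identity, two simultaneous locations.
service.attach("node-a", "2001:db8:1::10")
service.attach("node-a", "2001:db8:2::10")
print(sorted(service.locate("node-a")))

# Mobility: one location disappears, the identity is unchanged.
service.detach("node-a", "2001:db8:1::10")
print(service.locate("node-a"), service.identify("2001:db8:2::10"))
```

In this model the multi-homing and mobility cases from earlier in the column become updates to the mapping rather than changes to the endpoint's identity, which is the flexibility that the separation is intended to buy.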
If we want to create additional degrees of flexibility within the architecture of the network, then it would appear that we need to decouple aspects of current address semantics, and in so doing we need to revisit the fundamental concepts of the Internet’s architecture. If we want identity, location and network path determination to be expressed in a manner that is not fate-shared, then we also need to bring into play additional concepts of dynamic mapping, binding security and integrity, and various forms of rendezvous mechanisms.

As the original question asserts, addressing is a fundamental part of the Internet. If we want to contemplate substantive changes to the address model we are in effect contemplating substantive changes to the architecture of the resultant network, as compared to today’s Internet. Perhaps this is indeed a potentially more productive area of activity than the approach taken by IPv6, where the changes have been relatively minor and the impetus for adoption by industry has, to date, proved to be insufficient when the incremental costs are weighed against the perceived incremental benefits.

Question 2: In designing new protocols, what lessons can be learned from the slow deployment of IPv6 to date?

There are significant differences between devising an experiment that investigates various models of communications paradigms and undertaking a major revision to a mainstream communications protocol. The major reasons for the slow deployment of IPv6 today lie in both economic and public policy considerations as much as they lie in considerations of the underlying technology.

The Internet’s major positive attribute was not derived from any particular aspect of its underlying architecture or characteristic of its protocols. Indeed, the Internet was in many ways essentially dormant through the 1980s, and, in terms of its architecture and protocol technology, the Internet has not changed in any fundamental sense for some decades.
It remains a connectionless, hop-by-hop forwarding, destination-addressed unreliable datagram delivery system with end-to-end control loop overlays to provide additional services related to resiliency, session management and performance characteristics.

The major economic and social factors of the late 1980s and early 1990s, when the Internet was expanding rapidly, included the shift away from a highly regulated data market to a regulatory framework that allowed, and in some cases even encouraged, the proliferation of private data networks that went well beyond closed user groups based on tightly constrained bounds of common use. The prevailing regulatory regime allowed all forms of resale and service provision in a highly competitive market for data services, and the economic environment was one of considerable interest in technology and communications. This was coupled with the shift in the computing market from large scale mainframe systems to the computer as an item of consumer electronics, and a change in the nature of the information industry workforce into one that relied on intense use of IT solutions and associated networks. The attribute that the Internet brought to this emerging market was an unprecedented level of flexibility and efficiency that allowed almost any combination of underlying communications media and all forms of end devices to be amalgamated into a single cohesive IP network.

The technical factors that led to the rapid deployment of IPv4 included IPv4’s flexibility and its ability to bridge across multiple underlying network media in a flexible and cost-efficient way. The economic and public policy factors included IPv4’s considerably lower unit cost, due to its high carriage efficiency and its foundation in open standards with open reference implementations.
The policy framework of deregulating the data services market and allowing various forms of resale and competition encouraged new investors who were inclined to use innovative products and services as part of their market positioning.

None of these factors are driving IPv6 deployment. IPv6 is no different to IPv4 in terms of its deployment capabilities, carriage efficiencies, security properties, or service capabilities. There is no change in the public policy regime with respect to IPv6, and no significant innovative difference in IPv6 that would provide a competitive edge to innovators in the market. An additional consideration is that IP services are now marketed in a highly contested price-sensitive market, and the revenue margins for most forms of mass-market IP services are very low. To date, the service industry has evidently been unable to absorb the incremental costs associated with a dual-stack deployment of IPv6 without an associated incremental revenue stream.

The basis of this observation is that the significant impediment to IPv6 deployment is not the availability of network equipment, nor the capability of end systems to support IPv6, nor the capability to roll out IPv6 support in most service providers’ IP networks. The impediment to IPv6 deployment appears to be the lack of a well-grounded business case to actually do so.

The expectation with IPv6 was that the increasing scarcity of IPv4 addresses would drive service providers and their customer base to IPv6 deployment. What does not appear to be factored into this expectation is that Network Address Translators (NATs) produce a similar outcome in terms of virtually extending the IPv4 address space, and, additionally, are an externalized cost to the service provider. Service providers do not have to fund NAT deployment. For the consumer, a NAT embedded in the edge device is a zero cost solution.
The marginal incremental cost of NAT functionality in the edge device is effectively zero for the consumer. So, in general, neither the consumer nor the service provider sees a higher incremental cost in the use of NATs. Even at the application level the incremental cost of NATs is not uniformly visible. For traditional client-server based applications there is no incremental cost of NATs. Even various forms of peer-to-peer applications operate through NATs. It appears that the only applications that have significant issues with NATs are VOIP applications, where the major issue is not the presence of NATs per se, but the fact that NATs have never been standardized and different NATs can behave differently. Currently the path of least resistance for the industry appears to be that of standardizing NATs, over the option of a near term migration of the entire Internet to IPv6.

It is not enough to consult with industry players as to their current perceptions of future technology needs, as was the case in the design of IPv6. It is also necessary to understand how needs are actually expressed within the economics of the industry. If a technology is to be taken up by an industry, then the factors that lead to take up are variable, and are not wholly concentrated on aspects of superior performance or lower cost of deployment and operation as incremental improvements over the current situation. The factors also include the capabilities of incremental deployment, the alteration in the models of externalities, and the nature of the deployment cost, as well as the revenue model.

Question 3: What will a new Internet with different architecture and protocols mean to IPv6 deployment?

This is a topic that lies well into the area of speculation. The one salient observation is that infrastructure investment is a long term investment and such investments accrete strong resistance to further change.
It is unlikely that the development of further communications technologies, whether it is called a "new internet" or otherwise, would have any particular impact on the prospects of IPv6 deployment, positive or negative, assuming that the incremental benefits of this "new" technology were relatively marginal in nature.

Any viable "new" communications technology in the context of changes to the existing Internet architectural model would once again have to demonstrate further gains in efficiency of at least one order of magnitude, or potentially two or three orders of magnitude, over those achieved by existing Internet networks, and make substantive gains in the areas of support for mobility, configurability, security and performance in order to represent a serious increment in the value proposition that would induce industry deployment. Of course if a new technology were capable of offering such significant improvement in benefits, then there would be little sense in further deployment of either IPv4 or IPv6 technology.

So the most appropriate response to the question is that "it depends". If such a "new" Internet with a different architecture and different protocol were in a position to offer clearly demonstrable significant improvements in the cost and benefit proposition, then the impetus for deployment would be relatively assured. On the other hand, if the case for improvements in the cost and benefit were more marginal in nature, then the case for deployment would be regarded as highly dubious.

Disclaimer

The above views do not necessarily represent the views of the Asia Pacific Network Information Centre, nor those of the Internet Society.

About the Author

GEOFF HUSTON holds a B.Sc. and a M.Sc. from the Australian National University. He has been closely involved with the development of the Internet for many years, particularly within Australia, where he was responsible for the initial build of the Internet within the Australian academic and research sector.
He is author of a number of Internet-related books, and is currently the Chief Scientist at APNIC, the Regional Internet Registry serving the Asia Pacific region. He was a member of the Internet Architecture Board from 1999 until 2005, and served on the Board of the Internet Society from 1992 until 2001.

www.potaroo.net