Network Working Group                                        P. Tsuchiya
INTERNET-DRAFT                                                  Bellcore
                                                           February 1993

                       Pip Near-term Architecture

Status of this Memo

This document is an Internet Draft. Internet Drafts are working documents of the Internet Engineering Task Force (IETF), its Areas, and its Working Groups. (Note that other groups may also distribute working documents as Internet Drafts.) Internet Drafts are draft documents valid for a maximum of six months. Internet Drafts may be updated, replaced, or obsoleted by other documents at any time. It is not appropriate to use Internet Drafts as reference material or to cite them other than as a "working draft" or "work in progress."

Please check the I-D abstract listing contained in each Internet Draft directory to learn the current status of this or any other Internet Draft.

Abstract

Pip is an internet protocol intended as the replacement for IP version 4. Pip is a general purpose internet protocol, designed to evolve to all foreseeable internet protocol requirements. This specification describes the routing and addressing architecture for near-term Pip deployment. We say near-term only because Pip is designed with evolution in mind, so other architectures are expected in the future. This document, however, makes no reference to such future architectures.

Pip WG, Expires Aug. 15, 1993

Table of Contents

1 Pip Architecture Overview
1.1 Pip Architecture Characteristics
1.2 Components of the Pip Architecture
2 A Simple Example
3 Pip Overview
4 Pip Addressing
4.1 Hierarchical Pip Addressing
4.1.1 Assignment of (Hierarchical) Pip Addresses
4.1.2 Host Addressing
4.1.3 Justification for Provider-rooted Hierarchical Pip Addresses
4.2 CBT Style Multicast Addresses
4.3 Class D Style Multicast Addresses
5 Pip IDs
6 Use of DNS
6.1 Information Held by DNS
6.1.1 DNS File Structure for Pip Addresses
6.2 Authoritative Queries in DNS
7 Type-of-Service (TOS) (or lack thereof)
8 Routing on (Hierarchical) Pip Addresses
8.1 Exiting a Private Domain
8.2 Intra-domain Networking
9 Pip Header Server
9.1 Forming Pip Headers
9.2 Pip Header Protocol (PHP)
9.3 Application Interface
10 Routing Algorithms in Pip
10.1 Routing Information Filtering
11 Transition
11.1 Justification for Pip Transition Scheme
11.2 Architecture for Pip Transition Scheme
11.3 Translation between Pip and IP packets
11.4 Translating between PCMP and ICMP
11.5 Translating between IP and Pip Routing Information
11.6 Old TCP and Application Binaries in Pip Hosts
12 Pip Address and ID Auto-configuration
12.1 Pip Address Prefix Administration
12.2 Host Pip Address Assignment
12.3 Host Pip ID Assignment
13 Pip Control Message Protocol (PCMP)
14 Host Mobility
14.1 PCMP Mobile Host message
14.2 Spoofing Pip IDs
15 Public Data Network (PDN) Address Discovery
15.1 Notes on Carrying PDN Addresses in NSAPs
16 Evolution with Pip
16.1 Handling Directive (HD) and Routing Context (RC) Evolution
16.1.1 Options Evolution

Introduction

Pip is an internet protocol intended as the replacement for IP version 4. Pip is a general purpose internet protocol, designed to handle all foreseeable internet protocol requirements. This specification describes the routing and addressing architecture for near-term Pip deployment. We say near-term only because Pip is designed with evolution in mind, so other architectures are expected in the future. This document, however, makes no reference to such future architectures (except in that it discusses Pip evolution in general).

This document gives an overall picture of how Pip operates. It is provided primarily as a framework within which to understand the total set of documents that comprise Pip.

The Pip Overview document [1] generally describes the Pip header and its capabilities outside the context of any given ROAD architecture (or rather, in the context of several ROAD architectures). It describes what is possible with Pip.
This document describes what is done with Pip near-term.

This document assumes an understanding of the basic Pip protocol (that is, it assumes that [1] and possibly [3] have been read).

1. Pip Architecture Overview

The Pip near-term architecture is an incremental step from IP. Like IP, near-term Pip is a datagram protocol. Pip runs beneath TCP and UDP. DNS is used, in the same fashion as it is now, to distribute name-to-Pip-Address (and ID) mappings. (Note that in addition to functioning as a global identifier, the Pip ID can function as the lowest level of the Pip Address, and as a multicast ID.)

Routing in the near-term Pip architecture is hop-by-hop, though it is possible for a host to create a source route (for policy reasons).

Pip Addresses have more hierarchy than IP addresses, thus improving scaling on one hand, but introducing additional addressing problems, such as multiple addresses, on the other. Pip, however, uses hierarchical addresses to advantage by making them provider-based, and using them to make policy routing (in this case, provider selection) choices. Pip also provides mechanisms for automatically assigning provider prefixes to hosts and routers in domains. This is the main difference between the Pip near-term architecture and the IP architecture. (Note that in the remainder of this paper, unless otherwise stated, the phrase "Pip architecture" refers to the near-term Pip architecture.)

1.1. Pip Architecture Characteristics

The proposed architecture for near-term Pip has the following characteristics:

   1. Provider-rooted hierarchical addresses.

   2. Automatic address prefix assignment.

   3. Exit provider selection.

   4. Multiple defaults routing (default routing, but to multiple exit points).

   5. Equivalent of IP Class D style addressing for multicast.

   6. CBT style multicast.

   7. Providers support forwarding on policy routes (but initially will not provide the support for sources to calculate policy routes).

   8. Mobile hosts.

   9. Support for routing across large Public Data Networks (PDN).

   10. Inter-operation with IP hosts (but only within an IP-address domain where IP addresses are unique). In particular, an IP address can be explicitly carried in a Pip header.

   11. Operation with existing transport and application binaries (though if the application contains IP context, like FTP, it may only work within a domain where IP addresses are unique).

   12. Mechanisms for evolving Pip beyond the near-term architecture.

1.2. Components of the Pip Architecture

The Pip Architecture consists of the following five systems:

   1. Host (source and sink of Pip packets)

   2. Router (forwards Pip packets)

   3. DNS

   4. Pip/IP Translator

   5. Pip Header Server (formats Pip headers)

The first three systems exist in the IP architecture, and require no explanation here. The fourth system, the Pip/IP Translator, is required solely for the purpose of inter-operating with current IP systems. The fifth system, the Pip Header Server, is new. Its function is to format Pip headers on behalf of the source host (though initially hosts will be able to do this themselves). Use of the Pip Header Server will increase as policy routing becomes more sophisticated (moves beyond near-term Pip Architecture capabilities).

To handle future evolution, a Pip Header Server can be used to "spoon-feed" Pip headers to old hosts that have not been updated to understand new uses of Pip. This way, the probability that the internet can evolve without changing all hosts is increased.

Finally, it is expected that the Pip/IP translation function resides in the same systems as the Pip routers, and that all Pip routers are capable of Pip/IP translation.

2. A Simple Example

A typical Pip "exchange" is as follows: An application initiates an exchange with another host as identified by a domain name. A request for one or more Pip Headers, containing the domain name of the destination host, goes to the Pip Header Server. The Pip Header Server generates a DNS request, and receives back a Pip ID, multiple Pip Addresses, and possibly other information such as a mobile host server or a PDN address. Given this information, plus information about the source host (its Pip Addresses, for instance), plus optionally policy information, plus optionally topology information, the Pip Header Server formats an ordered list of valid Pip headers and gives these to the host. (Note that if the Pip Header Server is co-resident with the host, as will be common initially, the host behavior is similar to that of an IP host in that a DNS request comes from the box, and the host forms a Pip header based on the answer from DNS.)

The source host then begins to transmit Pip packets to the destination host. If the destination host is an IP host, then the Pip packet is translated into an IP packet along the way. Assuming that the destination host is a Pip host, however, the destination host uses the destination Pip ID alone to determine if the packet is destined for it. The destination host either generates a return Pip header from information in the received Pip header, or uses the Pip ID of the source host to query the Pip Header Server/DNS itself. The latter case involves more overhead, but allows a more informed decision about how to return packets to the originating host.

If either host is mobile, and moves to a new location, thus getting a new Pip Address, it informs the other host of its new address directly. Since host identification is based on the Pip ID and not the Pip Address, this does not cause the transport level to fail.
If both hosts are mobile and receive new Pip Addresses at the same time (and thus cannot exchange packets at all), then they can query each other's respective mobile host servers (learned from DNS). Note that host mobility is completely confined to hosts (and DNS). Routers never get involved in tracking mobile hosts (though naturally they are involved in host discovery and automatic host address assignment).

Though DNS quickly flushes cached information, the Pip Header Server can keep cached information arbitrarily long. If after a long time the cached information is no longer valid, this is learned when attempted communication to a destination fails, and the information is flushed at that time.

3. Pip Overview

Here, a brief overview of the Pip protocol is given. The reader is encouraged to read [3] for a complete description.

The Pip header is divided into four parts:

   Initial Part
   Transit Part
   Options Part
   Host Part

The Initial Part contains the following fields:

   Version Number
   Host Offset
   Options Offset
   Hop Count
   Dest ID

The Version Number places Pip as a subsequent version of IP. The Host Offset tells how to find the Host Part (for fast processing by receiving hosts). The Options Offset tells how to find the Options Part. The Hop Count is similar to IP's Time-to-Live. The Dest ID identifies the destination host, and is not used for routing, except where the final router on a LAN uses ARP to find the physical address of the host identified by the Dest ID.

The Transit Part contains the following fields:

   Options Present
   FTIF Chain Length
   Handling Directive
   Routing Context
   FTIF Offset
   FTIF Chain (FTIF = Forwarding Table Index Field)

The Options Present field indicates which options are in the Options Part. This allows a host or router to selectively ignore options.

The FTIF Chain Length simply indicates how long the FTIF Chain is so that the next Transit Part can be found (FTIF means Forwarding Table Index Field).

The Handling Directive is a set of subfields, each of which indicates a specific handling action that must be executed on the packet. Handling directives have no influence on routing. The Handling Directive itself is preceded by a field that indicates what subfields are in the Handling Directive. This allows the definition of the set of handling directives to evolve over time. Example handling directives are queueing priority, congestion experienced bit, drop priority, and so on.

The Routing Context and FTIF Chain comprise the Routing Directive. This is where the routing decision gets made. The basic algorithm is that the router uses the Routing Context to choose one of multiple forwarding tables. The FTIF Offset is used to retrieve an FTIF, which is then used as an index into the forwarding table, which either instructs the router to look at the next FTIF, or returns the forwarding information.

Examples of Routing Context uses are: to distinguish address families (multicast vs. unicast), to indicate which level of the hierarchy a packet is being routed at, and to indicate a Type of Service. In the near-term architecture, the FTIF Chain is used to carry source and destination hierarchical unicast addresses, policy route fragments, and multicast addresses. The Routing Context is preceded by a field that indicates what subfields are in the Routing Context. This allows the definition of the Routing Context to evolve over time.

The Options Part contains the options. The options are preceded by an array of 8 fields that gives the offset of each of up to 8 options. Thus, a particular option can be found without a serial search of the list of options.
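
The basic Routing Directive algorithm described above can be illustrated with a minimal sketch. It is not taken from the Pip specifications: the table layout, the entry types (NextFtif, Forward), the string-valued routing contexts, and the choice to return the final offset to the caller are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple, Union


@dataclass(frozen=True)
class NextFtif:
    """Hypothetical entry: look at the next FTIF, indexing another table."""
    table_id: int


@dataclass(frozen=True)
class Forward:
    """Hypothetical entry: forwarding information (here, just a next hop)."""
    next_hop: str


Entry = Union[NextFtif, Forward]
Table = Dict[int, Entry]


def lookup(
    tables: Dict[int, Table],
    context_to_table: Dict[str, int],
    routing_context: str,
    ftif_chain: List[int],
    ftif_offset: int,
) -> Tuple[Optional[str], int]:
    """Return (next_hop, offset of the last FTIF examined or chain end).

    The Routing Context selects the first forwarding table; each FTIF is
    then used as an index, yielding either forwarding information or an
    instruction to continue with the next FTIF.
    """
    table = tables[context_to_table[routing_context]]
    offset = ftif_offset
    while offset < len(ftif_chain):
        entry = table.get(ftif_chain[offset])
        if entry is None:
            return None, offset          # no entry for this FTIF
        if isinstance(entry, Forward):
            return entry.next_hop, offset
        table = tables[entry.table_id]   # continue with the next FTIF
        offset += 1
    return None, offset                  # chain exhausted without a result


# Illustrative data only: two tables and one routing context.
TABLES = {0: {7: NextFtif(table_id=1)}, 1: {42: Forward(next_hop="if-2")}}
CONTEXTS = {"unicast": 0}
```

With these illustrative tables, lookup(TABLES, CONTEXTS, "unicast", [7, 42], 0) consumes FTIF 7, moves to the second table, and resolves FTIF 42 to the next hop. How a real router records the updated FTIF Offset in the packet is not covered by this sketch.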

The Host Part contains the following fields:

   Payload Length
   Protocol
   Source ID
   Packet SubID
   Host Version

The Payload Length and Protocol take the place of IP's Total Length and Protocol fields respectively. The Source ID identifies the source of the packet. The Packet SubID is used to relate a received PCMP message to a previously sent Pip packet. This is necessary because, since routers in Pip can tag packets, the packet returned to a host in a PCMP message may not be the same as the packet sent. The Host Version tells what control algorithms the host has implemented, so that routers can respond to hosts appropriately. This is an evolution mechanism.

4. Pip Addressing

Addressing is the core of any internet architecture. Pip Addresses are carried in the Routing Directive (RD) of the Pip header (except for the Pip ID, which in limited circumstances functions as part of the Pip Address). Pip Addresses are used only for routing packets. They do not serve the function of identifying the source and destination of a Pip packet. The Pip ID does this. Here we describe and justify the Pip Addressing schemes.

There are 3 Pip Addressing schemes. The hierarchical Pip Address (referred to simply as the Pip Address) is used for scalable unicast and for the unicast part of a CBT-style multicast. The multicast part of a CBT-style multicast is the second Pip Address type. The third Pip Address type is class-D style multicast.

4.1. Hierarchical Pip Addressing

The primary purpose of a hierarchical address is to allow better scaling of routing information, though Pip also uses the "path" information latent in hierarchical addresses for making policy routing decisions.

The Pip Header encodes addresses as a series of separate numbers, one number for each level of hierarchy. This can be contrasted to traditional packet encodings of addresses, which lump everything into one field.
Because of Pip's encoding, it is not necessary to specify a format for a Pip Address as it is with traditional addresses (for instance, the SIP address is formatted such that the first so-many bits are the country/metro code, the next so-many bits are the site/subscriber, and so on). Pip's encoding also eliminates the "cornering in" effect of running out of space in one part of the hierarchy even though there is plenty of room in another. No "field sizing" decisions need be made at all with Pip Addresses.

Pip Addresses are carried in DNS as a series of numbers, usually with each number representing a layer of the hierarchy [2], but optionally with the initial number(s) representing a "route fragment" (the tail end of a source route whose elements are providers). The route fragments would be used, for instance, when the destination network's directly attached provider is only giving access to other providers, but the important provider-selection policy decision has to do with the other providers. Thus, DNS carries the "down" and "none" markers needed for Pip forwarding to distinguish how to route packets [3]. Each number can be up to 32 bits in length, including the markers (though it should be understood that a number higher than 65535 (or, 2^16 - 1) increases the size of the Pip Header because 32 bits is required to carry each address field rather than 16).

Note that Pip Addresses do not need to be seen by protocol layers above Pip (though layers above Pip can provide a Pip Address if desired). Transport and above can use the Pip ID to identify the source and destination of a Pip packet. The Pip layer knows how to map the Pip IDs (and other information received from the layer above, such as QOS) into Pip Addresses.

The Pip ID can serve as the lowest level of a Pip Address.
While this "bends the principle" of separating Pip Addressing from Pip Identification, it greatly simplifies address administration. The Pip ID also serves as a multicast ID. Unless otherwise stated, the term "Pip Address" refers to just the part in the Routing Directive (that is, excludes the Pip ID).

Pip Addresses are provider-rooted (as opposed to geographical). That is, the top-level of a Pip Address indicates a network service provider (even when the service provided is not Pip). (A justification of using provider-rooted rather than geographical addresses is given in section 4.1.3.)

Thus, the basic form of a Pip address is:

   providerPart,subscriberPart

where both the providerPart and subscriberPart can have multiple layers of hierarchy internally.

A subscriber may be attached to multiple providers. In this case, a host can end up with multiple Pip Addresses by virtue of having multiple providerParts:

   providerPart1,subscriberPart
   providerPart2,subscriberPart
   providerPart3,subscriberPart

Note that, while there are three providerParts shown here, there is only one subscriberPart. Internal subscriber numbering should be independent of the providerPart.

Indeed, with the Pip architecture, it is possible to address internal packets without including any of the providerPart of the address. This applies to the case where the subscriber network spans many different provider areas, for instance, a global corporate network. In this case, some hosts in the global corporate network will have certain providerParts, and other hosts will have others. The subscriberPart should be assigned such that routing can successfully take place without a providerPart in the destination Pip Address of the Pip Routing Directive (see section 8.2).

4.1.1. Assignment of (Hierarchical) Pip Addresses

Administratively, Pip Addresses are assigned as follows [4].
There is a root Pip Address assignment authority. Likely choices for this are IANA or ISOC. The root authority assigns top-level Pip Address numbers. (A "Pip Address number" is the number at a single level of the Pip Address hierarchy. A Pip Address prefix is a series of contiguous Pip Address numbers, starting at the top level but not including the entire Pip Address. Thus, the top-level prefix is the same thing as the top-level number.)

Though by-and-large top-level assignments are made to providers, each country is given an assignment, and each existing address space (such as E.164, X.121, IP, etc.) is given an assignment. Thus, existing addresses can be grandfathered in. Even if the top-level Pip address number is an administrative rather than topological assignment, the routing algorithm still advertises providers at the provider level of routing. That is, routing will advertise enough levels of hierarchy that providers know how to route to each other.

There must be some means of validating top-level number requests. That is, top-level assignments must be made only to true providers. While designing the best way to do this is outside the scope of this document, it seems offhand that a reasonable approach is to charge for the top-level prefixes. The charge should be enough to discourage non-serious requests for prefixes, but not so much that it becomes an inhibitor to entry in the market. The charge should include a yearly "rent", and top-level prefixes should be reclaimed when they are no longer used by the provider. Any profit made from this activity could be used to support the overall role of number assignment.

Since roughly 16,000 top-level assignments can be made before having to increase the FTIF size in the Pip header from 16 bits to 32 bits, it is envisioned that top-level prefixes will not be viewed as a scarce resource.
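
The size trade-off mentioned here and in section 4.1 (any address number above 65535 forces 32-bit rather than 16-bit fields) can be made concrete with a small sketch. The function names are hypothetical, the numbers are taken to already include any marker bits, and the sketch assumes that every field in a chain shares one width, which is how section 4.1 reads but is not spelled out in this document.

```python
from typing import Sequence


def ftif_width_bits(address_numbers: Sequence[int]) -> int:
    """Return 16 if every number fits in 16 bits, otherwise 32.

    Assumption: all fields of one address are carried at the same width.
    """
    if not address_numbers:
        raise ValueError("an address needs at least one number")
    for number in address_numbers:
        if not 0 <= number < 2**32:
            raise ValueError("each number must fit in 32 bits")
    return 16 if max(address_numbers) <= 0xFFFF else 32


def chain_size_bytes(address_numbers: Sequence[int]) -> int:
    """Bytes needed to carry the address numbers at the chosen width."""
    return len(address_numbers) * ftif_width_bits(address_numbers) // 8
```

Under these assumptions, a three-level address whose numbers all stay at or below 65535 occupies 6 bytes, while the same address with one larger number occupies 12 bytes.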

After a provider obtains a top-level prefix, it becomes an assignment authority with respect to that particular prefix. The provider has complete control over assignments at the next level down (the level below the top-level). The provider may either assign top-level minus one prefixes to subscribers, or preferably use that level to provide hierarchy within the provider's network (for instance, in the case where the provider has so many subscribers that keeping routing information on all of them creates a scaling problem). As discussed in section 4.1.3, this provider-internal layer of hierarchy also improves paths found between providers.

It is envisioned that the subscriber will have complete control over number assignments made at levels below that of the prefix assigned it by the provider.

Assigning top level prefixes directly to providers leaves the number of top-level assignments open-ended, resulting in the possibility of scaling problems at the top level. While it is expected that the number of providers will remain relatively small (fewer than 10000 globally), this can't be guaranteed. If there are more providers than top-level routing can handle, it is likely that many of these providers will be "local access" providers--providers whose role is to give a subscriber access to multiple "long-distance" providers. In this case, the local access providers need not appear at the top level of routing, thus mitigating the scaling problem at that level.

In the worst case, if there are too many top-level "long-distance" providers for top-level routing to handle, a layer of hierarchy above the top-level can be created. This layer should probably conform to some policy criteria (as opposed to geographical criteria). For instance, backbones with similar access restrictions or type-of-service can be hierarchically clustered. Clustering according to policy criteria rather than geographical allows the choice of address to remain an effective policy routing mechanism.

Of course, adding a layer of hierarchy to the top requires that all systems, over time, obtain a new providerPart prefix. Since Pip has automatic prefix assignment, and since DNS hides addresses from users, this is not a debilitating problem.

4.1.2. Host Addressing

Hosts can have multiple Pip Addresses. Since Pip Addresses are topologically significant, a host has multiple Pip Addresses because it exists in multiple places topologically. For instance, a host can have multiple Pip addresses because it can be reached via multiple providers, or because it has multiple physical interfaces. The address used to reach the host influences the path to the host.

Locally, Pip Addressing is similar to IP Addressing. That is, Pip prefixes are assigned to subnetworks (where the term subnetwork here is meant in the OSI sense. That is, it denotes a network operating at a lower layer than the Pip layer, for instance, a LAN). Thus, it is not necessary to advertise individual hosts in routing updates--routers only need to advertise and store routes to subnetworks.

Unlike IP, however, a single subnetwork can have multiple prefixes. (Strictly speaking, in IP a single subnetwork can have multiple prefixes, but a host may not be able to recognize that it can reach another host on the same subnetwork but with a different prefix without going through a router.)

There are two styles of local Pip Addressing--one where the Pip Address denotes the host, and another where the Pip Address denotes only the destination subnetwork. The latter style is called ID-tailed Pip Addressing. With ID-tailed Pip Addresses, the Pip ID is used by the last router to forward the packet to the host. It is expected that ID-tailed Pip Addressing will be the most common, because it greatly eases address administration.

(Note that the Pip Routing Directive can be used to route a Pip packet internal to a host.
For instance, the RD can be used to direct a packet to a device in a host, or even a certain memory location. The use of the RD for this purpose is not part of this near-term Pip architecture. We note, however, that this use of the RD could be locally done without affecting any other Pip systems.)

When a router receives a Pip packet and determines that the packet is destined for a host on one of its attached subnetworks (by examining the Routing Directive (RD)), it then examines the destination Pip ID (which is in a fixed position) and forwards based on that. If it does not know the subnetwork address of the host, then it ARPs, using the Pip ID as the "address" in the ARP query.

4.1.3. Justification for Provider-rooted Hierarchical Pip Addresses

Hierarchical addresses work best--that is, scale the best while minimizing the increased path length that results from hiding information--when the address structure matches the topology. Though there are many non-hierarchical connections in the Internet, the Internet is basically hierarchical. In particular, the Internet has a predominant provider/subscriber topology.

Therefore, Pip uses hierarchical addresses that are topological, and reflect the provider/subscriber topology. The top level of the Pip Address is, to the extent possible, rooted at providers. (I say to the extent possible because Pip allows levels of hierarchy that are purely administrative in nature, such as a country code above the provider. This type of assignment may occasionally be politically necessary, though it is to be avoided.)

Note that the provider network does not have to be a Pip network per se (that is, it does not have to be able to switch Pip packets) in order to justify having a Pip top-level number. Examples of such networks are SMDS, Frame Relay, ATM, and so on.
Since many of the subscribers attached to such a provider will run Pip, it is necessary for scaling to hierarchically cluster these subscribers around a Pip top-level number given to the provider. Pip routers attached to the provider network can advertise the Pip Address of the provider.

The alternative to topological (provider-rooted) addresses is geographical addresses. Geographical addresses are inferior to provider-rooted addresses because 1) they place artificial requirements on physical connectivity (different providers are required to interconnect at geographical locales), and 2) they don't scale as well as provider-based addresses. In addition, the assignment of geographical addresses requires an authoritative address assignment authority, which currently does not exist. (To be fair, provider-based addresses also require an address assignment authority, but it doesn't have to be nearly as authoritative.)

Geographical addresses have an advantage over provider-rooted addresses in that they decouple the subscriber from provider orientation. Thus, address administration is easier with geographical addresses. This is particularly true for subscribers that are attached to only one provider. When such a subscriber changes providers in the same geographical area, it suffers no address prefix change.

Subscribers that are connected to multiple providers, and want to be able to choose which provider to use on, say, a per-connection basis, do not gain as much from geographical addresses. This is because multiply-connected subscribers must have some method of indicating which providers they are attached to, so that packets can be directed to the desired provider. This method invariably involves distribution through some mechanism, for example DNS, and carriage in the internet packet header.
Administering this information is roughly the same as administering the upper portions of a provider-rooted hierarchical address.

One of the arguments for geographical addresses is that they allow one provider to know the best entry point to another provider. For instance, consider two providers that connect to each other in two places--one on the east coast of the USA and one on the west coast. With geographical addresses, the first provider knows which coast to enter the other provider at.

There is a fallacy to this argument, though. That is, the geographical addresses only happen to be useful in this case because the internal topologies of the two providers just happen (or were forced) to correspond to geography. Thus, what is working in this example is the fact that the addresses are *topologically* oriented, not geographically oriented.

The same effect could have been achieved by having the two providers structure themselves hierarchically internally (according to whatever made sense in the context of their respective topologies), and advertise via a routing protocol which of their internal areas were best reached via each connection point. In fact, this method has the advantage that a provider has the choice of entering its neighbor provider near the source or near the destination. Because each provider has a relatively small number of internal areas (say a few hundred or less), the amount of information that has to be exchanged is small. Note that the internal hierarchy of a provider is probably necessary anyway for internal scaling.

Thus, the hierarchical structure of a Pip address should be:

   provider.providerArea.subscriber

where the subscriber part may of course have multiple levels of hierarchy. We say "should be" rather than "is" because each provider is free to structure their addresses as they wish internally.
The provider can choose not to add the level of hierarchy between provider and subscriber.

(Note that Pip IDs, which are also hierarchical, although they are treated as flat in a Pip header, are not provider-rooted. Top-level Pip ID numbers are assigned, to the extent possible, directly to private organizations.)

(Note that the issue of geographical versus provider-rooted addresses is currently under debate in the internet. I feel strongly that geographical addresses make little sense, and will push for provider-rooted addresses, limiting Pip's near-term architecture to provider-rooted addresses only. This having been said, there is nothing about Pip that prevents geographical address assignment, and if the internet community ends up showing clear consensus towards geographical addresses or a coexistence of geographical and provider-rooted, then Pip can be made to do this.)

4.2. CBT Style Multicast Addresses

With CBT (Core-based Tree) multicast, there is a single multicast tree connecting the members (recipients) of the multicast group (as opposed to Class-D style multicast, where there is a tree per source). The tree emanates from a single "core" router. To transmit to the group, a packet is routed to the core using unicast routing. Once the packet reaches a router on the tree, it is multicast using a group ID.

Thus, the FTIF Chain for CBT multicast contains the (Unicast) Hierarchical Pip Address of the core router. The Dest ID field contains the group ID. The "address family" in the Routing Context field is set to CBT Multicast. Another bit in the Routing Context is defined as the "on-tree" bit.

When the packet is transmitted by a host, the on-tree bit is set to 0. Routers receiving this packet route based on the core address. When a packet reaches a router on the tree, the on-tree bit is set to 1, and subsequent routers don't examine the FTIF Chain at all, but instead route only on the group ID.

4.3. Class D Style Multicast Addresses

By "class D" style multicast, we mean multicast using the algorithms developed for use with Class D addresses in IP (class D addresses are not used per se). This style of routing uses both source and destination information to route the packet (source host address and destination multicast group).

For Pip, the FTIF Chain holds the source ID, encoded as two 32-bit fields, with the low-order field first. The Dest ID field holds the multicast group. The Routing Context indicates Class-D style multicast. All routers must first look at the FTIF Chain (though usually only the first FTIF) and Dest ID field to route the packet on the tree.

5. Pip IDs

The Pip ID is 64 bits in length.

The basic role of the Pip ID is to identify the source and destination host of a Pip Packet. (The other role of the Pip ID is for allowing a router to find the destination host on the destination subnetwork.)

This having been said, it is possible for the Pip ID to ultimately identify something in addition to the host. For instance, the Pip ID could identify a user or a process. For this to work, however, the Pip ID has to be bound to the host, so that as far as the Pip layer is concerned, the ID is that of the host. Any additional use of the Pip ID is outside the scope of this Pip architecture.

The Pip ID is treated as flat. When a host receives a Pip packet, it compares the destination Pip ID in the Pip header with its own. If there is a complete match, then the packet has reached the correct destination, and is sent to the higher layer protocol. If there is not a complete match, then the packet is discarded, and a PCMP Invalid Address packet is returned to the originator of the packet [8].

Even though the Pip ID is treated as flat by the Pip layer, it is generally hierarchically assigned (some flat assignments are available).
The Pip ID hierarchy follows organizational boundaries. The Pip ID hierarchy bears no relationship to topology.

Flat assignment of Pip IDs has the advantages of 1) using the ID space efficiently, and 2) simple number administration. Hierarchical Pip ID assignment has any number of advantages over flat assignment. These include allowing the Pip ID to be used for inverse DNS lookups and allowing a Pip packet to be associated with an organization. (Note that the use of the Pip ID alone for this purpose can be easily spoofed. By cross checking the Pip ID with the Pip Address prefix, spoofing is harder--as hard as it is with IP--but still easy. Section 14.2 discusses methods for making spoofing harder still, without requiring encryption.)

Administratively, the assignment of Pip IDs operates similarly to that of Pip Addresses. The top-level Pip ID assignment authority can be the same as that for Pip Addresses. Top-level Pip ID numbers, however, are assigned not to providers but, to the extent possible, directly to organizations.

To the extent that organizational content is useful in a Pip ID, direct assignment of top-level Pip ID numbers to organizations maximizes the information content in a Pip ID. This is important, because the Pip ID is relatively small, and layers of assignment that do not contain organizational information greatly reduce the amount of space left for organizational information.

This having been said, some top-level Pip ID numbers are reserved for countries, which can subsequently assign 2nd-level Pip ID numbers to organizations. Top-level Pip ID numbers are also reserved for existing numbering spaces, such as IP, IEEE 802, and E.164. Finally, top-level numbers are reserved for special purposes such as "any host", "any router", "all hosts on a subnetwork", "all routers on a subnetwork", and so on.
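The receive-side check described earlier in this section (a complete match of the destination Pip ID, otherwise discard and return a PCMP Invalid Address) can be sketched as follows. This is a minimal illustration only: the function name, the returned action strings, and the sample IDs are hypothetical, and the actual PCMP message format is defined elsewhere [8].

```python
# Sketch of the host receive-side Pip ID check (names are hypothetical).
PIP_ID_BITS = 64  # the Pip ID is 64 bits in length


def check_destination_id(own_id: int, dest_id: int) -> str:
    """Return the action a host takes for a received Pip packet.

    The Pip ID is treated as flat: only a complete 64-bit match counts,
    regardless of any hierarchy used when the ID was assigned.
    """
    for value in (own_id, dest_id):
        if not 0 <= value < 2 ** PIP_ID_BITS:
            raise ValueError("Pip ID must fit in 64 bits")
    if dest_id == own_id:
        return "deliver"  # pass to the higher layer protocol
    return "discard-and-send-pcmp-invalid-address"
```

Note that the comparison never looks at individual levels of the ID hierarchy; a near match is treated exactly like any other mismatch.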
The maximum value of a Pip ID number (the number at a single level of the assignment hierarchy) is limited only by the amount of space left in the Pip ID. Thus, a top-level assignment can consume the entire 64-bit Pip ID (as is the case with the special purpose assignments "any host" etc.).

The Pip ID is encoded in the Pip header such that the hierarchical content of the Pip ID is self-describing. In order to make the Pip ID self-describing while allowing any level of the Pip ID to be almost arbitrarily large, a modified ASN.1 notation is used to encode the Pip ID [5]. One reason for the modification is that the Pip ID is fixed length, whereas ASN.1 numbers are variable length. The modified ASN.1 notation also results in the bits of the Pip ID being strictly left-to-right significant.

Another reason for the modification is that it is desirable to encode the IP address in the Pip ID as a straight 32-bit number. IP addresses in Pip IDs are always in the lower 32 bits, and are distinguishable by a particular 8-bit escape value preceding them.

6. Use of DNS

The Pip near-term architecture uses DNS in roughly the same style that it is currently used. In particular, the Pip architecture maintains the two fundamental DNS characteristics of 1) information stored in DNS does not change often, and 2) the information returned by DNS is independent of who requested it.

While the fundamental use of DNS remains roughly the same, Pip's use of DNS differs from IP's use by degrees. First, Pip relies on DNS to hold more types of information than IP [2]. Second, Pip Addresses in DNS are expected to change more often than IP addresses, due to reassignment of External Prefixes.
To still allow aggressive caching of DNS records in the face of more quickly changing addressing, Pip modifies DNS so that queries can be authoritative, resulting both in 1) a query going to the authoritative source of a DNS record, and 2) the caches being overwritten with a new record. Pip uses a new PCMP message type, the Invalid Address, to determine when a cached DNS record might be invalid, thus triggering an authoritative query.

In what follows, we first discuss the information contained in DNS, and then discuss authoritative queries.

6.1. Information Held by DNS

The information contained in DNS for the Pip architecture is:

   1. The Pip ID.

   2. Multiple Pip Addresses.

   3. The destination's mobile host address servers.

   4. The Public Data Network (PDN) addresses through which the destination can be reached.

   5. The Pip/IP Translators through which the destination (if the destination is IP-only) can be reached.

   6. Information about the providers represented by the destination's Pip addresses. This information includes provider name, the type of provider network (such as SMDS, ATM, or SIP), access restrictions on the provider's network, and TOSs offered by the provider. (As of this writing, no TOSs are defined.)

The Pip ID and Addresses are the basic units of information required for carriage of a Pip packet.

The mobile host address server tells where to send queries for the current address of a mobile Pip host. Note that usually the current address of the mobile host is conveyed by the mobile host itself, without involving the mobile host server.

The PDN address is used by the entry router of the PDN to learn the PDN address of the next hop router. The entry router obtains the PDN address via an option in the Pip packet. Note that the option is not sent on every packet.
The Pip/IP translator information is used to know how to translate an IP address into a Pip Address so that the packet can be carried across the Pip infrastructure. If the originating host is IP, then the first Pip/IP translator reached by the IP packet must query DNS for this information.

The information about the destination's providers is used to help the "source" (either the source host or a Pip Header Server near the source host) format an appropriate Pip header, especially with regard to choosing a Pip Address, but also with regard to setting TOS information. The choice of one of multiple Pip Addresses is essentially a policy routing choice.

More detailed descriptions of the use of the information carried in DNS are contained in the relevant sections.

6.1.1. DNS File Structure for Pip Addresses

Even though the Pip Address is returned in a DNS record as a simple series of numbers, the files in DNS are structured such that the natural number groupings that make up the address (provider part, private part) are distinguished. This allows the DNS administrator to easily change the prefix of a large number of hosts' Pip Addresses, for instance because of subscribing to a new provider's service.

6.2. Authoritative Queries in DNS

The Pip architecture provides a method for a host to determine if a DNS entry has become stale. That method is to configure into routers information about which Pip Addresses are valid and which are not valid, and of the valid ones, to further configure which Pip ID prefixes should be reachable through which corresponding Pip Address prefixes. When an address is determined to be invalid, a PCMP Invalid Address message is returned to the host, indicating that the cached DNS information is no longer valid, and that an authoritative query should be made (resulting in the cache being updated).

An example of this use is as follows.
A provider X has assigned an External Prefix to a given subscriber. The subscriber decides to switch to another provider Y. Provider X then invalidates the subscriber's External Prefix for a period of time adequate for all DNS entries to age naturally (that is, according to the Time-to-Live parameter in DNS). The routers in provider X are updated to indicate that the External Prefix is no longer valid. If a host with an old DNS cache sends a packet to the subscriber's old External Prefix, a router in provider X's network will determine that it is undeliverable, and further determine that it is for an invalid address. The router will send the host a PCMP Invalid Address message, causing the host to make an authoritative query.

As another example, assume the same scenario, but that the External Prefix has been reassigned to another subscriber before all DNS caches have naturally timed out. In this case, when a host sends a packet with a Pip Address containing the External Prefix, but a Pip ID for one of the original subscriber's hosts, the packet will be routed to the new subscriber's network, where it will eventually reach a router that cannot deliver it. Upon not being able to forward the packet, the router will check the Pip ID and determine that it is not meant for any host in its organization, and send the PCMP Invalid Address message. (Note that, as a security check, a previous router, such as a border router, may have filtered against the destination Pip ID and made the same determination.)

The modification to DNS to make queries authoritative is straightforward. The authoritative bit is set in the query, so that the DNS server receiving the query does not look into its cache. All other DNS behavior remains the same.
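The interaction between cached DNS records, the PCMP Invalid Address message, and authoritative queries can be sketched as follows. This is an illustrative model only: the class and method names are hypothetical, the authoritative source is simulated with a dictionary, the address strings are made up, and no real DNS library or message format is implied.

```python
# Sketch: a caching resolver that honors an "authoritative" query flag.
# All names here are hypothetical; the authoritative server is simulated.


class CachingResolver:
    def __init__(self, authoritative_source):
        self.source = authoritative_source  # name -> list of Pip Addresses
        self.cache = {}

    def query(self, name, authoritative=False):
        # A normal query may be answered from the cache.
        if not authoritative and name in self.cache:
            return self.cache[name]
        # An authoritative query bypasses the cache, goes to the
        # authoritative source, and overwrites the cached record.
        record = list(self.source[name])
        self.cache[name] = record
        return record


class Host:
    def __init__(self, resolver):
        self.resolver = resolver

    def addresses_for(self, name):
        return self.resolver.query(name)

    def on_pcmp_invalid_address(self, name):
        # The cached DNS information may be stale: re-query authoritatively.
        return self.resolver.query(name, authoritative=True)


# Usage: the subscriber moves from provider prefix 1.2 to provider prefix 7.8.
zone = {"host.example": ["1.2,4.5"]}
host = Host(CachingResolver(zone))
old = host.addresses_for("host.example")
zone["host.example"] = ["7.8,4.5"]
stale = host.addresses_for("host.example")            # still from the cache
fresh = host.on_pcmp_invalid_address("host.example")  # cache overwritten
```

In this model the stale answer persists until a PCMP Invalid Address arrives, after which ordinary (non-authoritative) queries also see the new record because the cache was overwritten.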
Note that, if for some reason unknown to the author it is not acceptable to form authoritative queries, a DNS resolver can mimic an authoritative query by first determining the address of the authoritative name server, and then querying that name server directly. This method, while acceptable, is less desirable in that it doesn't result in flushing caches.

7. Type-of-Service (TOS) (or lack thereof)

One year ago it probably would have been adequate to define a handful (4 or 5) of priority levels to drive a simple priority FIFO queue. With the advent of real-time services over the Internet, however, this is no longer sufficient. Real-time traffic cannot be handled on the same footing as non-real-time. In particular, real-time traffic must be subject to access control so that excess real-time traffic does not swamp a link (to the detriment of other real-time and non-real-time traffic alike).

Given that a consensus solution to real- and non-real-time traffic management in the internet does not exist, this version of the Pip near-term architecture does not specify any classes of service (and related queueing mechanisms). It is expected that Pip will define classes of service (primarily for use in the Handling Directive) as solutions become available.

8. Routing on (Hierarchical) Pip Addresses

Pip forwarding in a single router is done based on one or a small number of FTIFs. What this means with respect to hierarchical Pip Addresses is that a Pip router is able to forward a packet based on examining only part of the Pip Address--often a single level.

One advantage to encoding each level of the Pip Address separately is that it makes handling of addresses, for instance address assignment or managing multiple addresses, easier.
Another advantage is address lookup speed--the entire address does not have to be examined to forward a packet (as is necessary, for instance, with traditional hierarchical address encoding). The cost of this, however, is additional complexity in keeping track of the active hierarchical level in the Pip header.

Since Pip Addresses allow reuse of numbers at each level of the hierarchy, it is necessary for a Pip router to know which level of the hierarchy it is acting at when it retrieves an FTIF. This is done in part with a hierarchy level indicator in the Routing Context (RC) field. RC Level is numbered from the top of the hierarchy down. Therefore, the top of the hierarchy is RC Level = 0, the next level down is RC Level = 1, and so on.

The RC Level alone, however, is not adequate to keep track of the appropriate level in all cases. This is because different parts of the hierarchy may have different numbers of levels, and elements of the hierarchy (such as a domain or an area) may exist in multiple parts of the hierarchy. Thus, a hierarchy element can be, say, level 3 under one of its parents and level 2 under another.

To resolve this ambiguity, the topological hierarchy is superimposed with another set of levels--metalevels. A metalevel boundary exists wherever a hierarchy element has multiple parents with different numbers of levels, or may with reasonable probability have multiple parents with different numbers of levels in the future. Thus, a metalevel boundary exists between a private domain and its provider. (Note that in general the metalevel represents a significant administrative boundary between two levels of the topological hierarchy. It is because of this administrative boundary that the child is likely to have multiple parents.) Lower metalevels may exist, but usually will not.

The RC, then, contains a level and a metalevel indicator. The level indicates the number of levels from the top of the next higher metalevel.
The top of the global hierarchy is metalevel 0, level 0. The next level down (for instance, the level that provides a level of hierarchy within a provider) is metalevel 0, level 1. The first level of hierarchy under a provider is metalevel 1, level 0, and so on.

To determine the RC Level and RC Metalevel in a transmitted Pip packet, a host (or Pip Header Server) must know where the metalevels are in its own Pip Addresses. The host compares its source Pip Address with the destination Pip Address. The highest Pip Address component that is different between the two addresses determines the level and metalevel. (No levels higher than this level need be encoded in the Routing Directive.)

Neighbor routers are configured to know if there is a level or metalevel boundary between them, so that they can modify the RC Level and RC Metalevel in a transmitted packet appropriately.

8.1. Exiting a Private Domain

The near-term Pip Architecture provides two methods of exit routing, that is, routing inter-domain Pip packets from a source host to a network service provider of a private domain. In the first method, called transit-driven exit routing, the source host leaves the choice of provider to the routers. In the second method, called host-driven exit routing, the source host explicitly chooses the provider. In either method, it is possible to prevent internal routers from having to carry external routing information.

With host-driven exit routing, it is possible for the host to choose a provider through which the destination cannot be reached. In this case, the host receives the appropriate PCMP Destination Unreachable message, and may either fall back on transit-driven exit routing or choose a different provider.
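The comparison of source and destination Pip Addresses described in Section 8 (the highest differing component determines the RC Level and RC Metalevel) can be sketched as follows. This is an illustration under stated assumptions: an address is modeled as a list of parts split at its metalevel boundaries, both addresses are assumed to have their metalevel boundaries in the same positions, and the function name is hypothetical.

```python
# Sketch: derive RC Level and RC Metalevel by comparing two Pip Addresses.
# An address is modeled as a list of parts, one part per metalevel, e.g.
# [[1, 2, 3], [4, 5, 6]] for provider part 1.2.3 and subscriber part 4.5.6.
# This representation and the function name are assumptions for illustration.


def routing_start(source, dest):
    """Return (metalevel, level, remaining destination components).

    The highest (leftmost) component that differs between the two
    addresses determines the metalevel and level; components above it
    need not be encoded.
    """
    for metalevel, (src_part, dst_part) in enumerate(zip(source, dest)):
        for level, dst_component in enumerate(dst_part):
            differs = level >= len(src_part) or src_part[level] != dst_component
            if differs:
                remaining = dst_part[level:]
                for lower_part in dest[metalevel + 1:]:
                    remaining = remaining + lower_part
                return metalevel, level, remaining
    raise ValueError("no differing component found")


result = routing_start([[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 7, 8]])
```

For a source of 1.2.3,4.5.6 and a destination of 1.2.3,4.7.8 (the comma marks the metalevel boundary), `result` is metalevel 1, level 1, with remaining destination components 7.8.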
When using host-driven exit routing, the host sets the FTIF Offset field to point to the top-level FTIF of the source Pip Address (that is, the part of the source address that represents the provider). The host sets the Level and Metalevel parts of the Routing Context field to 0 (for top level). Finally, the host sets the Exit-Type bit of the Routing Context field to 0 (for host-driven). (The need for the Exit-Type bit is explained shortly.) The intra-domain routers' forwarding tables are configured such that this causes the packet to be routed to the indicated provider.

When using transit-driven exit routing, there are two modes of operation. The first, called destination-oriented, is used when the routers internal to a domain have external routing information. The second, called provider-rooted, is used when the routers internal to a domain do not have any external routing information. (With IP, this case is called default routing. In the case of IP, however, default routing does not allow an intelligent choice of multiple exit points.)

With destination-oriented (transit-driven) exit routing, the FTIF Offset is set to point to the top-level FTIF of the destination Pip Address. The host sets the Level and Metalevel parts of the Routing Context field to 0 (for top level). The setting of the Exit-Type bit of the Routing Context field is irrelevant in this case.

With provider-rooted exit routing, the host arbitrarily chooses a source Pip Address (and therefore, a provider). (Note that if the Pip Header Server is tracking inter-domain routing, then it chooses the appropriate provider.) The host points the FTIF Offset to the lowest-level FTIF above the intra-domain part of the Pip Address (intra-domain routers are configured to know how to route this towards the border router of the indicated provider).
The host sets the Exit-Type bit of the Routing Context field to 1 (for provider-driven). As a result of this Exit-Type bit setting, the border router, when it receives the packet, does not automatically forward the packet to the indicated provider. Instead, the border router examines subsequent FTIFs until it reaches the first FTIF after the top-level source FTIF (which is the top-level of the destination Pip Address, unless a policy route has been installed in the Routing Directive). It then determines if indeed the indicated provider is the best route to the destination (or next hop of policy route).

If the indicated provider is the best route, then the packet is forwarded to that provider. If it is not, then the border router keeps the FTIF Offset in its original position, and tunnels the packet to the correct border router. It also sends a PCMP Provider Redirect to the source host, indicating the appropriate provider to use. When the correct border router receives the packet, it untunnels it, modifies the source Pip Address to match that of the best-route provider, and forwards the packet to the provider.

Because of the PCMP Provider Redirect, subsequent packets go to the best-route border router. If, however, the best route changes to become another provider, then the previous best-route border router tunnels the packet to the best-route border router and sends the PCMP Provider Redirect.

8.2. Intra-domain Networking

With intra-domain networking (where both source and destination are in the private network), there are two scenarios of concern. In the first, the destination address shares a providerPart with the source address, and so the destination is known to be intra-domain even before a packet is sent.
In the second, the destination address does not share a providerPart with the source address, and so the source host doesn't know that the destination is reachable intra-domain.

In the first case, the Pip Addresses in the Routing Directive need not contain the providerPart. In the absence of information to the contrary (discussed below), the host only includes those levels of the Pip Address below the matching prefixes. For instance, if the source Pip Address is 1.2.3,4.5.6 and the destination Pip Address is 1.2.3,4.7.8, then only 7.8 need be included for the destination address in the Routing Directive. (The comma "," in the address indicates the metalevel boundary between providerPart and subscriberPart.) The metalevel and level are set accordingly.

In the second case, it is desirable to use the Pip Header Server to determine if the destination is intra-domain or inter-domain. The Pip Header Server can do this by monitoring intra-domain routing. (This is done by having the Pip Header Server run the intra-domain routing algorithm, but not advertise any destinations.) Thus, the Pip Header Server can determine if the providerPart can be eliminated from the address, as described in the last paragraph, or cannot and must conform to the rules of exit routing as described in the previous section.

If the Pip Header Server does not monitor intra-domain routing, however, then the following actions occur. In the case of host-driven exit routing, the packet will be routed to the stated provider, and an external path will be used to reach an internal destination. (The moral here is to not use host-driven exit routing unless the Pip Header Server is privy to routing information, both internal and external.) In the case of transit-driven exit routing, the packet sent by the host will eventually reach a router that knows that the destination is intra-domain.
This router will forward the packet towards the destination, and at the same time send a PCMP Reformat Transit Part message to the host. This message tells the host how much of the Pip Address is needed to route the packet.

In addition to using the PCMP Reformat Transit Part message to remove layers of hierarchy in the Pip Addresses in the Routing Directive, the PCMP Reformat Transit Part message can direct the host to add layers of hierarchy to the Pip Addresses. This is necessary in certain situations where a router serves multiple parts of the hierarchy.

For instance, assume that a router R is connected to other routers serving address prefixes 1.2.3, 1.2.4, and 1.3.4. Assume that a host under 1.2.3 has a packet to send to a host under 1.2.4. The host sets the level to 2 and puts only the third layer in the packet (destination = 4). When router R receives the packet, it doesn't know if the packet is destined for 1.2.4 or 1.3.4, because of the ambiguity at level 2. Thus, the router sends the host a PCMP Reformat Transit Part indicating how many additional layers are needed to correct the ambiguity. The host can cache this information for an arbitrarily long time, because a subsequent PCMP Reformat Transit Part message can be used to reduce the number of layers required if in the future the topology changes so that there is no longer an ambiguity.

(Note that the need for this message is based on the desire to minimize the amount of information in the packet, and to optimize forwarding speed. The need for this message would disappear if we chose to always put full Pip Addresses in the Pip header.)

9. Pip Header Server

Two new components of the Pip Architecture are the Pip/IP Translator and the Pip Header Server. The Pip/IP Translator is only used for transition from IP to Pip, and otherwise would not be necessary. The Pip Header Server, however, is a new architectural component.

The purpose of the Pip Header Server is to form a Pip Header. It is useful to form the Pip header in a separate box because a) in the future (as policy routing matures, for instance), significant amounts of information may be needed to form the Pip header--too much information to distribute to all hosts, and b) it won't be possible to evolve all hosts at the same time, so the existence of a separate box that can spoon-feed Pip headers to old hosts is necessary. (It is impossible to guarantee that no modification of Pip hosts is necessary for any potential evolution, but being able to form the header in a server, and hand it to an outdated host, is a large step in the right direction.)

(Note that policy routing architectures commonly if not universally require the use of some kind of "route server" for calculating policy routes. The Pip Header Server is, among other things, just this server. Thus, the Pip Header Server does not so much result from the fact that Pip itself is more complex than IP or other "IPv7" proposals. Rather, the Pip Header Server reflects the fact that the Pip Architecture has more functionality than ROAD architectures supported by the simpler proposals.)

We note that for the near-term architecture hosts themselves will by-and-large have the capability of forming Pip headers. The exception to this will be the case where the Pip Header Server wishes to monitor inter-domain routing to enhance provider selection. Thus, the Pip Header Server role will be largely limited to evolution (see section 16).

9.1. Forming Pip Headers

Forming a Pip header is more complex than forming an IP header because there are many more choices to make. At a minimum, one of multiple Pip Addresses (both source and destination) must be chosen. In the near future, it will also be necessary to choose a TOS.
After DNS information about the destination has been received, the following information is available to the Pip header formation function.

   1. From DNS: The destination's providers (either directly connected or nearby enough to justify making a policy decision about), and the names, types, and access restrictions of those providers.

   2. From the source host: The application type (and thus, the desired service), and the user access restriction classes.

   3. From local configuration: The source's providers, and the names, types, and access restrictions of those providers.

   4. From inter-domain routing: The routes chosen by inter-domain routing to all top level providers. (Note that inter-domain routing in the Pip near-term architecture is path-vector. Because of this, the Pip Header Server does not obtain enough information from inter-domain routing to form a policy route. When the technology to do this matures, it can be installed into Pip Header Servers.)

The inter-domain routing information is optional. If it is used, then probably a Pip Header Server is necessary, to limit this information to a small number of systems.

There may also be arbitrary policy information available to the Pip header formation function. This architecture does not specify any such information.

The Pip header formation function then goes through the following process:

   1. Determine if the packet is intra-domain (see section 8.2). If the packet is intra-domain, strip off any common prefix between source and destination Pip Addresses, and route the packet. Otherwise, execute the following steps.

   2. Eliminate any source and destination providers that do not allow this traffic based on user access restriction.

   3. Eliminate any source and destination providers that cannot satisfy the service requirements of the application.
      (As the internet gains experience with traffic management, this step can take into consideration TOS parameters.)

   4. Eliminate any source and destination providers for local policy reasons.

   5. For each remaining source provider, choose the most appropriate destination provider. This choice is based on provider name, type, and on the route to the provider. (For instance, if the destination's provider is the same as the source's provider, then they should be paired. Note that even in the absence of routing information, an informed choice is still usually possible based on provider name and type alone.)

   6. Rank order the source/destination provider pairs from most preferred to least (based on local policy information, cost, and expected quality of service).

The Pip Header formation function then returns the ordered pairs of source and destination addresses to the source host in the PHP response message. The form of the source address takes into consideration the type of exit routing in use in the source's domain (that is, the Routing Context and FTIF Offset are indicated, see section 8.1). Any additional information, such as PDN Address, is also returned. With this information, the source host can now establish communications and properly respond to PCMP messages.

Note that if Pip evolves to the point where the Transit Part of the Pip header is no longer compatible with the current Transit Part, and the querying host has not been updated to understand the new Transit Part, then the PHP response message contains a bit map of the Transit Part. The host puts this bit map into the Transit Part of the transmitted Pip header even though it does not understand the semantics of the Transit Part.

9.2. Pip Header Protocol (PHP)

The Pip Header Protocol (PHP) is a simple query/response protocol used to exchange information between the Pip host and the Pip Header Server [7].
In the query, the Pip host includes (among other things) the domain name of the destination it wishes to send Pip packets to. (Thus, the PHP query serves as a substitute for the DNS query.) The PHP query also contains 1) User Access Restriction Classes, 2) Application Types, and 3) host version.

The host version tells the Pip Header Server what features are installed in the host. Thus, the Pip Header Server is able to determine if the host can format its own Pip header based on DNS information, or whether the Pip Header Server needs to do it on behalf of the host.

In the future, the PHP query will also contain desired TOS (possibly in lieu of Application Type). (Note that this information could come from the application. Thus, the application interface to PHP (the equivalent of gethostbyname()) must pass this information.)

9.3. Application Interface

In order for a Pip host to generate the information required in the PHP query, there must be a way for the application to convey the information to the PHP software. The host architecture for doing this is as follows.

A local "Pip Header Client" (the source host analog to the Pip Header Server) is called by the application (instead of the current gethostbyname()). The application provides the Pip Header Client with either the destination host domain name or the destination host Pip ID, and other pertinent information such as user access restriction class and TOS. The Pip Header Client, if it doesn't have the information cached locally, queries the Pip Header Server and receives an answer. (Remember that the Pip Header Server can be co-resident with the host.) Once the Pip Header Client has determined what the Pip header(s) are, it assigns a local handle to the headers, returns the handle to the application, and configures the Pip packet processing engine with the handle and related Pip Headers.
The application then issues packets to the Pip layer (via intervening layers such as transport) using the local handle.

10. Routing Algorithms in Pip

The architecture for operating routing algorithms in Pip reflects the clean partitioning of routing contexts in the Pip header. Thus, routing in the Pip architecture is nicely modularized. Whereas routing in IP is basically partitioned into "egp" and "igp", routing in Pip is partitioned into whatever routing contexts exist.

In the case of the near-term Pip architecture, each address family (Hierarchical Pip Address, CBT Pip Multicast Address, Class D Pip Multicast Address) has its own routing algorithms. Within the Hierarchical Pip Address, there are multiple hierarchical levels. Wherever two routers connect, or two levels interface (either in a single router or between routers), two decisions must be made: 1) what information should be exchanged (that is, what of one side's routing table should be propagated to the other side), and 2) what routing algorithm should be used to exchange the information?

The first decision is discussed in section 10.1 below (Routing Information Filtering). The latter decision is discussed here.

Conceptually, and to a large extent in practice, the routing algorithms at each level are cleanly partitioned. This partition is much like the partition between "egp" and "igp" level routing in IP, but with Pip it exists at each level of the hierarchy.

At the top-level of the Pip Address hierarchy, a path-vector routing algorithm is used. Path-vector is more appropriate at the top level than link-state because path-vector does not require agreement between top-level entities (providers) on metrics in order to be loop-free. (Agreement between the providers is likely to result in better paths, but the Pip Architecture does not assume such agreement.)
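The loop-freedom property mentioned above can be illustrated with a minimal sketch of generic path-vector behavior. This is not the MLPV algorithm itself; the function names and the use of short strings as provider identifiers are assumptions made only for illustration. The point is that loop rejection depends only on the advertised path, never on a metric the providers would have to agree on.

```python
# Generic path-vector sketch (not MLPV). Provider identifiers are
# hypothetical strings; a path is a tuple ordered from nearest to farthest.

def accept_route(local_id, advertised_path):
    """Accept an advertisement only if we are not already on its path.

    No metric is consulted, so providers need not agree on metrics
    for the result to be loop-free.
    """
    return local_id not in advertised_path


def readvertise(local_id, advertised_path):
    """Prepend our own identifier before passing the route to neighbors."""
    return (local_id,) + tuple(advertised_path)


# Provider B hears a route from A and passes it on with itself prepended.
path_from_b = readvertise("B", ("A",))
```

Each provider remains free to rank the accepted paths by any local preference, since correctness does not depend on that ranking.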
The top-level path-vector routing algorithm is based on BGP4/IDRP technology, but modified to reflect Pip idiosyncrasies such as the Routing Context.

At any level below the top level, it is a local decision as to what routing algorithm technology to run. However, the path-vector routing algorithm is generalized so that it can run at multiple levels of the Pip Address hierarchy. Thus, a lower level domain can choose to take advantage of the path-vector algorithm, or run another, such as a link-state algorithm. The Pip path-vector routing algorithm is called MLPV [11], for Multi-Level Path-Vector (pronounced "milpiv").

Normally, information is exchanged between two separate routing algorithms by virtue of the two algorithms co-existing in the same router. For instance, a border router is likely to participate in an exchange of routing information with provider routers, and still run the routing algorithm of the internal routers. If the two algorithms are different routing technologies (for instance, link-state versus distance-vector) then internal conversion translates information from one routing algorithm to the form of the other.

In some cases, however, two routing algorithms that exchange information will exist in different routers, and will have to exchange information over a link. If these two algorithms are different technologies, then they need a common means of exchanging routing information. While strictly speaking this is a local matter, MLPV can also serve as the interface between two disparate routing algorithms.

Thus, all routers should be able to run MLPV, if for no other reason than to exchange information with other, perhaps proprietary, routing protocols.

MLPV is designed to be extensible with regard to the type of routes that it calculates.
It uses the Pip Object parameter identification number space to identify what type of route is being advertised and calculated [10]. Thus, to add new types of routes (for instance, new types of service), it is only necessary to configure the routers to accept the new route type, and to define metrics for that type and criteria for preferring one route of that type over another.

10.1. Routing Information Filtering

Of course, the main point behind having hierarchical routing is so that information from one part of the hierarchy can be reduced when passed to another. In general, reduction (in the form of aggregation) takes place when passing information from the bottom of the hierarchy up. However, Pip uses tunneling and exit routing to allow information from the top to be reduced when it goes down.

When two routers become neighbors, they can determine what hierarchical levels they have in common by comparing Pip Addresses. For instance, if two neighbor routers have Pip Addresses 1.2.3.4 and 1.2.8.9.14 respectively, then they share levels 0 and 1, and are different at levels below that. (0 is the highest level, 1 is the next highest, and so on.) As a general rule, these two routers exchange level 0, level 1, and level 2 routing information, but not level 3 or lower routing information. In other words, both routers know how to route to all things at the top level (level 0), how to route to all level 1 things with "1" as the level 0 prefix, and how to route to all level 2 things with "1.2" as the level 1 prefix.

In the absence of other instructions, two routers will by default exchange information about all levels from the top down to the first level at which they have differing Pip Addresses. In practice, however, this default exchange is as likely to be followed as not.

For instance, assume that 1.2.3.4 is a provider router, and 1.2.8.9.14 is a subscriber router.
(Note that 1.2.8 is the prefix given to the subscriber by the provider.) Assume also that the subscriber network is using destination-oriented transit-driven exit routing (see section 8.1). Finally, assume that router 1.2.8.9.14 is the subscriber's only entry point into provider 1 (other routers provide entry points to other providers). In this case, 1.2.8.9.14 does not need to know about level 2 or level 1 areas in the provider (that is, it does not need to know about 1.2.4..., 1.2.5..., or 1.3..., 1.4..., and so on). Thus, 1.2.8.9.14 should be configured to inform 1.2.3.4 that it does not need level 1 or 2 information.

As another example, assume still that 1.2.3.4 is a provider and 1.2.8.9.14 is a subscriber. However, assume now that the subscriber network is using host-driven exit routing. In this case, the subscriber does not even need to know about level 0 information, because all exit routing is directed to the provider of choice, and having level 0 information therefore does not influence that choice.

As a third example, in the case where border routers of a transit domain tunnel through the interior routers, the border router does not need to exchange external routing information with the internal routers. MLPV supports this mode of operation by allowing border routers to exchange information across domains in the same fashion as BGP or IDRP, thus supporting the tunneling feature of Pip.

11. Transition

The transition scheme for Pip has two major components, 1) translation, and 2) encapsulation. Translation is required to map the Pip Address into the IP address and vice versa. Encapsulation is used for one Pip router (or host) to exchange packets with another Pip router (or host) by tunneling through intermediate IP routers.

The Pip transition scheme is basically a set of techniques that allows existing IP "stuff" and Pip to coexist, but within the limitations of IP address depletion (though not within the limitations of IP scaling problems).
By this I mean that an IP-only host can only exchange packets with other hosts in a domain where IP numbers are unique. Initially this domain includes all IP hosts, but eventually it will include only hosts within a private domain.

The IP "stuff" that can exist includes 1) whole IP domains, 2) individual IP hosts, 3) IP-oriented TCPs, and 4) IP-oriented applications.

11.1. Justification for Pip Transition Scheme

Note that any transition to a bigger address requires translation. It cannot be avoided. The major choices one must make when deciding on a translation scheme are:

1. Will we require a contiguous infrastructure consisting of the new protocol, or will we allow tunneling through whatever remains of the existing IP infrastructure at any point in time?

2. Will we allow global connectivity between IP machines after IP addresses are no longer globally unique, or not? (In other words, will we use a NAT scheme or not?)

Concerning question number 1: while it is desirable to move as quickly as possible to a contiguous Pip (or SIP or whatever) infrastructure, especially for purposes of improved scaling, it is fantasy to think that the whole infrastructure will cut over to Pip quickly. Furthermore, during the testing stages of Pip, it is highly desirable to be able to install Pip in any box anywhere, and by tunneling through IP, create a virtual Pip internet. Thus, it seems that the only reasonable answer to question number 1 is to allow tunneling.

Concerning question number 2: it is highly desirable to avoid using a NAT scheme. A NAT (Network Address Translation) scheme is one whereby any two IP hosts can communicate, even though IP addresses are not globally unique. This is done by dynamically mapping non-unique IP addresses into unique ones in order to cross the infrastructure.
NAT schemes have the problems of increased complexity to maintain the mappings, and of translating IP addresses that reside within application data structures (such as the PORT command in FTP).

This having been said, it is conceivable that the new protocol will not be far enough along when IP addresses are no longer unique, and therefore a NAT scheme becomes necessary. It is possible to employ a NAT scheme at any time in the future without making it part of the intended transition scheme now. Thus, we can plan for a NATless transition now without preventing the potential use of NAT if it becomes necessary.

11.2. Architecture for Pip Transition Scheme

The architecture for Pip Transition is that of a Pip infrastructure surrounded by IP-only "systems". The IP-only "systems" surrounding Pip can be whole IP domains, individual IP hosts, an old TCP in an otherwise Pip host, or an old application running on top of a Pip-smart TCP.

The Pip infrastructure will initially get its internal connectivity by tunneling through IP. Thus, any Pip box can be installed anywhere, and become part of the Pip infrastructure by configuration over a "virtual" IP. Of course, it is desirable that Pip boxes be directly connected to other Pip boxes, but very early on this is the exception rather than the rule.

Two neighbor Pip systems tunneling through IP simply view IP as a "link layer" protocol, and encapsulate Pip over IP just as they would encapsulate Pip over any other link layer protocol. There is no automatic configuration of neighbor Pip systems over IP. Manual configuration (and careful "virtual topology" engineering) is required.

In the remainder of this section, we do not distinguish between a virtual Pip infrastructure on IP, and a pure Pip infrastructure.

Given the model of a Pip infrastructure surrounded by IP, there are 5 possible packet paths:

1. IP - IP

2. IP - Pip - IP

3. IP - Pip

4. Pip - IP

5. Pip - Pip

The first three paths involve packets that originate at IP-only hosts. In order for an IP host to talk to any other host (IP or not), the other host must be addressable within the context of the IP host's 32-bit IP address.

Initially, this "IP-unique" domain will include all IP hosts. When IP addresses become no longer unique, the IP-unique domain will include a subset of all hosts. At a minimum, this subset will include those hosts in the IP-host's private domain. However, it makes sense also to arrange for the set of all "public" hosts, basically anonymous ftp servers and mail gateways, to be in this subset. In other words, a portion of IP address space should be set aside to remain globally unique, even though other parts of the address space are being reused.

11.3. Translation between Pip and IP packets

Paths 2 and 4 involve translation from Pip to IP. This translation is straightforward, as all the information needed to create the IP addresses is in the Pip header. In particular, Pip IDs have an encoding that allows them to contain an IP address (again, one that is unique within an IP-unique domain). Whenever a packet path involves an IP host on either end, both hosts must have IP addresses. Thus, translating from Pip to IP is just a matter of extracting the IP addresses from the Pip IDs and forming an IP header.

Translating from an IP header to a Pip header is more difficult, because the 32-bit IP address must be "translated" into a 64-bit Pip ID and a Pip Address. There is no algorithm for making this translation. A table mapping IP addresses (or, rather, network numbers) to Pip IDs and Pip Addresses is required.

Since such a table, called the 4to7 Table, must potentially map every IP address, we choose to use dynamic discovery and caching to maintain the 4to7 table. We choose also to use DNS as the means of discovering the mappings.
Thus, DNS contains records that map IP address to Pip ID and Pip Address. In the case where the host represented by the DNS record is a Pip host (packet path 3), the Pip ID and Pip Address are those of the host. In the case where the host represented by the DNS record is an IP-only host (packet paths 2 and 4), the Pip Address is that of the Pip/IP translating gateway that is used to reach the IP host.

Thus, an IP-only domain must at least be able to return Pip information in its DNS records (or, the parent DNS domain must be able to do it on behalf of the child).

With paths 2 and 3 (IP-Pip-IP and IP-Pip), the initial translating gateway (IP to Pip) makes the DNS query. It stores the IP packet while waiting for the answer. The query is an inverse address (in-addr) query using the destination IP address. The translating gateway can cache the record for an arbitrarily long period, because if the mapping ever becomes invalid, a PCMP Invalid Address message flushes the cache entry.

In the case of path 4 (Pip-IP), however, the Pip Address of the translating gateway is returned directly to the source host, piggybacked on the DNS record that is normally returned. Thus this scheme incurs only a small incremental cost over the normal DNS query.

11.4. Translating between PCMP and ICMP

The only ICMP/PCMP messages that are translated are the Destination Unreachable, Echo, and PMTU Exceeded messages. The portion of the offending IP/Pip header that is attached to the ICMP/PCMP message is not translated.

11.5. Translating between IP and Pip Routing Information

It is not necessary to pass IP routing information into the Pip infrastructure. The mapping of IP address to Pip Address in DNS allows Pip to find the appropriate gateway to IP in the context of Pip addresses only.

It is impossible to pass Pip routing information into IP routers, since IP routers cannot understand Pip addresses.
IP domains must therefore use default routing to reach IP/Pip translators.

11.6. Old TCP and Application Binaries in Pip Hosts

A Pip host can be expected to have an old TCP above it for a long time to come, and a new (Pip-smart) TCP can be expected to have old application binaries running over it for a long time to come. Thus, we must have some way of ensuring that the TCP checksum is correctly calculated in the event that one or both ends is running Pip, and one or both ends has an old TCP binary. In addition, we must arrange to allow applications to interface with TCP using a 32-bit "address" only, even though those 32 bits get locally translated into Pip Addresses and IDs.

As stated above, in all cases where a Pip host is talking to an IP-only host, the Pip host has a 32-bit IP address. This address is embedded in the Pip ID such that it can be identified as an IP address from inspection of the Pip ID alone.

The TCP pseudo-header is calculated using the Payload Length and Protocol fields, and some or all of the Source and Dest Pip IDs. In the case where both Source and Dest Pip IDs are IP-based, only the 32-bit IP address is included in the pseudo-header checksum calculation. Otherwise, the full 64 bits are used. (Note that using the full Payload Length and Protocol fields does not fail when old TCP binaries are being used, because the values for those fields must be within the 16-bit and 8-bit limits for TCP to correctly operate.)

The reason for only using 32 bits of the Pip ID in the case of both ends using an IP address is that an old TCP will use only 32 bits of some number to form the pseudo-header. If the entire 64 bits of the Pip ID were used, then there would be cases where no 32-bit number could be used to ensure that the correct checksum is calculated in all cases.
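The pseudo-header rule above can be sketched as follows. This is a minimal illustration, not the encoding defined for Pip IDs: the position of the IP-address-identifier byte (assumed here to be bits 39..32), its value (the hypothetical constant IP_ID_MARKER), the placement of the IP address in the low-order 32 bits, and all function names are assumptions.

```python
IP_ID_MARKER = 0xFF  # hypothetical value of the IP-address-identifier byte


def is_ip_based(pip_id):
    """Assumed layout: bits 39..32 of the 64-bit Pip ID hold the marker."""
    return (pip_id >> 32) & 0xFF == IP_ID_MARKER


def ones_complement_sum16(words):
    """16-bit one's complement sum with end-around carry."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total


def id_words(pip_id, use_32):
    """Split the low 32 bits, or all 64 bits, into 16-bit words."""
    nbits = 32 if use_32 else 64
    return [(pip_id >> shift) & 0xFFFF for shift in range(nbits - 16, -1, -16)]


def pseudo_header_sum(src_id, dst_id, payload_len, protocol):
    """Use only the embedded IP addresses when both IDs are IP-based."""
    use_32 = is_ip_based(src_id) and is_ip_based(dst_id)
    words = id_words(src_id, use_32) + id_words(dst_id, use_32)
    words += [payload_len & 0xFFFF, protocol & 0xFF]
    return ones_complement_sum16(words)


# Two IP-based Pip IDs built from example IP addresses 192.0.2.1 and
# 198.51.100.7, with arbitrary high-order 24 bits.
src_id = (0x123456 << 40) | (IP_ID_MARKER << 32) | 0xC0000201
dst_id = (0x654321 << 40) | (IP_ID_MARKER << 32) | 0xC6336407
new_tcp_sum = pseudo_header_sum(src_id, dst_id, 20, 6)

# What an old TCP would sum over: the two 32-bit addresses, protocol, length.
old_tcp_sum = ones_complement_sum16([0xC000, 0x0201, 0xC633, 0x6407, 20, 6])
```

Under these assumptions the two sums agree, which is the property the 32-bit rule is meant to preserve between an old TCP and a Pip-smart TCP.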
Note that in the case of an old TCP on top of Pip, "Pip" (actually, a Pip daemon) must create a 32-bit number that can both be used to 1) allow the Pip layer to correctly associate a packet from the TCP layer with the right Pip header, and 2) cause the TCP layer to calculate the right checksum. (This number is created when the application initiates a DNS query. A Pip daemon intervenes in this request, calculates a 32-bit number that the application/TCP can use, and informs the Pip layer of the mapping between this 32-bit number and the full Pip header.)

When the destination host is an IP-only host, then this 32-bit number is nothing more than the IP address. When the destination host is a Pip host, then this 32-bit number is some number generated by Pip to "fool" the old TCP into generating the right checksum.

This 32-bit number can normally be the same as the lower 32 bits of the Pip ID. However, it is possible that two or more active TCP connections are established to different hosts whose Pip IDs have the same lower 32 bits. In this case, the Pip layer must generate a different 32-bit number for each connection, but in such a way that the sum of the two 16-bit components of each 32-bit number is the same as the sum of the two 16-bit components of the lower 32 bits of the Pip IDs.

In the case where an old Application wants to open a socket using an IP address handed to it (by the user or hard-coded), and not using a domain name, then it must be assumed that this IP address is valid within the IP-unique domain. To form a Pip ID out of this 32-bit number, the host appends the high-order 24 bits of its own Pip ID, plus the IP-address-identifier-byte value, to the 32-bit IP address.

12. Pip Address and ID Auto-configuration

One goal of Pip is to make networks as easy to administer as possible, especially with regard to hosts.
Certain aspects of the Pip architecture make administration easier. For instance, the ID field provides a network layer "anchor" around which address changes can be administered.

This section discusses three aspects of autoconfiguration: 1) domain-wide Pip Address prefix assignment, 2) host Pip Address assignment, and 3) host Pip ID assignment.

12.1. Pip Address Prefix Administration

A central premise behind the use of provider-rooted hierarchical addresses is that domain-wide address prefix assignment and re-assignment is straightforward. This section describes that process.

Pip Address prefix administration limits required manual prefix configuration to DNS and border routers. This is the minimum required manual configuration possible, because both border routers and DNS must be configured with prefix information for other reasons.

DNS must be configured with prefix information so that it can reply to address queries. DNS files are structured so that the prefix is administered in only one place (that is, every host record does not have to be changed to create a new prefix). Border routers must be configured with prefix information in order to advertise exit routes internally.

Note in particular that no internal (non-border) routers or hosts need ever be manually configured with any externally derived addressing information. All internal routers that are expected to fall under a common provider-prefix must, however, be configured with a "group ID" taken from the Pip ID space.

Each border router is configured with the following information.

1. The type of exit routing for the domain. This tells the border router whether or not it needs to advertise external routes internally.

2. The address prefix of the providers that the border is directly connected to.
This prefix information includes any metalevel boundaries above the subscriber/provider metalevel boundary (called simply the subscriber metalevel).

3. Other information about the provider (provider name, type, user access restriction classes).

4. A list of common-provider-prefix group IDs that should receive the auto-configuration information. (The default is that only systems that share a group ID with the border router will receive the information.)

This information is injected into the intra-domain routing algorithm. It is automatically spread to all routers indicated by the group ID list. This way, the default behavior is for the information to be automatically constrained to the border router's "area".

When a non-border router receives this information, it 1) records the route to the providers in its forwarding table, and 2) advertises the information to hosts in the router discovery protocol [9]. Thus hosts learn not only their complete address, but information on how to do exit routing.

12.2. Host Pip Address Assignment

Unless a host does not wish to use ID-tailed Pip Addresses (see section 4.1.2), host Pip Address assignment is trivial. (The near-term Pip Architecture doesn't specify a means for a host to obtain a non-ID-tailed Pip Address.) When a host attaches to a subnet, it learns the Pip Address of the attached routers. The host simply adopts these Pip Addresses as its own. The Pip Address gets a packet to the host's subnet, and the host's Pip ID is used to route across the subnet. When the routers advertise new addresses (for instance, because of a new provider), the host adopts the new addresses.

12.3. Host Pip ID Assignment

When a host boots, it either forms a globally unique Pip ID using the IEEE 802 Pip ID type (if the host has an IEEE 802 address), or forms a locally unique Pip ID using the Local Pip ID type [5].
(The Local Pip ID type ensures that the Pip ID is at least unique on the subnet.) This Pip ID is adequate to communicate locally, and may be adequate to communicate globally, but is probably not the final Pip ID that the host should have, because it doesn't contain any organizational hierarchy information, and therefore can't be used for auditing or inverse DNS lookups.

To obtain its final Pip ID, the host uses its derived Pip ID and discovered Pip Address to send a DNS query for its own Pip ID (in order to make this query, the host must know its own domain name). If DNS hasn't been configured with the new host's Pip ID, then the host must continue to use its derived Pip ID.

(Hopefully in the future it will be possible for the host to automatically update DNS with its Pip Address, and for some kind of "Pip ID server" to automatically assign Pip IDs to hosts. This is not a feature of the near-term Pip Architecture, however.)

13. Pip Control Message Protocol (PCMP)

The Pip analog to ICMP is PCMP [8]. The near-term Pip architecture defines the following PCMP messages:

1. Local Redirect

2. Destination Unreachable

3. Echo

4. Parameter Problem

5. Router Discovery

6. PMTU Exceeded

7. Provider Redirect

8. Invalid Address

9. Reformat Transit Part

10. Unknown Parameter

11. Host Mobility

12. Exit PDN Address

The first four PCMP messages (Local Redirect, Destination Unreachable, Echo, and Parameter Problem) operate almost identically to their ICMP counterparts.

The Router Discovery PCMP message operates as ICMP's, with the exception that a host derives its Pip Address from it.

The PMTU Exceeded message operates as ICMP's, with the exception that the Pip header size of the offending Packet is also given. This allows the source host transport to determine how much smaller the packet PMTU should be from the advertised subnet PMTU.
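As a rough illustration of that arithmetic (the exact message fields are defined in the PCMP specification [8], not here), the transport can subtract the reported Pip header size from the reported PMTU. The function name and the example sizes below are assumptions.

```python
def transport_payload_limit(subnet_pmtu, pip_header_size):
    """Bytes left for transport once the Pip header is accounted for.

    Both arguments are taken to be the values carried in a PMTU
    Exceeded message (an assumption about how the fields are used).
    """
    if pip_header_size >= subnet_pmtu:
        raise ValueError("Pip header does not fit within the PMTU")
    return subnet_pmtu - pip_header_size


# Example: a 1500-byte PMTU and a 64-byte Pip header (illustrative sizes).
limit = transport_payload_limit(1500, 64)
```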
Note that if an occasional option, such as the PDN Address option, needs to be attached to one of many packets, and this option makes the packet larger than the PMTU, then it is not necessary to modify the MTU coming from transport. Instead, that packet can be fragmented by the host's Pip forwarding engine. (Pip specifies fragmentation/reassembly for hosts but not for routers. The fragmentation information is in a Pip Option.)

The Provider Redirect, Invalid Address, Reformat Transit Part, Unknown Parameter, Host Mobility, and Exit PDN Address PCMP messages are new.

The Provider Redirect PCMP message is used to inform the source host of a preferable exit provider to use when provider-rooted, transit-driven exit routing is used (see section 8.1).

The Invalid Address PCMP message is used to inform the source host that none of the IDs of the destination host match that of the Pip packet. The purpose of this message is to flush incorrect ID/Pip Address bindings in hosts and Pip Header Servers (see section 6.2).

The Reformat Transit Part PCMP message has both near-term Pip architecture functions and evolution functions. Near-term, the Reformat Transit Part PCMP message is used to indicate to the source whether it has too few or too many layers of address in the Routing Directive (see section 8.2). Long-term, the Reformat Transit Part PCMP message is able to arbitrarily modify the transit part transmitted by the host, as encoded by a bit string.

The Unknown Parameter PCMP message is used to inform the source host that the router does not understand a parameter in either the Handling Directive, the Routing Context, or the Transit Options. The purpose of this message is to assist evolution (see section 16.1).

The Host Mobility PCMP message is sent by a host to inform another host (for instance, the host's Mobile Address Server) that it has a new address (see section 14).
The main use of this packet is for host mobility, though it can be used to manage any address changes, such as because of a new prefix assignment.

The Exit PDN Address PCMP message is used to manage the function whereby the source host informs the PDN entry router of the PDN Address of the exit PDN system (see section 15).

When a router needs to send a PCMP message, it sends it to the source Pip Address. If the Pip header is in a tunnel, then the PCMP message is sent to the router that is the source of the tunnel. Depending on the situation, this may result in another PCMP message from the source of the tunnel to the true source (for instance, if the source of the tunnel finds that the dest of the tunnel can't be reached, it may send a destination unreachable to the source host).

14. Host Mobility

Depending on how security conscious a host is, and what security mechanisms a host has available, mobility can come from Pip "for free". If a host is willing to accept a packet by just looking at source and destination Pip ID, and if the host simply records the source Pip Address on any packet it receives as the appropriate return address to the source Pip ID, then mobility comes automatically. That is, when a mobile host gets a new Pip Address, it simply puts that address into the next packet it sends. When the other host receives it, it records the new Pip Address, and starts sending return packets to that location.

The security aspect of this is that this type of operation leads to an easy way to spoof the (internet level) identity of a host. That is, absent any other security mechanisms, any host can write any Pip ID into a packet. (Cross-checking a source Pip ID against the source Pip Address at least makes spoofing of this sort as hard as with IP. This is discussed below.)
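The "for free" behavior and its spoofing weakness can both be seen in a minimal sketch. The class name, method names, and the representation of Pip IDs as integers and Pip Addresses as tuples are assumptions for illustration only.

```python
class ReturnAddressCache:
    """Remembers, per source Pip ID, the Pip Address last seen with it."""

    def __init__(self):
        self._address_by_id = {}

    def on_receive(self, src_id, src_address):
        # No authentication: whatever address arrives with the ID is recorded.
        self._address_by_id[src_id] = src_address

    def return_address(self, dest_id):
        """Address to use when replying to dest_id, or None if unknown."""
        return self._address_by_id.get(dest_id)


cache = ReturnAddressCache()
mobile_id = 0x0000123400005678            # hypothetical Pip ID

cache.on_receive(mobile_id, (1, 2, 8, 9))   # host at its first address
cache.on_receive(mobile_id, (4, 7, 3, 1))   # host moved; next packet updates
current = cache.return_address(mobile_id)

# The weakness: a third party writing the same Pip ID redirects replies.
cache.on_receive(mobile_id, (9, 9, 9, 9))
hijacked = cache.return_address(mobile_id)
```

The last two lines show why a host that wants stronger assurance cannot rely on this mechanism alone.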
The above simple host mobility mechanism does not work in the case where source and destination hosts obtain new Pip Addresses at the same time and the old Pip addresses no longer work, because neither is able to send its new address information directly to the other. Furthermore, if a host wishes to be more secure about authenticating the source Pip ID of a packet, then the above mechanism also is not satisfactory. In what follows, the complete host mobility mechanism is described.

Pip uses the Mobile Address Server and the PCMP Host Mobility message to manage host mobility. The Mobile Address Server is a non-mobile host (or router acting as a host) that keeps track of the active address of a mobile host. The Pip ID and Address of the Mobile Address Server are configured into the mobile host, and in DNS.

When a host X obtains information from DNS about a host Y, the Pip ID and Address of host Y's Mobile Address Server are among the information. (Also among the information is host Y's "permanent" address, if host Y has one. If host Y is so mobile that it doesn't have a permanent address, then no permanent address is returned by DNS. In particular, note that DNS is not intended to keep track of a mobile host's active address.)

Given the destination host's (Y) permanent ID and Address, and the Mobile Address Server's permanent IDs and Addresses, the source host (X) proceeds as follows. X tries to establish communications with Y using one of the permanent Addresses. If this fails (or if at any time X cannot contact Y), X sends a PCMP Mobile Host message to the Mobile Address Server requesting the active address for Y. (Note that X can determine that it cannot contact Y from receipt of a PCMP Destination Unreachable or a PCMP Invalid Address message.)

The Mobile Address Server responds to X with the active Pip Addresses of Y.
(Of course, Y must inform its Mobile Address Server(s) of its active Pip Addresses when it knows them. This also is done using the PCMP Mobile Host message. Y also informs any hosts that it is actively communicating with, using either a regular Pip packet or a PCMP Mobile Host message. Thus, usually X does not need to contact the Mobile Address Server to track Y's active address.)

If the address that X already tried is among those returned for Y, then the source host has the option of either 1) continuing to try the same Pip Address, 2) trying another of Y's Pip Addresses, 3) waiting and querying the Mobile Address Server again, or 4) giving up. If the Mobile Address Server indicates that Y has new active Pip Addresses, then X chooses among these in the same manner that it chooses among multiple permanent Pip Addresses, and tries to contact Y.

14.1. PCMP Mobile Host message

There are two types of PCMP Mobile Host messages, the query and the response. The query consists of the Pip ID of the host for which active Pip Address information is being requested. The response consists of a Pip ID, a sequence number, a set of Pip Addresses, and a signature field.

The set of Pip Addresses includes all currently usable addresses of the host indicated by the Pip ID. Thus, the PCMP Mobile Host message can be used both to indicate a newly obtained address, and to indicate that a previous address is no longer active (by that address's absence from the set).

The sequence number indicates which is the most recent information. It is needed to deal with the case where an older PCMP Mobile Host response is received after a newer one.

The signature field is a value that derives from encrypting the sequence number and the set of Pip Addresses. For now, the encryption algorithms used, how to obtain keys, and so on are for further study.

14.2. Spoofing Pip IDs

This section discusses host mechanisms for decreasing the probability of Pip ID spoofing. The mechanisms provided in this version of the near-term Pip architecture are no more secure than DNS itself. It is hoped that mechanisms and the corresponding infrastructure needed for better internetwork layer security can be installed with whatever new IP protocol is chosen.

After a host makes a DNS query, it knows:

1. The destination host's Pip ID,
2. The destination host's permanent Pip Addresses, and
3. The destination host's Mobile Address Server's Pip ID and Addresses.

Note that the DNS query can be a normal one (based on domain name) or an inverse query (based on Pip ID or Pip Address, though the latter is more likely to succeed, since the Pip ID may be flat and therefore not suitable for an inverse lookup). The inverse query is done when the host did not initiate the packet exchange, and therefore doesn't know the domain name of the remote (initiating) host.

If the destination host is not mobile, then the source host can check the source Pip Address of a received packet, compare it with those received from DNS, and reject the packet if it does not match. This gives spoof protection equal to that of IP (which, admittedly, is not that much).

If the destination host is mobile and obtains new Pip Addresses, then the source host can check the validity of the new Pip Address by sending a PCMP Mobile Host query to the Mobile Address Server learned from DNS. The set of Pip Addresses learned from the Mobile Address Server is then used for subsequent validation.

15. Public Data Network (PDN) Address Discovery

One of the problems with running Pip (or any internet protocol) over a PDN is that of the PDN entry Pip System discovering the PDN Address of the appropriate PDN exit Pip System.
This problem is solved using ARP in small, broadcast LANs because the broadcast mechanism is relatively cheap. This solution is not available in the PDN case, where the number of attached systems is very large, and where broadcast is not available (or is not cheap if it is).

For the case where the domain of the destination host is attached to a PDN, the problem is nicely solved by distributing the domain's exit PDN Address information in DNS, and then having the source host convey the exit PDN Address to the PDN entry router.

The DNS of the destination host's domain contains the PDN Addresses for the domain. When DNS returns a record for the destination host, the record associates zero or more PDN Addresses with each Pip Address. The top-level prefix of the Pip Address is that of the PDN that the PDN Addresses apply to.

(Note that, while the returned DNS record associates the PDN Addresses with a single Pip Address, in general the PDN Address will apply to a set of Pip Addresses, namely those for all hosts in the domain. The DNS files are structured to reflect this grouping in the same way that a single Pip Address prefix in DNS applies to many hosts. Therefore, individual host entries in the DNS files do not need to have separate PDN Addresses typed in with them. This simplifies configuration of DNS.)

When the source host sends the first packet to a given destination host, it attaches the PDN Address to the packet in a transit option. (Note that, because of the way that options are processed in Pip packets, no router other than the entry PDN router need look at the option.) When the entry router receives this packet, it determines that it is the entry router based on the result of the Routing Directive lookup. It retrieves the PDN Address from the transit option, and caches it locally. The cache entry can later be retrieved using either the destination Pip ID or the destination Pip Address as the cache index.

The entry router sends the source host a PCMP Exit PDN Address message indicating that it has cached the information. If there are multiple exit PDN Addresses, then the source host can at this time inform the entry PDN router of all the PDN Addresses. The entry PDN router can either choose from these to set up a connection, or cache them to recover from the case where the existing connection breaks.

Finally, the entry PDN router delivers the Pip packet (perhaps by setting up a connection) to the PDN Address indicated.

When a PDN entry router receives a Pip packet for which it doesn't know the exit PDN Address (and has no other means of determining it, such as shortcut routing), it sends a PCMP Exit PDN Address query message to the originating host. This can happen if, for instance, routing changes and directs the packets to a new PDN entry router. When the source host receives the PCMP Exit PDN Address query message, it transmits the PDN Addresses to the entry PDN router.

15.1. Notes on Carrying PDN Addresses in NSAPs

The Pip use of PDN Address carriage in the transit option or PCMP Exit PDN Address message solves two significant problems associated with the analogous use of PDN Address-based NSAPs.

First, there is no existing agreement (standards or otherwise) that the existence of a PDN Address in an NSAP address implies that the identified host is reachable behind the PDN Address. Thus, upon receiving such an NSAP, the entry PDN router does not know for sure, without explicit configuration information, whether or not the PDN Address can be used at the lower layer. Solution of this problem requires standards body agreement, perhaps by setting aside additional AFIs to mean "PDN Address with topological significance".

The second, and more serious, problem is that a PDN Address in an NSAP does not necessarily scale well.
This is best illustrated with the E.164 address. E.164 addresses can be used in many different network technologies: telephone network, BISDN, SMDS, Frame Relay, and other ATM. When a router receives a packet with an E.164-based NSAP, the E.164 address is in the most significant part of the NSAP address (that is, contains the highest level routing information). Thus, without a potentially significant amount of routing table information, the router does not know which network to send the packet to. Thus, unless E.164 addresses are assigned out in blocks according to provider network, this approach won't scale.

A related problem is that of how an entry PDN router knows whether the PDN Address is meant for the PDN it is attached to or for some other PDN. With Pip, there is a one-to-one relationship between Pip Address prefix and PDN, so it is always known. With NSAPs, it is not clear without the potentially large routing tables discussed in the previous paragraph.

16. Evolution with Pip

The fact that we call this architecture "near-term" implies that we expect it to evolve to other architectures. Thus it is important that we have a plan to evolve to these architectures. The Pip near-term architecture includes explicit mechanisms to support evolution.

The key to evolution is being able to evolve any system at any time without destroying old functionality. Depending on what the new functionality is, it may be immediately useful to any system that installs it, or it may not become useful until a significant number or even a majority of systems install it. Nonetheless, it is necessary to be able to install it piece-wise.

The Pip protocol itself supports evolution through the following mechanisms [3]:

1. Tunneling. This allows more up-to-date routers to tunnel through less up-to-date routers, thus allowing for incremental router evolution.
   (Of course, by virtue of encapsulation, tunneling is always an evolution option, and indeed tunneling through IP is used in the Pip transition. However, Pip's tunneling encoding is efficient, and doesn't duplicate header information.)

   The only use for Pip tunneling in the Pip near-term architecture is to route packets through the internal routers of a transit domain when the internal routers have no external routing information. It is assumed that enhancements to the Pip Architecture that require tunneling will have their own means of indicating when forming a tunnel is necessary.

2. Host independence from routing information. Since a host can receive packets without understanding the routing content of the packet, routers can evolve without necessarily requiring hosts to evolve at the same pace.

   In order to allow hosts to send Pip packets without understanding the contents of the routing information (in the Transit Part), the Pip Header Server is able to "spoon-feed" the host the Pip header.

   If the Pip Header Server determines that the host is able to form its own Pip header (as will usually be the case with the near-term Pip architecture), the Pip Header Server is essentially a null function. It accepts a query from the host, passes it on to DNS, and returns the DNS information to the host.

   If the Pip Header Server determines that the host is not able to form its own Pip header, then the Pip Header Server forms one on behalf of the host. In one mode of operation, the Pip Header Server gives the host the values of some or all Transit Part fields, and the host constructs the Transit Part. This allows for evolution within the framework of the current Transit Part.

   In another mode, the Pip Header Server gives the host the Transit Part as a simple bit field. This allows for evolution outside the framework of the current Transit Part.
   In addition to the Pip Header Server being able to spoon-feed the host a Transit Part, routers are also able to spoon-feed hosts a Transit Part, using the PCMP Reformat Transit Part message, in case the original Transit Part needs to be modified.

3. Separation of handling from routing. This allows one aspect to evolve independently of the other.

4. Flexible Handling Directive, Routing Context, and Options definition. This allows new handling, routing, and option types to be added and defunct ones to be removed over time (see section 16.1 below).

5. Fast and general options processing. Options processing in Pip is fast, both because not every router need look at every option, and because once a router decides it needs to look at an option, it can find it quickly (does not require a serial search). Thus the oft-heard argument that a new option can't be used because it will slow down processing in all routers goes away.

   Pip Options are essentially an extension of the Handling Directive (HD). The HD is used when the handling type is common, and can be encoded in a small space. The option is used otherwise. It is possible that a future option will influence routing, and thus the Option will be an extension of the RD as well. The RD, however, is rich enough that this is unlikely.

6. Generalized Routing Directive. Because the Routing Directive is so general, it is more likely that we can evolve routing and addressing semantics without having to redefine the Pip header or the forwarding machinery.

7. Host version number. This number tells what Pip functions a host has, such as which PCMP messages it can handle, so that an updated router can respond appropriately to a Pip packet received from a remote host. This supports the capability for routers to evolve ahead of hosts. (All Pip hosts will at least be able to handle all Pip near-term architecture functions.)
   The Host version number is also used by the Pip Header Server to determine the extent to which it needs to format a header on behalf of the host.

8. Generalized Route Types. The MLPV routing algorithm is generic with regard to the types of routes it can calculate. Thus, adding new route types is a matter of configuring routers to accept the new route type, defining metrics for the new route type, and defining criteria for selecting one route of the new type over another.

Note that none of these evolution features of Pip significantly slow down Pip header processing (as compared to other internet protocols).

16.1. Handling Directive (HD) and Routing Context (RC) Evolution

Because the HD and RC are central to handling and routing of a Pip packet, the evolution of these aspects deserves more discussion.

Both the HD and the RC fields contain multiple parameters. (In the case of the RC, the router treats the RC field as a single number, that is, ignores the fact that the RC is composed of multiple parameters.) These HD and RC parameters may be arranged in any fashion (can be any length, subject to the length of the HD and RC fields themselves, and can fall on arbitrary bit boundaries).

Associated with the HD and RC are "Contents" fields that indicate what parameters are in the HD and RC fields, and where they are. (The Contents fields are basically version numbers, except that a higher "version" number is not considered to supersede a lower one.) (Typical types of parameters are Address Family, TOS value, Queueing Priority, and so on.)

The Contents field is a single number, the value of which indicates the parameter set. The mapping of Contents field value to parameter set is configured manually. The first bit of the Contents field indicates whether the Contents value is globally unique or locally unique.
This allows for specialized local parameter types. If the Contents field value is global, then all Pip systems must agree on the mapping of Contents field value to parameter set. If the Contents field value is local, then all Pip systems within a domain (as identified by either Pip Address prefix or Pip ID prefix) must agree on the mapping. The near-term Pip Architecture does not define any local Contents field values.

The procedure for establishing new HD or RC parameter sets (or erasing old ones) is as follows. Some organization defines the new parameter set. This may involve defining a new parameter. If it does, then the new parameter is described as a Pip Object. A Pip Object is nothing more than a number space used to unambiguously identify a new parameter type, and a character string that describes it [10]. Thus, the new parameter set is described as a list of Pip Objects, and the bit locations in the HD/RC that each Pip Object occupies.

The organization that defines the parameter set submits it for an official Contents field value. (It would be submitted to the standards body that has authority over Pip, currently the IAB.) If the new parameter set is approved, it is given a value, and that value is published in a well known place (an RFC).

Of course, network administrators are free to install or not install the new parameter set in their hosts and routers. In the case of a new RC parameter set, installation of the new parameter set does not necessarily require any new software, because any Pip routing protocol, such as MLPV, is able to find routes according to the new parameter set by appropriate configuration of routers. In the case of a new HD parameter set, however, new software is necessary to execute the new handling.
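
The relationship between a Contents field value, a parameter set, and the bit locations of its parameters can be sketched as follows. Everything concrete here is an assumption made for illustration and is not taken from the Pip specifications: the 16-bit HD/RC field width, the 8-bit Contents field width, the convention that a set first bit means "locally unique", the Contents value 1, and the particular layout. Only the parameter names (Address Family, TOS value, Queueing Priority) come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parameter:
    """One parameter of a set: a described Pip Object and its bit location."""
    name: str    # descriptive string of the Pip Object
    offset: int  # bit offset from the most significant bit of the field
    width: int   # parameter length in bits


FIELD_BITS = 16    # assumed width of the HD or RC field
CONTENTS_BITS = 8  # assumed width of the Contents field

# Hypothetical, manually configured mapping of Contents value to
# parameter set. Parameters may sit on arbitrary bit boundaries.
PARAMETER_SETS: dict[int, tuple[Parameter, ...]] = {
    1: (
        Parameter("address-family", offset=0, width=4),
        Parameter("tos", offset=4, width=5),
        Parameter("queueing-priority", offset=9, width=3),
    ),
}


def is_local(contents: int) -> bool:
    """Assumed convention: first (most significant) bit set means local."""
    return bool(contents >> (CONTENTS_BITS - 1))


def decode(contents: int, field: int) -> dict[str, int]:
    """Split an HD or RC field value into its named parameters."""
    layout = PARAMETER_SETS[contents]  # KeyError means unknown parameter set
    values = {}
    for p in layout:
        shift = FIELD_BITS - p.offset - p.width
        values[p.name] = (field >> shift) & ((1 << p.width) - 1)
    return values
```

Under these assumptions, installing a new parameter set in a system amounts to adding one entry to the configured mapping, which matches the observation that a new RC parameter set may need configuration only.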
For new HD and RC parameter sets, systems that do not understand the new parameter set can still be configured to execute one of several default actions on the new parameter. These default actions allow for some control over how new functions are introduced into Pip systems. The default actions are:

1. Ignore the unknown parameter,
2. Set the unknown parameter to all 0's,
3. Set the unknown parameter to all 1's,
4. Silently discard the packet,
5. Discard the packet with PCMP Parameter Unknown.

Action 1 is used when it doesn't much matter if previous systems on a path have acted on the parameter or not. Actions 2 and 3 are used when systems should know whether a previous system has not understood the parameter. Actions 4 and 5 are used when something bad happens if not all systems understand the new parameter.

16.1.1. Options Evolution

The evolution of Options is very similar to that of the HD and RC. Associated with the Options is an Options Present field that indicates in a single word which of up to 8 options are present in the Options Part. There is a Contents field associated with the Options Present field that indicates which subset of all possible options the Options Present field refers to.

Contents field values are assigned in the same way as for the HD and RC Contents fields. The same 5 default actions used for the HD and RC also apply to the Options.

References

[1] Pip Overview (Internet Draft)
[2] Pip DNS Spec (Internet Draft)
[3] Pip Forwarding Spec (Internet Draft)
[4] Pip Address Assignment Spec (to be completed)
[5] Pip ID Assignment Spec (Internet Draft)
[6] Pip Assigned Numbers (to be completed)
[7] Pip Header Protocol (to be completed)
[8] Pip Control Message Protocol (PCMP) (to be completed)
[9] Pip Router Discovery Protocol (to be completed)
[10] Pip Objects Spec (Internet Draft)
[11] Multi-level Path Vector Routing Protocol (to be completed)