P2PSIP Working Group J. Peng Internet-Draft China Mobile Intended status: Standards Track K. Feng Expires: January 3, 2012 Beijing University of Posts and Telecommunications L. Le China Mobile July 2, 2011 One Hop Lookups Algorithm Plugin for RELOAD draft-peng-p2psip-one-hop-plugin-00 Abstract This document proposes an implementation of overlay plugin algorithm which is called one hop lookups to provide examples and references for the research of one hop based RELOAD. With the development of the real time communications, there are high demands for the improvement of routing efficiency. In the one hop algorithm, each peer maintains complete membership information which can guarantee one hop lookups to improve the routing efficiency. For the RELOAD, using the one hop lookups algorithm to construct Topology Plugin with the same RELOAD core can have a better support for VoIP applications, and the implementation of one hop lookups algorithm plugin is based on the methods provided by RELAOD. Status of this Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on January 3, 2012. Copyright Notice Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved. Peng, et al. Expires January 3, 2012 [Page 1] Internet-Draft One Hop Lookups Plugin July 2011 This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 3. Hash functions . . . . . . . . . . . . . . . . . . . . . . . . 6 4. Peer data structure . . . . . . . . . . . . . . . . . . . . . 7 5. Routing . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 6. Joining . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 7. Updates . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 7.1. Update messages definition . . . . . . . . . . . . . . . . 12 7.2. Primary event notification messages . . . . . . . . . . . 17 7.3. Actions after receiving update messages . . . . . . . . . 18 8. Stabilization . . . . . . . . . . . . . . . . . . . . . . . . 20 9. Leaving . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 10. Leader choosing strategies . . . . . . . . . . . . . . . . . . 22 11. Replication . . . . . . . . . . . . . . . . . . . . . . . . . 23 12. Fault tolerance . . . . . . . . . . . . . . . . . . . . . . . 24 12.1. First hop fail . . . . . . . . . . . . . . . . . . . . . . 24 12.2. Leaders fail . . . . . . . . . . . . . . . . . . . . . . . 24 13. Security Considerations . . . . . . . . . . . . . . . . . . . 26 14. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 27 15. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 28 16. References . . . . . . . . . . . . . . . . . . . . . . . . . . 29 16.1. Normative References . . . . . . . . . . . . . . . . . . . 29 16.2. Informative References . . . . . . . . . . . . . . . . . . 29 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 30 Peng, et al. Expires January 3, 2012 [Page 2] Internet-Draft One Hop Lookups Plugin July 2011 1. Introduction RELOAD [I-D.ietf-p2psip-base] is a peer-to-peer (P2P) signaling protocol for use on the Internet. The Architecture of RELOAD has a routing layer which is called the Topology Plugin. It is responsible for implementing the specific overlay algorithm being used. In the RELOAD base protocol, the Topology Plugin is described by Chord, and we define another DHT algorithm in order to provide a high rouging efficiency which can be used in real time communications for a better performance. The real time communication has a high demand on the routing efficiency, for example, the VoIP usage on the application level of RELOAD, it is no doubt that the routing speed or efficiency must be very important, and it depends on the looking up efficiency of the Overlay Topology. There are many kinds of structured peer-to-peer overlay network algorithms like Chord, Pastry, CAN and so on. Most of them have one in common is that they have a routing table which can maintain a small amount of routing state. But all these algorithms need nearly O (log N) steps to find one peer or resource, if there are a large number of peers in the overlay, it costs a lot time to locate peer or resource which may have a bad influence on the application level of RELOAD. The one hop lookups algorithm gives one way to maintain complete membership information at each node in order to increase the routing efficiency even though the system is large and membership is changing rapidly. In the one hop algorithm, each peer maintains a complete description of system memberships, and it presents techniques for the peers to maintain this information accurately with lower costs of communications. Different one hop algorithm implementations have different ways to keep the accuracy of routing states. For example, the one hop lookups for peer-to-peer overlay [One-Hop-Lookups] uses event notification mechanism without broadcast to update all the peers' routing table during several seconds; the SandStone [SandStone] gives a way to collect update information by SPM (Super Peer Maintenance) which is called SPM assisted one hop enhancement. The SandStone overlay is typically deployed as a two-layered DHT, including a global DHT and several regional DHT. There is at least one SPM node in each region, and once one peer's state change is detected by its neighboring peer, the neighbor will notify its local SPM, and then the SPM will disseminate the event to other SPMs, finally each SPM will broadcast this notification to all peers under the charge of it. We use the one hop lookups algorithm to construct the RELOAD Topology Plugin because there is no SPM like peers and two layered topology in RELOAD base. In this draft, we first give a simple overview of one hop lookups Peng, et al. Expires January 3, 2012 [Page 3] Internet-Draft One Hop Lookups Plugin July 2011 algorithm with some definitions, then fill the overlay specific data structures into the RELOAD frame and implements this algorithm with topology messages defined in the RELOAD Topology Plugin; at last, the replication and fault tolerance strategies are stated. All of the definitions and implementations based on the requirements and methods provided by RELOAD. Peng, et al. Expires January 3, 2012 [Page 4] Internet-Draft One Hop Lookups Plugin July 2011 2. Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119]. Peng, et al. Expires January 3, 2012 [Page 5] Internet-Draft One Hop Lookups Plugin July 2011 3. Hash functions In this one hop lookups algorithm topology plugin, the size of the node ID and resource ID is 128 bits. The hash of an ID can be computed by SHA-1 [RFC3174] and other hash functions. Peng, et al. Expires January 3, 2012 [Page 6] Internet-Draft One Hop Lookups Plugin July 2011 4. Peer data structure A peer, or a node, is responsible for a particular Resource-ID which is less than or equal to its Node-ID and is greater than the Node-ID of the previous peer in the neighborhood. In order to have a high routing accuracy and efficiency, each peer must keep and maintain a series of data structure. In the one hop lookups algorithm, we divide the 128-bit circular identifier space into k equal contiguous intervals which are called slices. Each slice has a slice leader, which is chosen as the node that is the successor of the mid-point of the slice space (other choosing methods are list in Section X). Similarly, each slice also can be divided into u equal-sized intervals which are called units; each unit has a unit leader, which is chosen as the successor of the mid-point of the unit space. And of course every peer in the overlay should know its unit header and slice header. The slice leader and unit leader are both ordinary peers who are chosen dynamically, and they also can be configured at first. The leader choosing strategies will be stated in section 10. In the SandStone, there is at least one SPM node in each region, so when one peer detects the state change of overlay in its neighborhood, it will notify its local SPM as soon as possible. The local SPM then disseminate this notification to other SPMs, at last, each SPM will broadcast this update to all peers under the charge of it. The SPM is a special peer which can collect update information and do some other management tasks. The SPM does not participate in the routing procedures and it is pre-configured. On the other hand, the SPM uses the broadcast to spread the notifications. Only the bootstrap peer in RELOAD has the broadcast function. Using the broadcast is a simple and fast way to inform information, but it costs a lot of bandwidths and also may cause the notifications redundancy. Using the event based notification mechanism defined in one hop lookups can avoid redundancy in the communications: one peer will get information only from its neighbor that is one step closer to its unit leader. This implies that within a unit, information is always flowing from the unit leader to the ends of the unit. But its disadvantage is obvious: the notifications are spread too slowly which may cause the delay of overlay stabilization. In this draft, we choose the one hop lookups algorithm to be an example. From above, we can conclude the information peer should maintain as follows. Peng, et al. Expires January 3, 2012 [Page 7] Internet-Draft One Hop Lookups Plugin July 2011 Routing table: The routing table of each node keeps all of nodes' address, and then the source node can find the destination by just one hop at most if the items of routing table are right. So it is very important to keep the correctness of each node's full routing table. In order to achieve the goal, the notification of membership change events must reach every node in the overlay within a specified amount of time. Predecessor and successor information: The information especially Node-ID and address about peer's predecessor and successor must be kept in order to maintain the overlay circle. A peer will communicate with its predecessor and successor with keep-alive message (defined in Update message) every several seconds. Of course, this structure can be added as mark-ups or identifiers in the routing table. Unit leader information: The information about the unit leader who is responsible for the unit the peer belongs to. A peer can communicate with its unit leader to receive the Update message. Of course, this structure can be added as mark-ups or identifiers in the routing table. Slice leader information: The information about the slice leader who is responsible for the slice the peer belongs to. A peer can send event notification message (defined in Update message) to its slice leader to inform events. Of course, this structure can be added as mark-ups or identifiers in the routing table. The four kinds of information are maintained in all of peers, but there are still some kinds of other data structures should be kept in some special peers like unit leader and slice leader. Unit boundary identifier: This identifier does not keep in all the peers, it is stored on the peer who is the boundary of the unit in order to make sure the Update message just being transferred within unit. Unit boundary peer list: The unit leader should know the information about the unit's boundary peer in order to know its control area. Slice leader list: Every slice leader should know the information about all the slice leaders in the overlay in order to dispatch the event notification messages in time. Peng, et al. Expires January 3, 2012 [Page 8] Internet-Draft One Hop Lookups Plugin July 2011 Each peer with these data structures can compose a well running overlay topology based on the one hop lookups algorithm. Peng, et al. Expires January 3, 2012 [Page 9] Internet-Draft One Hop Lookups Plugin July 2011 5. Routing The routing table of the peer has full information of all the nodes in the overlay. If the peer is not responsible for a resource whose ID is r, it will query the routing table in order to find the first peer whose id is greater than r, which means that the peer should be responsible for this resource; and then sends a request message to the destination node. If no such node is found, it finds the largest Node-ID in the interval between the peer and Resource-ID r. Peng, et al. Expires January 3, 2012 [Page 10] Internet-Draft One Hop Lookups Plugin July 2011 6. Joining The joining process for a joining peer (JP) with Node-ID n is as follows. (1) JP MUST connect to its chosen or preconfigured bootstrap node. (2) JP SHOULD send an Attach request to its admitting peer (AP) for Node-ID n. The "send_update" flag should be used to acquire the routing table and other information for AP. (3) JP MUST send a Join to AP, the AP sends the response to the Join. (4) AP MUST do a series of Store requests to JP to store the data that JP will be responsible for. (5) JP SHOULD send Attach request to its predecessor in order to let it knows the arriving of new peer; the predecessor should update its successor information at once. (6) The predecessor detects a change in its routing table for example it has a new successor, it MUST send an event notification message which is a kind of Update messages to its slice leader directly. Of course, the Update message can also be sent by the AD. (7) The slice leader MUST collect all notifications it receives from the peers in his slice and sends Update messages to other slice leaders every several seconds to inform these events. (8) Other slice leaders MUST dispatch the Update messages they received to all the unit leaders in their respective slices. (9) A unit leader SHOULD piggyback this information on its Update message to its successor and predecessor. (10) Other nodes propagate this information from their successors; they SHOULD send it to their successors and vice versa. (11) Peers at unit boundaries SHOULD NOT send information to their neighboring peers outside their unit. The Update messages depending on the RELOAD message formats will be defined in the next section. The peer can find its successor by sending a Find request with the Resource-ID equals its Node-ID. Peng, et al. Expires January 3, 2012 [Page 11] Internet-Draft One Hop Lookups Plugin July 2011 7. Updates 7.1. Update messages definition The update messages in one hop lookups algorithm is based on the event notification. So the definition of update message can be started with the type definition. enum { eventUpdate (0), dataStructureUpdate (1), (255) } UpdateType; The UpdateType gives an enumeration of Update message which includes eventUpdate and dataStrucruteUpdate. The eventUpdate message will take a list of event notifications and the dataStructureUpdate message will take a list of data structure information which may include the routing table items. This structure allows for unknown options types. enum { keepAlive (0), ordinaryPeerChange (1), unitLeaderChange(2), sliceLeaderChange (3), (255) } EventType; The EventType gives an enumeration of event kinds, it is composed of keepAlive, ordinaryPeerChange, unitLeaderChange and sliceLeaderChange. The keepAlive event is like the Ping message in Chord. The ordinaryPeerChange, unitLeaderChange and sliceLeaderChange represent the state change of different peers in the overlay. This structure allows for unknown options types. enum { peerJoin (0), peerLeave (1), peerChange (2), (255) } ChangeType; The ChangeType gives an enumeration of peer state change type which includes the PeerJoin, PeerLeave and peerChange. The meaning of join and leave are easy to understand, and the peerChange means the unit leader change event or slice leader change event, it also can be constructed by peer leave and peer join. This structure allows for unknown options types. enum { ordinaryPeer (0), unitLeaderPeer (1), sliceLeaderPeer (2), (255) } PeerType; The PeerType gives an enumeration of peer type which is used to cooperate with the ChangeType. This structure allows for unknown options types. Peng, et al. Expires January 3, 2012 [Page 12] Internet-Draft One Hop Lookups Plugin July 2011 struct { Node-ID peer_id IpAddressPort peer_addr_port } PeerContactItem; The structure PeerContactItem is used to represent a basic item in the routing table or other data structures that need peers' information. It is composed by a Node-ID and IpAddressPort. The IpAddressPort structure is defined in the RELOAD base which includes one peer's IP address and listening port. If one peer get the information of another peer's PeerContactItem, then it can contact this peer with the IP address. Peng, et al. Expires January 3, 2012 [Page 13] Internet-Draft One Hop Lookups Plugin July 2011 struct { uint32 uptime; EventType event_type; PeerContactItem detect_peer; uint32 length; select (event_type) { case keepAlive: ; case ordinaryPeerChange: ChangeType ordinary_change_type; select (ordinary_change_type) { case peerJoin: PeerContactItem joining_peer; case peerLeave: PeerContactItem leaving_peer; } case unitLeaderChange: ChangeType unit_leader_change_type; select (unit_leader_change_type) { case peerJoin: PeerContactItem new_unit_leader; PeerContactItem old_unit_leader; case peerLeave: PeerContactItem leaving_unit_leade; } case sliceLeaderChange: ChangeType slice_leader_change_type; select (slice_leader_change_type) { case peerJoin: PeerContactItem new_slice_leader; PeerContactItem old_slice_leader; case peerLeave: PeerContactItem leaving_slice_leader; } } } EventNotification; The structure EventNotification is very important because it is the base of event notification. The length of the structure changes in different event type. This structure has enough information to describe some important events. The contact information of detect Peng, et al. Expires January 3, 2012 [Page 14] Internet-Draft One Hop Lookups Plugin July 2011 peer (detect_peer) is useless and can be ignored. In the event type, the three different event types can still be divided by the ChangeType. The notification must have enough information for all the peers in the overlay to update their data structures like routing table. struct { PeerType peer_type; PeerContactItem predecessor_peer; PeerContactItem successor_peer; PeerContactItem routing_table_list <0...n>; uint32 length; select (peer_type) { case ordinaryPeer: PeerContactItem unit_leader; PeerContactItem slice_leader; case unitLeaderPeer: PeerContactItem boundary_peers <2>; case sliceLeaderPeer: PeerContactItem slice_leaders_list <0...n>; } } DataStructureContent The structure DataStructureContent mainly used in the transferring of data structure during the peer joining procedure. The length of the structure changes in different peer type because different peer has different data structure. The ordinary peer only has predecessor, successor, unit leader, slice leader and routing table information. And the unit leader and slice leader should also know some other information. Peng, et al. Expires January 3, 2012 [Page 15] Internet-Draft One Hop Lookups Plugin July 2011 typedef EventNotification<0...n> EventNotificationList; typedef DataStructureContent<0...n> DataStructureContentList; struct { UpdateType update_type; uint32 length; select (update_type) { case eventUpdate: EventNotificationList event_notification_list; case dataStructureUpdate: DataStructureContentList data_structure_list; } } OneHopUpdateReq; This structure is the Update message used in the one hop lookups. The Update message is composed by two kinds of list. Using list can take more information. One of them is the EventNotificationList which includes a list of EventNotification structures, and the other is the DataStructureContentList which includes a list of DataStructureContent structures. All the peers in the overlay can use this Update request to inform event notification or transfer routing information. struct { uint32 update_response; } OneHopUpdateAns; The structure OneHopUpdateAns is used to be a response to the OneHopUpdateReq message. It is only has a response number to represent the receiver's attitude which may include success, fail and error. This number also can be defined as a type. The PeerContactItem structure contains parameters as follows. peer_id The peer's Node-ID. peer_addr_port The contact information of peer's especially the IP address. The EventNotification structure contains parameters as follows. Peng, et al. Expires January 3, 2012 [Page 16] Internet-Draft One Hop Lookups Plugin July 2011 uptime The time this peer has been up in seconds. event_type The type of event the message will take, this structure allows for unknown event types. length The length of the remainder of this message. ordinary_change_type, unit_leader_change_type, slice_leader_change_type The type of peer changing event, this structure allows for unknown peer changing types The DataStructureContent structure contains parameters as follows. peer_type The type of peer who wants to get these information, this structure allows for unknown peer types. routing_table_list The peer's routing table which has all of the peers' information in the overlay. boundary_peers The information of peers' who are the boundary of unit. slice_leaders_list All of the slice leaders' information will be stored in this list, and it can be transferred from one slice leader to another or from the old slice leader to the new one. The OneHopUpdateAns message contains parameters as follows. update_response The response of the Update message, different number has different meaning includes success, error and so on. 7.2. Primary event notification messages There are four main messages for the event notification among the one-hop lookups algorithm, the keep-alive messages, the event notification to slice leader messages, the messages exchanged between slice leaders and the messages from slice leaders to unit leaders. The keep-alive message is like the Ping defined in the Link Management of Reload, and the messages are used to form the base Peng, et al. Expires January 3, 2012 [Page 17] Internet-Draft One Hop Lookups Plugin July 2011 level communication between a peer and its predecessor and successor. The event notification message to slice leaders is used to send event notification to the peer's slice leader directly which can use the template of Update message. The messages exchanged between slice leaders are the most important. Each message sent from one slice leader to another batches together events that occurred during several seconds. We can still use the Update message to play this role. The messages from slice leaders to unit leaders can dispatch event notification. Messages received by a slice leader are batched for one second and the forwarded to unit leaders. As above, this message still can use Update message to express. Above all, these important event notification messages can be expressed by Update messages. In some scenarios like the peer joining, the Update message should be extended to support the transfer of the routing table information. 7.3. Actions after receiving update messages When peers in the overlay receive update messages which based on the event mechanism, the peer will check the type of event first. (1) If the message received is a keep-alive message, then the peer can make sure the relationship with the sender and return an answer as soon as possible. (2) If the message contains an ordinary peer join event, the actions taken should depend on the peer type of receiver, the slice leader will send node join notification to other slice leaders and its unit leader; the unit leader will send this notification to its peers; the peers will use the Update message to inform their processor or successor. (3) If the message contains an ordinary peer leave event, the actions taken should depend on the peer type of receiver, the slice leader will send node leave notification to other slice leaders and its unit leader; the unit leader will send this notification to its peers; the peers will use the Update message to inform their processor or successor. (4) If the message's event type is unit leader change, the actions taken should depend on the peer type of receiver, the slice leader will inform the sender to become the new unit leader, and then the new unit leader will spread this information to its unit members via the Update message. And the unit leader change message may be constructed by unit leader join and unit leader leave messages, but the procedure is same. Peng, et al. Expires January 3, 2012 [Page 18] Internet-Draft One Hop Lookups Plugin July 2011 (5) If the message's event type is slice leader change, the other slice leaders who receive this will record the event caused peer's information and return the acknowledge information, the new slice leader will inform its unit leaders to tell them the change of slice leader, and then the unit leaders will spread this information to its unit members via the Update message. Peng, et al. Expires January 3, 2012 [Page 19] Internet-Draft One Hop Lookups Plugin July 2011 8. Stabilization The stabilization of one hop lookups pay much more attention on the refreshing of routing table because if one peer has a right routing table, then it can contact any peers in the overlay directly. The accuracy of routing table depends on the speed of Update message spreading. So the notification message frequency is a very important case which can have effects on the stabilization. The event notification to slice leader, the messages exchanged between slice leaders and the messages from slice leaders to unit leaders have different sending frequency, and all of them are not fixed numbers. The frequency is calculated by the acceptable fraction of queries that fail in the first routing attempt, the peers in the overlay and the expected rate of membership changes. The details of frequency and time calculating are described in [One-Hop-Lookups]. Sometimes we can use the broadcast to inform notification in order to save time. On the other hand, keeping the accuracy of neighborhood peers' information is also very important. For example, in the one hop lookups, the keep-alive messages are sent every second, and the keep- alive messages construct the base level communications between a peer and its predecessor and successor. Peng, et al. Expires January 3, 2012 [Page 20] Internet-Draft One Hop Lookups Plugin July 2011 9. Leaving The leaving action is also a kind of event, so it can be expressed by the Update message. When a peer leaves the overlay peacefully, it needs to send an Update message as an event notification message, so the Leaving message may not be defined and just using the Update message. And the data transfer can also use the Update messages, but it sometimes depends on the replication strategies (Section 11). If a peer leaves the overlay without any notification, the neighbor peer (predecessor or successor) must inform this event to the leader by the Update messages, and then the backup peer of the leaving one should transfer the backup data to the new one who is charging the leaving peer's namespace. When an unit leader or slice leader leaves this overlay, the overlay should choose a new one to replace the old by the leader choosing strategies described in section 10. Peng, et al. Expires January 3, 2012 [Page 21] Internet-Draft One Hop Lookups Plugin July 2011 10. Leader choosing strategies The default slice and unit leader strategy is dynamic choosing the successor of the mid-point peer of the slice or unit space. It is convenient to manage and choose, but it is so simple that does not consider some other important cases like handling ability and so on. On the other hand, the slice leader can be treated as a "Super Node", and we can give some other super node choosing strategies as follows. On the other hand, a SandStone like schema can be used to choose leaders. In some scenarios, we can identify well connected and well provisioned peers as "Super Node" which may have the same meaning with SPM. The chosen super nodes can form a parallel ring. And of course the super slice will act as a slice leader; it does not participate in the routing procedure because its focus of working is on the collecting of event notifications and spreading them in time. The unit leader can be chosen by the default schema because the working burden of unit leader is much more relaxed than the slice leader, and if the chosen one has a poorly provisioned node with a low bandwidth connection to the Internet, the predecessor or successor of the chosen one can replace it. Peng, et al. Expires January 3, 2012 [Page 22] Internet-Draft One Hop Lookups Plugin July 2011 11. Replication The replica placement policy of the one hop algorithm can use the way defined in SandStone. It is designed to maximize data reliability and availability, and to maximize network bandwidth utilization. In the SandStone, the subscriber data can be replicated on multiple peers, three on default. We can use a SandStone like replica replacement policy with three replications. The first replica is stored in the primary peer's successor; the second is saved in a peer of different unit but in same slice; and the third is put in a peer from different slice. Above all, one of the most important issues is the choosing of the replica peer. Maybe we can use a special function whose input is peer's Node-ID to calculate the three replica peers, and of course the root peer should know about the replication method in order to be able to search the data. Peng, et al. Expires January 3, 2012 [Page 23] Internet-Draft One Hop Lookups Plugin July 2011 12. Fault tolerance 12.1. First hop fail In the one hop algorithm, if a query fails on its first attempt it does not return an error to an application. Instead, queries can be rerouted with the RELOAD message RouteQueryAns. We can define the RouteQueryAns as follows. struct { uint32 response_state; PeerContactItem redirect_peer; } RouteQueryAns; The structure RouteQueryAns is composed by a state number and a Node-ID. The state number tells the query state and the PeerContactItem gives a peer's information that is closer to the destination. And in most of cases, two hops are enough to locate one peer or resource. If the two hops are still wrong, then we can know the lookup procedure is fail. response_state The response mark of the RouteQueryReq message's; and it may include success, failure and errors. redirect_peer If the query is failed, the application can send RouteQueryReq message to this peer again for finding resources. If a lookup query from peer A to peer B because the routing table is outdate or the peer B has left, peer A can retry the query by sending the RouteQueryReq to the peer C who is the successor of peer B. And at that time joining a peer D to be peer B's new successor and peer C's predecessor, then peer C can reply RouteQueryAns message with the redirect_peer of peer D's, and peer A can contact it in this routing step. 12.2. Leaders fail The most important issues are how to keep the correct functioning of unit leaders and slice leaders; we need to recover from their failure. When a slice or unit leader fails, its successor soon detects the failure and becomes the new leader; the successor of a failed slice leader will communicate with its unit leaders and other slice leaders to recover information about the missed events. And of course, the leaser choosing strategies described in section 10 also can be used to choose a new leader. All of these events can be constructed by the Update message with the EventType (e.g. the Peng, et al. Expires January 3, 2012 [Page 24] Internet-Draft One Hop Lookups Plugin July 2011 UnitLeaderChange and SliceLeaderChange). Peng, et al. Expires January 3, 2012 [Page 25] Internet-Draft One Hop Lookups Plugin July 2011 13. Security Considerations Peng, et al. Expires January 3, 2012 [Page 26] Internet-Draft One Hop Lookups Plugin July 2011 14. IANA Considerations There are no IANA considerations associated to this memo. Peng, et al. Expires January 3, 2012 [Page 27] Internet-Draft One Hop Lookups Plugin July 2011 15. Acknowledgements Peng, et al. Expires January 3, 2012 [Page 28] Internet-Draft One Hop Lookups Plugin July 2011 16. References 16.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [I-D.ietf-p2psip-base] Jennings, C., Lowekamp, B., Rescorla, E., Baset, S., and H. Schulzrinne, "REsource LOcation And Discovery (RELOAD) Base Protocol", draft-ietf-p2psip-base-15 (work in progress), May 2011. 16.2. Informative References [RFC3174] Eastlake, D. and P. Jones, "US Secure Hash Algorithm 1 (SHA1)", RFC 3174, September 2001. [One-Hop-Lookups] Gupta, A., Liskov, B., and R. Rodrigues, "One Hop Lookups for Peer-to-Peer Overlays", June 2003. [SandStone] Shi, G., Chen, J., Gong, H., Fan, L., Xue, H., Lu, Q., and L. Liang, "SandStone: A DHT based Carrier Grade Distributed Storage System", September 2009. Peng, et al. Expires January 3, 2012 [Page 29] Internet-Draft One Hop Lookups Plugin July 2011 Authors' Addresses Jin Peng China Mobile Unit 2, 28 Xuanwumenxi Ave, Xuanwu District Beijing 100053 P.R.China Email: Penjin@chinamobile.com Kai Feng Beijing University of Posts and Telecommunications 10 Xi Tu Cheng Rd. Haidian District Beijing 100876 P.R.China Email: fengkai_sunny@139.com Lifeng Le China Mobile Unit 2, 28 Xuanwumenxi Ave, Xuanwu District Beijing 100053 P.R.China Email: Lelifeng@chinamobile.com Peng, et al. Expires January 3, 2012 [Page 30]