P2PSIP Working Group                                             J. Peng
Internet-Draft                                              China Mobile
Intended status: Standards Track                                 K. Feng
Expires: January 3, 2012                 Beijing University of Posts and
                                                      Telecommunications
                                                                   L. Le
                                                            China Mobile
                                                            July 2, 2011


              One Hop Lookups Algorithm Plugin for RELOAD
                  draft-peng-p2psip-one-hop-plugin-00

Abstract

   This document proposes an implementation of overlay plugin algorithm
   which is called one hop lookups to provide examples and references
   for the research of one hop based RELOAD.  With the development of
   the real time communications, there are high demands for the
   improvement of routing efficiency.  In the one hop algorithm, each
   peer maintains complete membership information which can guarantee
   one hop lookups to improve the routing efficiency.  For the RELOAD,
   using the one hop lookups algorithm to construct Topology Plugin with
   the same RELOAD core can have a better support for VoIP applications,
   and the implementation of one hop lookups algorithm plugin is based
   on the methods provided by RELAOD.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 3, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.



Peng, et al.             Expires January 3, 2012                [Page 1]

Internet-Draft           One Hop Lookups Plugin                July 2011


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  5
   3.  Hash functions . . . . . . . . . . . . . . . . . . . . . . . .  6
   4.  Peer data structure  . . . . . . . . . . . . . . . . . . . . .  7
   5.  Routing  . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
   6.  Joining  . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
   7.  Updates  . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
     7.1.  Update messages definition . . . . . . . . . . . . . . . . 12
     7.2.  Primary event notification messages  . . . . . . . . . . . 17
     7.3.  Actions after receiving update messages  . . . . . . . . . 18
   8.  Stabilization  . . . . . . . . . . . . . . . . . . . . . . . . 20
   9.  Leaving  . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
   10. Leader choosing strategies . . . . . . . . . . . . . . . . . . 22
   11. Replication  . . . . . . . . . . . . . . . . . . . . . . . . . 23
   12. Fault tolerance  . . . . . . . . . . . . . . . . . . . . . . . 24
     12.1. First hop fail . . . . . . . . . . . . . . . . . . . . . . 24
     12.2. Leaders fail . . . . . . . . . . . . . . . . . . . . . . . 24
   13. Security Considerations  . . . . . . . . . . . . . . . . . . . 26
   14. IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 27
   15. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 28
   16. References . . . . . . . . . . . . . . . . . . . . . . . . . . 29
     16.1. Normative References . . . . . . . . . . . . . . . . . . . 29
     16.2. Informative References . . . . . . . . . . . . . . . . . . 29
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 30














Peng, et al.             Expires January 3, 2012                [Page 2]

Internet-Draft           One Hop Lookups Plugin                July 2011


1.  Introduction

   RELOAD [I-D.ietf-p2psip-base] is a peer-to-peer (P2P) signaling
   protocol for use on the Internet.  The Architecture of RELOAD has a
   routing layer which is called the Topology Plugin.  It is responsible
   for implementing the specific overlay algorithm being used.  In the
   RELOAD base protocol, the Topology Plugin is described by Chord, and
   we define another DHT algorithm in order to provide a high rouging
   efficiency which can be used in real time communications for a better
   performance.

   The real time communication has a high demand on the routing
   efficiency, for example, the VoIP usage on the application level of
   RELOAD, it is no doubt that the routing speed or efficiency must be
   very important, and it depends on the looking up efficiency of the
   Overlay Topology.  There are many kinds of structured peer-to-peer
   overlay network algorithms like Chord, Pastry, CAN and so on.  Most
   of them have one in common is that they have a routing table which
   can maintain a small amount of routing state.  But all these
   algorithms need nearly O (log N) steps to find one peer or resource,
   if there are a large number of peers in the overlay, it costs a lot
   time to locate peer or resource which may have a bad influence on the
   application level of RELOAD.  The one hop lookups algorithm gives one
   way to maintain complete membership information at each node in order
   to increase the routing efficiency even though the system is large
   and membership is changing rapidly.

   In the one hop algorithm, each peer maintains a complete description
   of system memberships, and it presents techniques for the peers to
   maintain this information accurately with lower costs of
   communications.  Different one hop algorithm implementations have
   different ways to keep the accuracy of routing states.  For example,
   the one hop lookups for peer-to-peer overlay [One-Hop-Lookups] uses
   event notification mechanism without broadcast to update all the
   peers' routing table during several seconds; the SandStone
   [SandStone] gives a way to collect update information by SPM (Super
   Peer Maintenance) which is called SPM assisted one hop enhancement.
   The SandStone overlay is typically deployed as a two-layered DHT,
   including a global DHT and several regional DHT.  There is at least
   one SPM node in each region, and once one peer's state change is
   detected by its neighboring peer, the neighbor will notify its local
   SPM, and then the SPM will disseminate the event to other SPMs,
   finally each SPM will broadcast this notification to all peers under
   the charge of it.  We use the one hop lookups algorithm to construct
   the RELOAD Topology Plugin because there is no SPM like peers and two
   layered topology in RELOAD base.

   In this draft, we first give a simple overview of one hop lookups



Peng, et al.             Expires January 3, 2012                [Page 3]

Internet-Draft           One Hop Lookups Plugin                July 2011


   algorithm with some definitions, then fill the overlay specific data
   structures into the RELOAD frame and implements this algorithm with
   topology messages defined in the RELOAD Topology Plugin; at last, the
   replication and fault tolerance strategies are stated.  All of the
   definitions and implementations based on the requirements and methods
   provided by RELOAD.













































Peng, et al.             Expires January 3, 2012                [Page 4]

Internet-Draft           One Hop Lookups Plugin                July 2011


2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].














































Peng, et al.             Expires January 3, 2012                [Page 5]

Internet-Draft           One Hop Lookups Plugin                July 2011


3.  Hash functions

   In this one hop lookups algorithm topology plugin, the size of the
   node ID and resource ID is 128 bits.  The hash of an ID can be
   computed by SHA-1 [RFC3174] and other hash functions.














































Peng, et al.             Expires January 3, 2012                [Page 6]

Internet-Draft           One Hop Lookups Plugin                July 2011


4.  Peer data structure

   A peer, or a node, is responsible for a particular Resource-ID which
   is less than or equal to its Node-ID and is greater than the Node-ID
   of the previous peer in the neighborhood.  In order to have a high
   routing accuracy and efficiency, each peer must keep and maintain a
   series of data structure.

   In the one hop lookups algorithm, we divide the 128-bit circular
   identifier space into k equal contiguous intervals which are called
   slices.  Each slice has a slice leader, which is chosen as the node
   that is the successor of the mid-point of the slice space (other
   choosing methods are list in Section X).  Similarly, each slice also
   can be divided into u equal-sized intervals which are called units;
   each unit has a unit leader, which is chosen as the successor of the
   mid-point of the unit space.  And of course every peer in the overlay
   should know its unit header and slice header.  The slice leader and
   unit leader are both ordinary peers who are chosen dynamically, and
   they also can be configured at first.  The leader choosing strategies
   will be stated in section 10.

   In the SandStone, there is at least one SPM node in each region, so
   when one peer detects the state change of overlay in its
   neighborhood, it will notify its local SPM as soon as possible.  The
   local SPM then disseminate this notification to other SPMs, at last,
   each SPM will broadcast this update to all peers under the charge of
   it.  The SPM is a special peer which can collect update information
   and do some other management tasks.  The SPM does not participate in
   the routing procedures and it is pre-configured.

   On the other hand, the SPM uses the broadcast to spread the
   notifications.  Only the bootstrap peer in RELOAD has the broadcast
   function.  Using the broadcast is a simple and fast way to inform
   information, but it costs a lot of bandwidths and also may cause the
   notifications redundancy.  Using the event based notification
   mechanism defined in one hop lookups can avoid redundancy in the
   communications: one peer will get information only from its neighbor
   that is one step closer to its unit leader.  This implies that within
   a unit, information is always flowing from the unit leader to the
   ends of the unit.  But its disadvantage is obvious: the notifications
   are spread too slowly which may cause the delay of overlay
   stabilization.  In this draft, we choose the one hop lookups
   algorithm to be an example.

   From above, we can conclude the information peer should maintain as
   follows.





Peng, et al.             Expires January 3, 2012                [Page 7]

Internet-Draft           One Hop Lookups Plugin                July 2011


   Routing table:
       The routing table of each node keeps all of nodes' address, and
       then the source node can find the destination by just one hop at
       most if the items of routing table are right.  So it is very
       important to keep the correctness of each node's full routing
       table.  In order to achieve the goal, the notification of
       membership change events must reach every node in the overlay
       within a specified amount of time.

   Predecessor and successor information:
       The information especially Node-ID and address about peer's
       predecessor and successor must be kept in order to maintain the
       overlay circle.  A peer will communicate with its predecessor and
       successor with keep-alive message (defined in Update message)
       every several seconds.  Of course, this structure can be added as
       mark-ups or identifiers in the routing table.

   Unit leader information:
       The information about the unit leader who is responsible for the
       unit the peer belongs to.  A peer can communicate with its unit
       leader to receive the Update message.  Of course, this structure
       can be added as mark-ups or identifiers in the routing table.

   Slice leader information:
       The information about the slice leader who is responsible for the
       slice the peer belongs to.  A peer can send event notification
       message (defined in Update message) to its slice leader to inform
       events.  Of course, this structure can be added as mark-ups or
       identifiers in the routing table.

   The four kinds of information are maintained in all of peers, but
   there are still some kinds of other data structures should be kept in
   some special peers like unit leader and slice leader.

   Unit boundary identifier:
       This identifier does not keep in all the peers, it is stored on
       the peer who is the boundary of the unit in order to make sure
       the Update message just being transferred within unit.

   Unit boundary peer list:
       The unit leader should know the information about the unit's
       boundary peer in order to know its control area.

   Slice leader list:
       Every slice leader should know the information about all the
       slice leaders in the overlay in order to dispatch the event
       notification messages in time.




Peng, et al.             Expires January 3, 2012                [Page 8]

Internet-Draft           One Hop Lookups Plugin                July 2011


   Each peer with these data structures can compose a well running
   overlay topology based on the one hop lookups algorithm.

















































Peng, et al.             Expires January 3, 2012                [Page 9]

Internet-Draft           One Hop Lookups Plugin                July 2011


5.  Routing

   The routing table of the peer has full information of all the nodes
   in the overlay.  If the peer is not responsible for a resource whose
   ID is r, it will query the routing table in order to find the first
   peer whose id is greater than r, which means that the peer should be
   responsible for this resource; and then sends a request message to
   the destination node.  If no such node is found, it finds the largest
   Node-ID in the interval between the peer and Resource-ID r.










































Peng, et al.             Expires January 3, 2012               [Page 10]

Internet-Draft           One Hop Lookups Plugin                July 2011


6.  Joining

   The joining process for a joining peer (JP) with Node-ID n is as
   follows.

   (1)   JP MUST connect to its chosen or preconfigured bootstrap node.

   (2)   JP SHOULD send an Attach request to its admitting peer (AP) for
         Node-ID n.  The "send_update" flag should be used to acquire
         the routing table and other information for AP.

   (3)   JP MUST send a Join to AP, the AP sends the response to the
         Join.

   (4)   AP MUST do a series of Store requests to JP to store the data
         that JP will be responsible for.

   (5)   JP SHOULD send Attach request to its predecessor in order to
         let it knows the arriving of new peer; the predecessor should
         update its successor information at once.

   (6)   The predecessor detects a change in its routing table for
         example it has a new successor, it MUST send an event
         notification message which is a kind of Update messages to its
         slice leader directly.  Of course, the Update message can also
         be sent by the AD.

   (7)   The slice leader MUST collect all notifications it receives
         from the peers in his slice and sends Update messages to other
         slice leaders every several seconds to inform these events.

   (8)   Other slice leaders MUST dispatch the Update messages they
         received to all the unit leaders in their respective slices.

   (9)   A unit leader SHOULD piggyback this information on its Update
         message to its successor and predecessor.

   (10)  Other nodes propagate this information from their successors;
         they SHOULD send it to their successors and vice versa.

   (11)  Peers at unit boundaries SHOULD NOT send information to their
         neighboring peers outside their unit.

   The Update messages depending on the RELOAD message formats will be
   defined in the next section.  The peer can find its successor by
   sending a Find request with the Resource-ID equals its Node-ID.





Peng, et al.             Expires January 3, 2012               [Page 11]

Internet-Draft           One Hop Lookups Plugin                July 2011


7.  Updates

7.1.  Update messages definition

   The update messages in one hop lookups algorithm is based on the
   event notification.  So the definition of update message can be
   started with the type definition.

   enum { eventUpdate (0), dataStructureUpdate (1), (255) }
     UpdateType;


   The UpdateType gives an enumeration of Update message which includes
   eventUpdate and dataStrucruteUpdate.  The eventUpdate message will
   take a list of event notifications and the dataStructureUpdate
   message will take a list of data structure information which may
   include the routing table items.  This structure allows for unknown
   options types.

   enum { keepAlive (0), ordinaryPeerChange (1),
     unitLeaderChange(2), sliceLeaderChange (3), (255) } EventType;


   The EventType gives an enumeration of event kinds, it is composed of
   keepAlive, ordinaryPeerChange, unitLeaderChange and
   sliceLeaderChange.  The keepAlive event is like the Ping message in
   Chord.  The ordinaryPeerChange, unitLeaderChange and
   sliceLeaderChange represent the state change of different peers in
   the overlay.  This structure allows for unknown options types.

   enum { peerJoin (0), peerLeave (1), peerChange (2), (255) }
     ChangeType;


   The ChangeType gives an enumeration of peer state change type which
   includes the PeerJoin, PeerLeave and peerChange.  The meaning of join
   and leave are easy to understand, and the peerChange means the unit
   leader change event or slice leader change event, it also can be
   constructed by peer leave and peer join.  This structure allows for
   unknown options types.

   enum { ordinaryPeer (0), unitLeaderPeer (1),
     sliceLeaderPeer (2), (255) } PeerType;


   The PeerType gives an enumeration of peer type which is used to
   cooperate with the ChangeType.  This structure allows for unknown
   options types.



Peng, et al.             Expires January 3, 2012               [Page 12]

Internet-Draft           One Hop Lookups Plugin                July 2011


   struct {
     Node-ID               peer_id
     IpAddressPort         peer_addr_port
   } PeerContactItem;


   The structure PeerContactItem is used to represent a basic item in
   the routing table or other data structures that need peers'
   information.  It is composed by a Node-ID and IpAddressPort.  The
   IpAddressPort structure is defined in the RELOAD base which includes
   one peer's IP address and listening port.  If one peer get the
   information of another peer's PeerContactItem, then it can contact
   this peer with the IP address.






































Peng, et al.             Expires January 3, 2012               [Page 13]

Internet-Draft           One Hop Lookups Plugin                July 2011


   struct {
     uint32                uptime;
     EventType             event_type;
     PeerContactItem       detect_peer;
     uint32                length;

     select (event_type) {
     case keepAlive:
       ;

     case ordinaryPeerChange:
       ChangeType          ordinary_change_type;
       select (ordinary_change_type) {
         case peerJoin:
           PeerContactItem       joining_peer;
         case peerLeave:
           PeerContactItem       leaving_peer;
       }

     case unitLeaderChange:
       ChangeType          unit_leader_change_type;
       select (unit_leader_change_type) {
         case peerJoin:
           PeerContactItem       new_unit_leader;
           PeerContactItem       old_unit_leader;
         case peerLeave:
           PeerContactItem       leaving_unit_leade;
       }

     case sliceLeaderChange:
       ChangeType          slice_leader_change_type;
       select (slice_leader_change_type) {
         case peerJoin:
           PeerContactItem       new_slice_leader;
           PeerContactItem       old_slice_leader;

         case peerLeave:
           PeerContactItem       leaving_slice_leader;
       }

     }
   } EventNotification;


   The structure EventNotification is very important because it is the
   base of event notification.  The length of the structure changes in
   different event type.  This structure has enough information to
   describe some important events.  The contact information of detect



Peng, et al.             Expires January 3, 2012               [Page 14]

Internet-Draft           One Hop Lookups Plugin                July 2011


   peer (detect_peer) is useless and can be ignored.  In the event type,
   the three different event types can still be divided by the
   ChangeType.  The notification must have enough information for all
   the peers in the overlay to update their data structures like routing
   table.

   struct {
     PeerType             peer_type;
     PeerContactItem      predecessor_peer;
     PeerContactItem      successor_peer;
     PeerContactItem      routing_table_list <0...n>;
     uint32               length;

     select (peer_type) {
     case ordinaryPeer:
       PeerContactItem      unit_leader;
       PeerContactItem      slice_leader;

     case unitLeaderPeer:
       PeerContactItem      boundary_peers <2>;

     case sliceLeaderPeer:
       PeerContactItem      slice_leaders_list <0...n>;
     }
   } DataStructureContent


   The structure DataStructureContent mainly used in the transferring of
   data structure during the peer joining procedure.  The length of the
   structure changes in different peer type because different peer has
   different data structure.  The ordinary peer only has predecessor,
   successor, unit leader, slice leader and routing table information.
   And the unit leader and slice leader should also know some other
   information.

















Peng, et al.             Expires January 3, 2012               [Page 15]

Internet-Draft           One Hop Lookups Plugin                July 2011


   typedef EventNotification<0...n>        EventNotificationList;
   typedef DataStructureContent<0...n>     DataStructureContentList;

   struct {
     UpdateType           update_type;
     uint32               length;

     select (update_type) {
     case eventUpdate:
       EventNotificationList       event_notification_list;

     case dataStructureUpdate:
       DataStructureContentList    data_structure_list;
     }
   } OneHopUpdateReq;


   This structure is the Update message used in the one hop lookups.
   The Update message is composed by two kinds of list.  Using list can
   take more information.  One of them is the EventNotificationList
   which includes a list of EventNotification structures, and the other
   is the DataStructureContentList which includes a list of
   DataStructureContent structures.  All the peers in the overlay can
   use this Update request to inform event notification or transfer
   routing information.

   struct {
     uint32          update_response;
   } OneHopUpdateAns;


   The structure OneHopUpdateAns is used to be a response to the
   OneHopUpdateReq message.  It is only has a response number to
   represent the receiver's attitude which may include success, fail and
   error.  This number also can be defined as a type.

   The PeerContactItem structure contains parameters as follows.

   peer_id
       The peer's Node-ID.

   peer_addr_port
       The contact information of peer's especially the IP address.

   The EventNotification structure contains parameters as follows.






Peng, et al.             Expires January 3, 2012               [Page 16]

Internet-Draft           One Hop Lookups Plugin                July 2011


   uptime
       The time this peer has been up in seconds.

   event_type
       The type of event the message will take, this structure allows
       for unknown event types.

   length
       The length of the remainder of this message.

   ordinary_change_type, unit_leader_change_type,
   slice_leader_change_type
       The type of peer changing event, this structure allows for
       unknown peer changing types

   The DataStructureContent structure contains parameters as follows.

   peer_type
       The type of peer who wants to get these information, this
       structure allows for unknown peer types.

   routing_table_list
       The peer's routing table which has all of the peers' information
       in the overlay.

   boundary_peers
       The information of peers' who are the boundary of unit.

   slice_leaders_list
       All of the slice leaders' information will be stored in this
       list, and it can be transferred from one slice leader to another
       or from the old slice leader to the new one.

   The OneHopUpdateAns message contains parameters as follows.

   update_response
       The response of the Update message, different number has
       different meaning includes success, error and so on.

7.2.  Primary event notification messages

   There are four main messages for the event notification among the
   one-hop lookups algorithm, the keep-alive messages, the event
   notification to slice leader messages, the messages exchanged between
   slice leaders and the messages from slice leaders to unit leaders.

   The keep-alive message is like the Ping defined in the Link
   Management of Reload, and the messages are used to form the base



Peng, et al.             Expires January 3, 2012               [Page 17]

Internet-Draft           One Hop Lookups Plugin                July 2011


   level communication between a peer and its predecessor and successor.
   The event notification message to slice leaders is used to send event
   notification to the peer's slice leader directly which can use the
   template of Update message.  The messages exchanged between slice
   leaders are the most important.  Each message sent from one slice
   leader to another batches together events that occurred during
   several seconds.  We can still use the Update message to play this
   role.  The messages from slice leaders to unit leaders can dispatch
   event notification.  Messages received by a slice leader are batched
   for one second and the forwarded to unit leaders.  As above, this
   message still can use Update message to express.

   Above all, these important event notification messages can be
   expressed by Update messages.  In some scenarios like the peer
   joining, the Update message should be extended to support the
   transfer of the routing table information.

7.3.  Actions after receiving update messages

   When peers in the overlay receive update messages which based on the
   event mechanism, the peer will check the type of event first.

   (1)  If the message received is a keep-alive message, then the peer
        can make sure the relationship with the sender and return an
        answer as soon as possible.

   (2)  If the message contains an ordinary peer join event, the actions
        taken should depend on the peer type of receiver, the slice
        leader will send node join notification to other slice leaders
        and its unit leader; the unit leader will send this notification
        to its peers; the peers will use the Update message to inform
        their processor or successor.

   (3)  If the message contains an ordinary peer leave event, the
        actions taken should depend on the peer type of receiver, the
        slice leader will send node leave notification to other slice
        leaders and its unit leader; the unit leader will send this
        notification to its peers; the peers will use the Update message
        to inform their processor or successor.

   (4)  If the message's event type is unit leader change, the actions
        taken should depend on the peer type of receiver, the slice
        leader will inform the sender to become the new unit leader, and
        then the new unit leader will spread this information to its
        unit members via the Update message.  And the unit leader change
        message may be constructed by unit leader join and unit leader
        leave messages, but the procedure is same.




Peng, et al.             Expires January 3, 2012               [Page 18]

Internet-Draft           One Hop Lookups Plugin                July 2011


   (5)  If the message's event type is slice leader change, the other
        slice leaders who receive this will record the event caused
        peer's information and return the acknowledge information, the
        new slice leader will inform its unit leaders to tell them the
        change of slice leader, and then the unit leaders will spread
        this information to its unit members via the Update message.













































Peng, et al.             Expires January 3, 2012               [Page 19]

Internet-Draft           One Hop Lookups Plugin                July 2011


8.  Stabilization

   The stabilization of one hop lookups pay much more attention on the
   refreshing of routing table because if one peer has a right routing
   table, then it can contact any peers in the overlay directly.  The
   accuracy of routing table depends on the speed of Update message
   spreading.  So the notification message frequency is a very important
   case which can have effects on the stabilization.  The event
   notification to slice leader, the messages exchanged between slice
   leaders and the messages from slice leaders to unit leaders have
   different sending frequency, and all of them are not fixed numbers.
   The frequency is calculated by the acceptable fraction of queries
   that fail in the first routing attempt, the peers in the overlay and
   the expected rate of membership changes.  The details of frequency
   and time calculating are described in [One-Hop-Lookups].  Sometimes
   we can use the broadcast to inform notification in order to save
   time.

   On the other hand, keeping the accuracy of neighborhood peers'
   information is also very important.  For example, in the one hop
   lookups, the keep-alive messages are sent every second, and the keep-
   alive messages construct the base level communications between a peer
   and its predecessor and successor.




























Peng, et al.             Expires January 3, 2012               [Page 20]

Internet-Draft           One Hop Lookups Plugin                July 2011


9.  Leaving

   The leaving action is also a kind of event, so it can be expressed by
   the Update message.  When a peer leaves the overlay peacefully, it
   needs to send an Update message as an event notification message, so
   the Leaving message may not be defined and just using the Update
   message.  And the data transfer can also use the Update messages, but
   it sometimes depends on the replication strategies (Section 11).

   If a peer leaves the overlay without any notification, the neighbor
   peer (predecessor or successor) must inform this event to the leader
   by the Update messages, and then the backup peer of the leaving one
   should transfer the backup data to the new one who is charging the
   leaving peer's namespace.  When an unit leader or slice leader leaves
   this overlay, the overlay should choose a new one to replace the old
   by the leader choosing strategies described in section 10.



































Peng, et al.             Expires January 3, 2012               [Page 21]

Internet-Draft           One Hop Lookups Plugin                July 2011


10.  Leader choosing strategies

   The default slice and unit leader strategy is dynamic choosing the
   successor of the mid-point peer of the slice or unit space.  It is
   convenient to manage and choose, but it is so simple that does not
   consider some other important cases like handling ability and so on.
   On the other hand, the slice leader can be treated as a "Super Node",
   and we can give some other super node choosing strategies as follows.

   On the other hand, a SandStone like schema can be used to choose
   leaders.  In some scenarios, we can identify well connected and well
   provisioned peers as "Super Node" which may have the same meaning
   with SPM.  The chosen super nodes can form a parallel ring.  And of
   course the super slice will act as a slice leader; it does not
   participate in the routing procedure because its focus of working is
   on the collecting of event notifications and spreading them in time.
   The unit leader can be chosen by the default schema because the
   working burden of unit leader is much more relaxed than the slice
   leader, and if the chosen one has a poorly provisioned node with a
   low bandwidth connection to the Internet, the predecessor or
   successor of the chosen one can replace it.






























Peng, et al.             Expires January 3, 2012               [Page 22]

Internet-Draft           One Hop Lookups Plugin                July 2011


11.  Replication

   The replica placement policy of the one hop algorithm can use the way
   defined in SandStone.  It is designed to maximize data reliability
   and availability, and to maximize network bandwidth utilization.  In
   the SandStone, the subscriber data can be replicated on multiple
   peers, three on default.  We can use a SandStone like replica
   replacement policy with three replications.  The first replica is
   stored in the primary peer's successor; the second is saved in a peer
   of different unit but in same slice; and the third is put in a peer
   from different slice.

   Above all, one of the most important issues is the choosing of the
   replica peer.  Maybe we can use a special function whose input is
   peer's Node-ID to calculate the three replica peers, and of course
   the root peer should know about the replication method in order to be
   able to search the data.


































Peng, et al.             Expires January 3, 2012               [Page 23]

Internet-Draft           One Hop Lookups Plugin                July 2011


12.  Fault tolerance

12.1.  First hop fail

   In the one hop algorithm, if a query fails on its first attempt it
   does not return an error to an application.  Instead, queries can be
   rerouted with the RELOAD message RouteQueryAns.  We can define the
   RouteQueryAns as follows.

   struct {
      uint32               response_state;
      PeerContactItem      redirect_peer;
   } RouteQueryAns;

   The structure RouteQueryAns is composed by a state number and a
   Node-ID.  The state number tells the query state and the
   PeerContactItem gives a peer's information that is closer to the
   destination.  And in most of cases, two hops are enough to locate one
   peer or resource.  If the two hops are still wrong, then we can know
   the lookup procedure is fail.

   response_state
       The response mark of the RouteQueryReq message's; and it may
       include success, failure and errors.

   redirect_peer
       If the query is failed, the application can send RouteQueryReq
       message to this peer again for finding resources.

   If a lookup query from peer A to peer B because the routing table is
   outdate or the peer B has left, peer A can retry the query by sending
   the RouteQueryReq to the peer C who is the successor of peer B. And
   at that time joining a peer D to be peer B's new successor and peer
   C's predecessor, then peer C can reply RouteQueryAns message with the
   redirect_peer of peer D's, and peer A can contact it in this routing
   step.

12.2.  Leaders fail

   The most important issues are how to keep the correct functioning of
   unit leaders and slice leaders; we need to recover from their
   failure.  When a slice or unit leader fails, its successor soon
   detects the failure and becomes the new leader; the successor of a
   failed slice leader will communicate with its unit leaders and other
   slice leaders to recover information about the missed events.  And of
   course, the leaser choosing strategies described in section 10 also
   can be used to choose a new leader.  All of these events can be
   constructed by the Update message with the EventType (e.g. the



Peng, et al.             Expires January 3, 2012               [Page 24]

Internet-Draft           One Hop Lookups Plugin                July 2011


   UnitLeaderChange and SliceLeaderChange).


















































Peng, et al.             Expires January 3, 2012               [Page 25]

Internet-Draft           One Hop Lookups Plugin                July 2011


13.  Security Considerations


















































Peng, et al.             Expires January 3, 2012               [Page 26]

Internet-Draft           One Hop Lookups Plugin                July 2011


14.  IANA Considerations

   There are no IANA considerations associated to this memo.
















































Peng, et al.             Expires January 3, 2012               [Page 27]

Internet-Draft           One Hop Lookups Plugin                July 2011


15.  Acknowledgements


















































Peng, et al.             Expires January 3, 2012               [Page 28]

Internet-Draft           One Hop Lookups Plugin                July 2011


16.  References

16.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [I-D.ietf-p2psip-base]
              Jennings, C., Lowekamp, B., Rescorla, E., Baset, S., and
              H. Schulzrinne, "REsource LOcation And Discovery (RELOAD)
              Base Protocol", draft-ietf-p2psip-base-15 (work in
              progress), May 2011.

16.2.  Informative References

   [RFC3174]  Eastlake, D. and P. Jones, "US Secure Hash Algorithm 1
              (SHA1)", RFC 3174, September 2001.

   [One-Hop-Lookups]
              Gupta, A., Liskov, B., and R. Rodrigues, "One Hop Lookups
              for Peer-to-Peer Overlays", June 2003.

   [SandStone]
              Shi, G., Chen, J., Gong, H., Fan, L., Xue, H., Lu, Q., and
              L. Liang, "SandStone: A DHT based Carrier Grade
              Distributed Storage System", September 2009.

























Peng, et al.             Expires January 3, 2012               [Page 29]

Internet-Draft           One Hop Lookups Plugin                July 2011


Authors' Addresses

   Jin Peng
   China Mobile
   Unit 2, 28 Xuanwumenxi Ave,
   Xuanwu District
   Beijing  100053
   P.R.China

   Email: Penjin@chinamobile.com


   Kai Feng
   Beijing University of Posts and Telecommunications
   10 Xi Tu Cheng Rd.
   Haidian District
   Beijing  100876
   P.R.China

   Email: fengkai_sunny@139.com


   Lifeng Le
   China Mobile
   Unit 2, 28 Xuanwumenxi Ave,
   Xuanwu District
   Beijing  100053
   P.R.China

   Email: Lelifeng@chinamobile.com





















Peng, et al.             Expires January 3, 2012               [Page 30]