Independent Stream                                      M. Schanzenbach
Internet-Draft                                         Fraunhofer AISEC
Intended status: Informational                              C. Grothoff
Expires: 1 July 2023                              Berner Fachhochschule
                                                                 B. Fix
                                                            GNUnet e.V.
                                                       28 December 2022
The R5N Distributed Hash Table
draft-schanzen-r5n-01
Abstract
This document contains the R^5N DHT technical specification. R^5N is
a secure distributed hash table (DHT) routing algorithm and data
structure for decentralized applications. It features an open peer-
to-peer overlay routing mechanism which supports ad-hoc
permissionless participation as well as topologies in
restricted-route environments.
This document defines the normative wire format of protocol messages,
routing algorithms, cryptographic routines and security
considerations for use by implementers.
This specification was developed outside the IETF and does not have
IETF consensus. It is published here to guide implementation of R^5N
and to ensure interoperability among implementations including the
pre-existing GNUnet implementation.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 1 July 2023.
Schanzenbach, et al. Expires 1 July 2023 [Page 1]
Internet-Draft The R5N Distributed Hash Table December 2022
Copyright Notice
Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction
  1.1. Requirements Notation
2. Terminology
3. Overview
4. Underlay
5. Routing
  5.1. Routing Table
  5.2. Peer Discovery
  5.3. Peer Bloom Filter
  5.4. Routing Functions
  5.5. Pending Table
6. Message Processing
  6.1. Message components
    6.1.1. Header
    6.1.2. Flags
    6.1.3. Path Element
  6.2. HelloMessage
    6.2.1. Wire Format
    6.2.2. Processing
  6.3. PutMessage
    6.3.1. Wire Format
    6.3.2. Processing
  6.4. GetMessage
    6.4.1. Wire Format
    6.4.2. Result Filter
    6.4.3. Processing
  6.5. ResultMessage
    6.5.1. Wire Format
    6.5.2. Processing
7. Blocks
  7.1. Block Operations
  7.2. HELLO Blocks
  7.3. Persistence
    7.3.1. Approximate Search Considerations
    7.3.2. Caching Strategy Considerations
8. Security Considerations
  8.1. Approximate Result Filtering
  8.2. Access control
9. IANA Considerations
  9.1. GNUnet URI Scheme Registration
  9.2. R5N URI Scheme Registration
10. GANA Considerations
  10.1. Block Type Registry
  10.2. GNUnet URI schema Subregistry
  10.3. GNUnet Signature Purpose Registry
  10.4. GNUnet Message Type Registry
11. Test Vectors
12. Normative References
13. Informative References
Appendix A. Bloom filters in R^5N
Appendix B. Overlay Operations
  B.1. GET operation
  B.2. PUT operation
Appendix C. HELLO URLs
Authors' Addresses
1. Introduction
This specification describes the protocol of R^5N. R^5N is a
Distributed Hash Table (DHT). The name is an acronym for "randomized
recursive routing for restricted-route networks" and its first
academic description can be found in [R5N].
DHTs are a key data structure for the construction of decentralized
applications. They generally provide a robust and efficient means to
distribute the storage and retrieval of key-value pairs.
The core idea behind R^5N is to combine an initial randomized routing
algorithm with an efficient, classical closest-peer algorithm. This
allows us to construct an algorithm that is able to escape and
circumvent restricted-route environments while at the same time
allowing for O(log n) routing complexity.
R^5N also includes advanced features like tracing the paths that
messages take through the network, response filters and on-path
application-specific data validation.
This document defines the normative wire format of peer-to-peer
messages, routing algorithms, cryptographic routines and security
considerations for use by implementors.
1.1. Requirements Notation
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
2. Terminology
Address is a UTF-8 [RFC3629] URI [RFC3986] which can be used as an
address to contact a peer. An example of an addressing scheme
used in this document is "r5n+ip+tcp", which refers to a standard
TCP/IP socket connection. The "hier"-part of the URI must provide
a suitable address for the given addressing scheme. The following
are non-normative examples of address strings:
r5n+ip+udp://1.2.3.4:6789/
gnunet+tcp://12.3.4.5/
Figure 1: Example Address URIs.
Applications Applications are components which directly use the DHT
overlay interfaces. Possible applications include the GNU Name
System [I-D.schanzen-gns] and the CADET transport system [cadet].
Application API The application API exposes the core operations of
the DHT overlay to applications. This includes storing blocks in
the DHT and retrieving blocks from the DHT.
Block Variable-size unit of payload stored in the DHT under a Key.
Commonly also called a "value" when talking about a DHT as a "key-
value store".
Block Storage The Block Storage component is used to persist and
manage Block data by peers. It includes logic for enforcing
storage quotas, caching strategies and data validation.
Block-Type A unique 32-bit value identifying the data format of a
Block. Block-Types are either private or registered in the GANA
block type registry (see Section 10.1).
Initiator The peer that initially creates and sends a message
(Section 6.2, Section 6.3, Section 6.4, Section 6.5).
HELLO block A HELLO block is a block with a dedicated block type and
is specified in this document. The HELLO block is used to store
and retrieve Peer addresses. In this document, HELLO blocks are
used by the peer discovery mechanism.
HELLO URL HELLO URLs are URL-formatted HELLO blocks. They can be used
for out-of-band exchanges of peer information and are used for
address update signalling messages to neighbors.
Key 512-bit identifier of a location in the DHT. Multiple Blocks
can be stored under the same key. Peer Addresses are valid keys.
Message Processing The Message Processing component processes
requests from and generates responses to applications and the
underlay network.
Neighbor A neighbor is a peer which is directly able to communicate
with our peer via the Underlay Interface.
Peer A host that is participating in the overlay. Peers are
responsible for holding some portion of the data that has been
stored in the overlay, and they are responsible for routing
messages on behalf of other hosts as needed by the Routing
Algorithm.
Peer Address The Peer Address is the identifier used on the Overlay
to address a peer. It is a SHA-512 hash of the Peer ID.
Peer ID The Peer ID is the public key which is used to authenticate
a peer in the underlay. The Peer ID is the public key of the
corresponding Ed25519 [ed25519] peer private key.
Routing The Routing component includes the routing table as well as
routing and peer selection logic. It facilitates the R^5N routing
algorithm with required data structures and algorithms.
Responsible Peer The peer N that is responsible for a specific key
K, as defined by the SelectClosestPeer(K, P) algorithm (see
Section 5).
Underlay Interface The Underlay Interface is an abstraction layer on
top of the supported links of a peer. Peers may be linked by a
variety of different transports, including "classical" protocols
such as TCP, UDP and TLS or advanced protocols such as GNUnet, I2P
or Tor.
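The relationship between Peer ID, Peer Address and Address URIs
described in the definitions above can be illustrated with a short,
non-normative Python sketch. The helper names peer_address and
address_scheme, as well as the all-zero placeholder key, are
assumptions made for illustration and are not part of the protocol.

```python
import hashlib
from urllib.parse import urlsplit


def peer_address(peer_id: bytes) -> bytes:
    """Derive the 512-bit Peer Address as the SHA-512 hash of the Peer ID."""
    if len(peer_id) != 32:
        raise ValueError("an Ed25519 public key is 32 bytes long")
    return hashlib.sha512(peer_id).digest()


def address_scheme(address: str) -> str:
    """Return the URI scheme, which names the addressing scheme."""
    return urlsplit(address).scheme


# Placeholder bytes for illustration only; a real Peer ID is the
# Ed25519 public key of the peer.
example_peer_id = bytes(32)
example_address = peer_address(example_peer_id)
```

Because the Peer Address has the same length as a Key (512 bits), it
can be used directly as a location in the DHT.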
3. Overview
In R^5N peers communicate with each other in order to realize and
maintain two basic operations of a distributed hash table:
* PUT: This operation stores a block payload on one or more peers
with the goal of making the block available for queries using the
GET operation. In the classical definition of a dictionary
interface, this operation would be called "insert".
* GET: This operation queries the network of peers for blocks
previously stored under or near the key. In the classical
definition of a dictionary interface, this operation would be
called "find".
A peer or its implementation does not necessarily need to expose the
above operations to applications but it commonly will. For example,
the peer could be a server purely used for bootstrapping, routing or
supporting the overlay network with resources. An example of
possible semantics of the above operations, provided as an API to
applications by an implementation, is outlined in Appendix B.
In a trivial scenario where there is only one peer (the local host),
R^5N operates in a very similar fashion to a dictionary data
structure. However, the default use case is one where nodes
communicate directly and indirectly in order to realize a distributed
storage mechanism. This communication requires a lower-level peer
addressing and message transport mechanism such as TCP/IP. R^5N is
agnostic to the underlying transport protocol which is why this
document defines a common addressing and messaging interface in
Section 4. The interface provided by this underlay is used across
the specification of the R^5N protocol. It also serves as a set of
requirements of possible transport mechanisms that can be used to
implement R^5N with. That being said, common transport protocols
such as TCP/IP or UDP/IP and their interfaces are suitable R^5N
underlays used by existing implementations.
Specifics about the protocols of the underlays providing connectivity
or the applications using the DHT are out of the scope of this
document. However, we note that peers implementing disjoint sets of
underlay protocols may experience difficulties communicating (unless
other peers bridge the respective underlays). Similarly, peers that
do not support a particular application will not be able to validate
application-specific payloads and may thus be tricked into storing or
forwarding corrupt blocks.
In order to establish an initial connection to a network of R^5N
peers, an initial, addressable bootstrap peer is required. Further
peers, including neighbors, are then learned via a peer discovery
process as defined in Section 5.2.
Across this document, the functional components of an R^5N
implementation are divided into routing (Section 5), message
processing (Section 6) and block processing (Section 7).
Applications that require application-specific block payloads are
expected to register a block type in the GANA block type registry
(Section 10.1) and provide a specification of the associated block
operations (Section 7.1) to implementors of R^5N. Figure 2
illustrates the architectural overview of R^5N.
| +-----------------+ +-------+
Applications | | GNU Name System | | CADET | ...
| +-----------------+ +-------+
-------------+------------------------------------ Application API
| ^
| | +---------------+
| | | Block Storage |
| | +---------------+
| | ^
R5N | v v
| +--------------------+ +---------+
| | Message Processing |<-->| Routing |
| +--------------------+ +---------+
| ^ ^
| v v
-------------+------------------------------------ Underlay Interface
| +--------+ +--------+
| |GNUnet | |IP | ...
Connectivity | |Underlay| |Underlay|
| |Link | |Link |
| +--------+ +--------+
Figure 2: The R5N architecture.
4. Underlay
In the network underlay, a peer is addressable by traditional means
out of scope of this document. For example, the peer may have a TCP/
IP address, or an HTTPS endpoint. While the specific addressing
options and mechanisms are out of scope for this document, it is
necessary to define a universal addressing format in order to
facilitate the distribution of connectivity information to other
peers in the DHT overlay. This format is the "HELLO" Block
(described in Section 7.2), which contains URIs. The scheme of each
URI indicates which underlay understands the respective address given
in the rest of the URI.
It is expected that the underlay provides basic mechanisms to manage
peer connectivity and addressing. The required functionalities can
be represented by the following API:
TRY_CONNECT(N, A) A function which allows the local peer to attempt
the establishment of a connection to another peer N using an
address A. When the connection attempt is successful, information
on the new peer is offered through the PEER_CONNECTED signal.
HOLD(P) A function which tells the underlay to hold on to a
connection to a peer P. Underlays are usually limited in the
number of active connections. With this function the DHT can
indicate to the underlay which connections should preferably be
preserved.
DROP(P) A function which tells the underlay to drop the connection
to a peer P. This function exists only for symmetry and is used
during the peer's shutdown to release all of the remaining HOLDs.
As R^5N always prefers the longest-lived connections, it would
never drop an active connection that it has called HOLD() on
before. Nevertheless, underlay implementations should not rely on
this always being true. A call to DROP() also does not imply that
the underlay must close the connection: it merely removes the
preference to preserve the connection that was established by
HOLD().
SEND(P, M) A function that allows the local peer to send a protocol
message M to a peer P.
L2NSE = ESTIMATE_NETWORK_SIZE() A procedure that provides an
estimate of the network size. The result, L2NSE, must be the
base-2 logarithm of the estimated number of peers in the network.
It is used by the routing algorithm. If the underlay does not
support a protocol for network size estimation (such as NSE), the
value is assumed to be provided as a configuration parameter to the
implementation.
The above procedures are meant to be actively executed by the
implementation as part of the peer-to-peer protocol. In addition,
the underlay is expected to emit the following signals (usually
implemented as callbacks) based on network events observed by the
underlay implementation:
PEER_CONNECTED -> P is a signal that allows the DHT to react to a
newly connected peer P. Such an event triggers, for example,
updates in the routing table and gossiping of HELLOs to that peer.
PEER_DISCONNECTED -> P is a signal that allows the DHT to react to a
recently disconnected peer. Such an event triggers, for example,
updates in the routing table.
ADDRESS_ADDED -> A This underlay signal indicates that an address A
was added for our local peer and that henceforth the peer may be
reachable under this address. This information is used to
advertise connectivity information about the local peer to other
peers. A must be a URI suitable for inclusion in a HELLO payload
(Section 7.2).
ADDRESS_DELETED -> A This underlay signal indicates that an address
A was removed from the set of addresses the local peer is possibly
reachable under. Addresses must have been added before they may
be deleted. This information is used to no longer advertise this
address to other peers.
RECEIVE -> (P, M) This signal informs the local peer that a protocol
message M was received from a peer P.
These signals then drive updates of the routing table, local storage
and message transmission.
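As a non-normative illustration, the underlay functions and the
PEER_CONNECTED signal described above could be modelled as follows in
Python. The class names Underlay and LoopbackUnderlay, the callback
parameter and the in-memory behaviour are assumptions of this sketch;
a real underlay would manage actual transport connections and emit
the remaining signals as well.

```python
from abc import ABC, abstractmethod
from typing import Callable, List, Set, Tuple


class Underlay(ABC):
    """Functions the DHT calls on the underlay (names follow Section 4)."""

    @abstractmethod
    def try_connect(self, peer: bytes, address: str) -> None:
        """TRY_CONNECT(N, A)"""

    @abstractmethod
    def hold(self, peer: bytes) -> None:
        """HOLD(P)"""

    @abstractmethod
    def drop(self, peer: bytes) -> None:
        """DROP(P)"""

    @abstractmethod
    def send(self, peer: bytes, message: bytes) -> None:
        """SEND(P, M)"""

    @abstractmethod
    def estimate_network_size(self) -> float:
        """ESTIMATE_NETWORK_SIZE(), returning L2NSE"""


class LoopbackUnderlay(Underlay):
    """Hypothetical in-memory underlay in which every connect succeeds."""

    def __init__(self, l2nse: float, on_peer_connected: Callable[[bytes], None]):
        self.l2nse = l2nse
        self.on_peer_connected = on_peer_connected
        self.held: Set[bytes] = set()
        self.sent: List[Tuple[bytes, bytes]] = []

    def try_connect(self, peer: bytes, address: str) -> None:
        # A successful attempt is reported through PEER_CONNECTED.
        self.on_peer_connected(peer)

    def hold(self, peer: bytes) -> None:
        self.held.add(peer)

    def drop(self, peer: bytes) -> None:
        # Only removes the preference; the connection may stay open.
        self.held.discard(peer)

    def send(self, peer: bytes, message: bytes) -> None:
        self.sent.append((peer, message))

    def estimate_network_size(self) -> float:
        return self.l2nse


neighbors: List[bytes] = []


def handle_peer_connected(peer: bytes) -> None:
    # React to PEER_CONNECTED: remember the neighbor and HOLD the link.
    neighbors.append(peer)
    underlay.hold(peer)


underlay = LoopbackUnderlay(l2nse=10.0, on_peer_connected=handle_peer_connected)
underlay.try_connect(b"peer-1", "r5n+ip+tcp://192.0.2.1:6789/")
```

Modelling signals as callbacks keeps the DHT logic independent of any
particular transport, which matches the abstraction described above.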
5. Routing
In order to select peers which are suitable destinations for routing
messages, R^5N uses a hybrid approach: Given an estimated network
size N, the peer selection for the first N hops is random. After the
initial N hops, peer selection follows an XOR-based peer distance
calculation.
To enable routing, any R^5N implementation must keep information
about its current set of neighbors. Upon receiving a connection
notification from the Underlay through PEER_CONNECTED, information on
the new neighbor MUST be added to the routing table. Peers added to
the routing table SHOULD be signalled to the Underlay as important
connections using HOLD. Similarly, when the Underlay indicates a
disconnect through PEER_DISCONNECTED for all addresses of the peer,
the peer MUST be removed from the routing table.
In order to achieve O(log n) routing performance, the data structure
for managing neighbors and their metadata MUST be implemented using
the k-buckets concept of [Kademlia] as defined in Section 5.1.
Maintenance of the routing table (after bootstrapping) is described
in Section 5.2.
Unlike [Kademlia], routing decisions in R^5N are also influenced by a
Bloom filter in the message that prevents routing loops. This data
structure is discussed in Section 5.3. Section 5.4 describes the key
functions provided on top of these data structures.
5.1. Routing Table
Whenever a PEER_CONNECTED signal is received from the Underlay, the
respective peer is considered for insertion into the routing table.
The routing table consists of an array of k-buckets. Each k-bucket
contains a list of neighbors. The i-th k-bucket stores neighbors
whose peer IDs are between distance 2^i and 2^(i+1) from the local
peer. System constraints will typically force an implementation to
impose some upper limit on the number of neighbors kept per k-bucket.
Upon insertion, the implementation MUST call HOLD on the respective
connection.
Implementations SHOULD try to keep at least 5 entries per k-bucket.
Embedded systems that cannot manage this number of connections MAY
use connection-level signalling to indicate that they are merely a
client utilizing a DHT and not able to participate in routing. DHT
peers receiving such connections MUST NOT include connections to such
restricted systems in their k-buckets, thereby effectively excluding
them when making routing decisions.
If a system hits constraints with respect to the number of active
connections, an implementation MUST evict peers from those k-buckets
with the largest number of neighbors. The eviction strategy MUST be
to drop the shortest-lived connections first.
The implementation MAY cache valid HELLOs of disconnected peers
outside of the routing table and sporadically or periodically try to
(re-)establish connection to the peer by issuing TRY_CONNECT requests
on the Underlay.
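The bucket assignment and eviction rules of this section can be
sketched as follows (non-normative). The sketch assumes that
distances are computed over equal-length byte strings such as Peer
Addresses, and it uses a hypothetical max_connections limit to stand
in for the system constraints mentioned above. The calls to HOLD and
DROP on the underlay are omitted.

```python
from typing import Dict, List, Optional, Tuple


def bucket_index(local: bytes, other: bytes) -> int:
    """Index i such that 2^i <= XOR-distance(local, other) < 2^(i+1)."""
    distance = int.from_bytes(local, "big") ^ int.from_bytes(other, "big")
    if distance == 0:
        raise ValueError("a peer is not its own neighbor")
    return distance.bit_length() - 1


class RoutingTable:
    def __init__(self, local: bytes, max_connections: int):
        self.local = local
        self.max_connections = max_connections  # hypothetical system limit
        # bucket index -> list of (connected_since, peer)
        self.buckets: Dict[int, List[Tuple[float, bytes]]] = {}

    def size(self) -> int:
        return sum(len(bucket) for bucket in self.buckets.values())

    def insert(self, peer: bytes, connected_since: float) -> Optional[bytes]:
        """Add a neighbor; return the evicted peer if the limit was hit."""
        index = bucket_index(self.local, peer)
        self.buckets.setdefault(index, []).append((connected_since, peer))
        if self.size() <= self.max_connections:
            return None
        # Evict from the bucket with the most neighbors, dropping the
        # shortest-lived connection (the latest connected_since) first.
        fullest = max(self.buckets.values(), key=len)
        victim = max(fullest)
        fullest.remove(victim)
        return victim[1]

    def remove(self, peer: bytes) -> None:
        """Forget a neighbor, for example after PEER_DISCONNECTED."""
        index = bucket_index(self.local, peer)
        self.buckets[index] = [
            entry for entry in self.buckets.get(index, []) if entry[1] != peer
        ]
```

With one-byte identifiers, a local peer 0x00 places peer 0x01 in
bucket 0, peers 0x02 and 0x03 in bucket 1, and peer 0x04 in bucket 2.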
5.2. Peer Discovery
Initially, the implementation depends upon either the Underlay
providing at least one initial connection to a peer (signalled
through PEER_CONNECTED), or the application/end-user providing at
least one working HELLO which is then in turn used to call
TRY_CONNECT on the Underlay in order to trigger a subsequent
PEER_CONNECTED signal from the Underlay. This is commonly achieved
through the configuration of hardcoded bootstrap peers or bootstrap
servers either for the Underlay or the R^5N implementation. While
details on how the first connection is established MAY depend on the
specific implementation, this SHOULD usually be done by an out-of-
band exchange of the information from a HELLO block.
Appendix C specifies a URL format for encoding HELLO blocks as text
strings. This provides a portable, human-readable, text-based
serialization format that can, for example, be encoded into a QR
code for dissemination. HELLO URLs SHOULD be supported by
implementations for both import and export of HELLOs.
To discover peers for its routing table, a peer will initiate
GetMessage requests (Section 6.4) asking for blocks of type HELLO
using its own peer address as QUERY_HASH. The PEER_BF is initialized
and set using the peer's own peer address as well as the addresses of
all currently connected peers. These requests MUST use the
FindApproximate and DemultiplexEverywhere flags. FindApproximate
will ensure that other peers will reply with keys they merely
consider close-enough, while DemultiplexEverywhere will cause each
peer on the path to respond, which is likely to yield HELLOs of
peers that are useful somewhere in the routing table. The
RECOMMENDED replication level set in the REPL_LVL field is 4. The
size and format of the result filter is specified in Section 7.2.
The XQUERY is empty.
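The parameters of such a discovery request can be summarized in a
small, non-normative Python sketch. The DiscoveryRequest type and its
field names are hypothetical and do not reflect the wire format of
Section 6.4; the block type and flags are kept symbolic, and the
result filter is omitted.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class DiscoveryRequest:
    query_hash: bytes                   # QUERY_HASH: the local peer address
    block_type: str                     # symbolic name, not the registry value
    flags: FrozenSet[str]               # symbolic flag names
    repl_lvl: int                       # REPL_LVL
    xquery: bytes                       # XQUERY
    peer_bf_elements: FrozenSet[bytes]  # elements to set in PEER_BF


def build_discovery_request(
    local_address: bytes, neighbors: Iterable[bytes]
) -> DiscoveryRequest:
    """Collect the request parameters described in the text above."""
    return DiscoveryRequest(
        query_hash=local_address,
        block_type="HELLO",
        flags=frozenset({"FindApproximate", "DemultiplexEverywhere"}),
        repl_lvl=4,   # the RECOMMENDED replication level
        xquery=b"",   # the XQUERY is empty
        peer_bf_elements=frozenset({local_address, *neighbors}),
    )
```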
In order to facilitate the above, the Underlay is expected to provide
the implementation with one or more addresses signalled through
ADDRESS_ADDED. Zero addresses MAY be provided if a peer can only
establish outgoing connections and is otherwise unreachable. An
implementation MUST advertise its addresses periodically to its
neighbors through HelloMessages. The advertisement interval and
expiration should be configurable or chosen at the discretion of the
implementation based on external factors such as DHCP leases. The
specific frequency of advertisements MAY depend on available
bandwidth, the set of already connected neighbors, the workload of
the system and other factors which are at the discretion of the
developer, but SHOULD be a fraction of the expiration period.
Whenever a peer receives such a HELLO message from another peer that
is already in the routing table, it must cache it as long as that
peer is in its routing table (or until the HELLO expires) and serve
it in response to GET requests for HELLO blocks (see Section 6.4.3).
This behaviour makes it unnecessary for the implementation to
initiate dedicated PutMessages containing HELLO blocks.
5.3. Peer Bloom Filter
As DHT GetMessages and PutMessages traverse a random path through the
network for the first N hops, it is essential that routing loops are
avoided. For this purpose, messages carry a peer Bloom filter, which
is constant in size at L=1024 buckets (128 bytes) and uses k=16
buckets per element. The peer Bloom filter is part of the routing
metadata in messages in order to prevent circular routes and is
updated at each hop with the hop's peer identity. For the next hop
selection in both the random and the
deterministic case, any peer which is in the Bloom filter for the
respective message is not included in the peer selection process.
Any peer which is forwarding GetMessages or PutMessages (Section 6)
adds its own peer ID to the peer Bloom filter. This allows other
peers to (probabilistically) exclude already traversed peers when
searching for the next hops in the routing table.
The peer Bloom filter follows the definition in Appendix A. The set
of elements E consists of all possible 256-bit peer IDs. The
mapping function M is defined as follows:
M(e) -> SHA-512 (e) as uint32[]
The element e is hashed using SHA-512. The resulting byte string is
interpreted as a string of k=16 32-bit integers in network byte order
which are used to set and check the bucket bits in B using BF-SET and
BF-TEST.
We note that the peer Bloom filter may exclude peers due to false-
positive matches. This is acceptable as routing should nevertheless
terminate (with high probability) in close vicinity of the key.
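The mapping function M and the set and test operations can be
sketched in Python as shown below (non-normative). Two details are
assumptions of this sketch because they are governed by Appendix A
rather than by this section: each 32-bit value is reduced modulo L to
obtain a bucket index, and bucket i is stored as bit (i mod 8) of
byte (i / 8).

```python
import hashlib
import struct
from typing import List

BF_BITS = 1024   # L: number of buckets (128 bytes)
BF_HASHES = 16   # k: buckets per element


def bucket_indices(element: bytes) -> List[int]:
    """M(e): SHA-512(e) read as k=16 uint32 values in network byte order."""
    words = struct.unpack("!16I", hashlib.sha512(element).digest())
    # Assumption: values are reduced modulo L to index a bucket.
    return [word % BF_BITS for word in words]


class PeerBloomFilter:
    def __init__(self) -> None:
        self.bits = bytearray(BF_BITS // 8)

    def add(self, element: bytes) -> None:
        """BF-SET: mark all buckets of the element."""
        for i in bucket_indices(element):
            self.bits[i // 8] |= 1 << (i % 8)

    def test(self, element: bytes) -> bool:
        """BF-TEST: True if all buckets of the element are marked."""
        return all(
            self.bits[i // 8] & (1 << (i % 8)) for i in bucket_indices(element)
        )
```

A positive test only indicates that the peer was probably traversed;
a negative test is definitive.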
5.4. Routing Functions
Using the data structures described so far, the R^5N routing
component provides the following functions for message processing
(Section 6):
GetDistance(A, B) -> Distance as Integer This function calculates
the binary XOR between A and B. The resulting distance is
interpreted as an integer where the leftmost bit is the most
significant bit.
SelectClosestPeer(K, B) -> N This function selects the neighbor N
from our routing table with the shortest XOR-distance to the key
K. This means that for all other peers N' in the routing table
GetDistance(N, K) < GetDistance(N', K). Peers with a positive test
against the peer Bloom filter B are not considered.
SelectRandomPeer(B) -> N This function selects a random peer N from
all neighbors. Peers with a positive test in the peer Bloom
filter B are not considered.
SelectPeer(K, H, B) -> N This function selects a neighbor N
depending on the hop count parameter H. If H <
NETWORK_SIZE_ESTIMATE this function MUST return
SelectRandomPeer(B) and SelectClosestPeer(K, B) otherwise.
IsClosestPeer(N, K, B) -> true | false This function checks if N is
the closest peer for K (cf. SelectClosestPeer(K, B)). Peers with a
positive test in the Bloom filter B are not considered.
ComputeOutDegree(REPL_LVL, HOPCOUNT, L2NSE) -> Number This function
computes the number of neighbors that a message should be
forwarded to. The arguments are the desired replication level
(REPL_LVL), the HOPCOUNT of the message so far and the current
network size estimate (L2NSE) as provided by the underlay. The
result is the non-negative number of next hops to select. The
following figure gives the pseudocode for computing the number of
neighbors the peer should attempt to forward the message to.
function ComputeOutDegree(REPL_LVL, HOPCOUNT, L2NSE)
BEGIN
  if (HOPCOUNT > L2NSE * 4)
    return 0
  if (HOPCOUNT > L2NSE * 2)
    return 1
  if (0 = REPL_LVL)
    REPL_LVL = 1
  if (REPL_LVL > 16)
    REPL_LVL = 16
  RM1 = REPL_LVL - 1
  return 1 + (RM1 / (L2NSE + RM1 * HOPCOUNT))
END
Figure 3: Computing the number of next hops.
The above calculation may yield values that are not integers.
Hence, the result MUST be rounded probabilistically to the nearest
integer, using the fraction as the probability for rounding
up. This probabilistic rounding is necessary to achieve the
statistically expected value of the replication level and average
number of peers a message is forwarded to.
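The routing functions of this section, including the probabilistic
rounding, can be sketched as follows in Python (non-normative).
Several choices are assumptions of this sketch: the peer Bloom filter
B is represented by a predicate in_filter, neighbors are passed
explicitly as equal-length byte strings, the random-walk threshold of
SelectPeer is passed as a parameter named threshold, and a
non-positive L2NSE is rejected to avoid a division by zero.

```python
import random
from typing import Callable, List, Optional, Sequence

InFilter = Callable[[bytes], bool]  # stands in for BF-TEST on filter B


def get_distance(a: bytes, b: bytes) -> int:
    """XOR distance, interpreted with the leftmost bit most significant."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def _candidates(neighbors: Sequence[bytes], in_filter: InFilter) -> List[bytes]:
    return [n for n in neighbors if not in_filter(n)]


def select_closest_peer(neighbors, key: bytes, in_filter: InFilter) -> Optional[bytes]:
    return min(
        _candidates(neighbors, in_filter),
        key=lambda n: get_distance(n, key),
        default=None,
    )


def select_random_peer(neighbors, in_filter: InFilter, rng=random) -> Optional[bytes]:
    candidates = _candidates(neighbors, in_filter)
    return rng.choice(candidates) if candidates else None


def select_peer(neighbors, key, hops, threshold, in_filter, rng=random):
    """Random selection below the threshold, closest peer otherwise."""
    if hops < threshold:
        return select_random_peer(neighbors, in_filter, rng)
    return select_closest_peer(neighbors, key, in_filter)


def is_closest_peer(peer, neighbors, key, in_filter) -> bool:
    """True if no eligible neighbor is strictly closer to the key."""
    closest = select_closest_peer(neighbors, key, in_filter)
    return closest is None or get_distance(peer, key) <= get_distance(closest, key)


def compute_out_degree(repl_lvl: int, hopcount: int, l2nse: float, rng=random) -> int:
    if l2nse <= 0:
        raise ValueError("this sketch assumes a positive L2NSE")
    if hopcount > l2nse * 4:
        return 0
    if hopcount > l2nse * 2:
        return 1
    repl_lvl = min(max(repl_lvl, 1), 16)
    rm1 = repl_lvl - 1
    exact = 1 + rm1 / (l2nse + rm1 * hopcount)
    whole = int(exact)
    # Round up with a probability equal to the fractional part.
    return whole + (1 if rng.random() < exact - whole else 0)
```

For example, with REPL_LVL 16, HOPCOUNT 0 and L2NSE 10 the exact
value is 1 + 15/10 = 2.5, so the function returns 2 or 3 with equal
probability.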
5.5. Pending Table
R^5N performs stateful routing where the messages only carry the
query hash and do not encode the ultimate source or destination of
the request. Routing a request towards the key is done hop-by-hop
using the routing table and the query hash. The pending table is
used to route responses back to the originator. In the pending table
each peer primarily associates a query hash with the associated
originator of the request. The pending table MUST store entries for
the last MAX_RECENT requests the peer has encountered. To ensure
that the peer does not run out of memory, information about older
requests is discarded. The value of MAX_RECENT MAY be configurable
and SHOULD be at least 128 * 10^3.
For each entry in the pending table, the DHT MUST track not only the
query key and the origin, but also the extended query, requested
block type and flags, and the result filter. If the query did not
provide a result filter, a fresh result filter MUST still be created
to filter duplicate replies. Details of how a result filter works
depend on the type, as described in Section 7.1.
When a second query from the same origin for the same query hash is
received, the DHT MUST attempt to merge the new request with the
state for the old request. If this is not possible, the existing
result filter MUST be discarded and replaced with the result filter
of the incoming message.
We note that for local applications, a fixed limit on the number of
concurrent requests may be problematic. Hence, it is RECOMMENDED
that implementations track requests from local applications
separately and preserve the information until the application
explicitly stops the request.
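A bounded pending table along these lines could be sketched as
follows (non-normative). The PendingEntry fields, the use of a set of
byte strings as a stand-in for a type-specific result filter, and the
merge rule (union of the filters when extended query and block type
match) are assumptions of this sketch; the actual merge semantics
depend on the block type.

```python
from collections import OrderedDict
from dataclasses import dataclass, field
from typing import List, Set, Tuple

MAX_RECENT = 128 * 10**3


@dataclass
class PendingEntry:
    xquery: bytes
    block_type: int
    flags: int
    # Stand-in for a type-specific result filter (see Section 7.1).
    result_filter: Set[bytes] = field(default_factory=set)


class PendingTable:
    def __init__(self, max_recent: int = MAX_RECENT):
        self.max_recent = max_recent
        # (query hash, origin) -> entry, oldest first
        self.entries: "OrderedDict[Tuple[bytes, bytes], PendingEntry]" = OrderedDict()

    def add(self, query_hash: bytes, origin: bytes, entry: PendingEntry) -> None:
        key = (query_hash, origin)
        old = self.entries.pop(key, None)
        if old is not None and (old.xquery, old.block_type) == (
            entry.xquery,
            entry.block_type,
        ):
            # Assumed merge rule: combine what both filters exclude.
            entry.result_filter |= old.result_filter
        # Otherwise the old filter is discarded and the incoming one kept.
        self.entries[key] = entry
        while len(self.entries) > self.max_recent:
            self.entries.popitem(last=False)  # discard the oldest request

    def origins_for(self, query_hash: bytes) -> List[bytes]:
        """Origins to which a result for this query hash is routed back."""
        # Linear scan for brevity; a real table would index by query hash.
        return [origin for (q, origin) in self.entries if q == query_hash]
```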
6. Message Processing
The implementation MAY act as an initiator of messages. If
instructed through an application-facing API such as the one outlined
in Appendix B, the peer may act as an initiator of GetMessages or
PutMessages. The status of initiator is relevant for peers when
processing ResultMessages and the potential handover of results to
the application.
The implementation MUST listen for RECEIVE(P, M) signals from the
Underlay and respond to the respective messages sent by the peer P.
Whether initiated locally or received from a neighbor, the
implementation processes the messages according to the wire formats
and the required validations detailed in the following. Where
required, the local peer's ID is referred to as SELF.
6.1. Message components
This section describes some data structures and fields shared by
various message types.
6.1.1. Header
A message header that identifies the message length and type is
shared across all messages used in the R^5N protocol.
0 8 16 24
+-----+-----+-----+-----+
| MSIZE | MTYPE |
+-----+-----+-----+-----+
Figure 4: The common message header.
where:
MSIZE denotes the 16-bit size of this message in network byte order.
MTYPE is the 16-bit message type. Message types are registered in
the GANA "GNUnet Message Type" registry (Section 10.4).
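As an illustration, the common header can be encoded and decoded with
Python's struct module. The function names are hypothetical, and the
sketch assumes that MSIZE counts the whole message in bytes including
the header itself.

```python
import struct

# MSIZE and MTYPE, both 16 bit; "!" selects network byte order.
HEADER = struct.Struct("!HH")


def pack_header(msize, mtype):
    return HEADER.pack(msize, mtype)


def unpack_header(data):
    if len(data) < HEADER.size:
        raise ValueError("message shorter than the common header")
    msize, mtype = HEADER.unpack_from(data)
    # Assumption: MSIZE covers the full message, header included.
    if msize < HEADER.size or msize > len(data):
        raise ValueError("MSIZE inconsistent with received data")
    return msize, mtype
```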
6.1.2. Flags
Flags is a 16-bit vector representing binary options. Each flag is
represented by a bit in the field starting from 0 as the rightmost
bit to 15 as the leftmost bit.
0: DemultiplexEverywhere This bit indicates that each peer along the
way should process the request. If the bit is not set,
intermediate peers only route the message and only peers which
consider themselves closest to the key look for answers in their
local storage for GetMessages and cache the block in their local
storage for PutMessages and ResultMessages.
1: RecordRoute This bit indicates to keep track of the path that the
message takes in the P2P network.
2: FindApproximate This bit allows results where the key does not
match exactly.
3: Truncated This is a special flag which is set if a peer truncated
the path and thus the first hop on the path is given without a
signature to enable checking of the next signature. MUST never be
set in a query.
4-15: Reserved The remaining bits are reserved for future use and
MUST be set to 0 when initiating an operation. If non-zero bits
are received, implementations MUST preserve these bits when
forwarding messages.
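The bit positions above translate into the following masks. The
constant and function names are chosen for this example only. The
sketch shows an initiator leaving the reserved bits at zero and a
forwarding peer changing one flag while preserving all other bits.

```python
FLAG_DEMULTIPLEX_EVERYWHERE = 1 << 0
FLAG_RECORD_ROUTE = 1 << 1
FLAG_FIND_APPROXIMATE = 1 << 2
FLAG_TRUNCATED = 1 << 3
RESERVED_MASK = 0xFFF0  # bits 4-15


def new_query_flags(demultiplex=False, record_route=False, approximate=False):
    """Flags for a newly initiated query; reserved bits and Truncated stay 0."""
    flags = 0
    if demultiplex:
        flags |= FLAG_DEMULTIPLEX_EVERYWHERE
    if record_route:
        flags |= FLAG_RECORD_ROUTE
    if approximate:
        flags |= FLAG_FIND_APPROXIMATE
    return flags


def set_truncated(flags):
    """Mark a path as truncated; every other bit, reserved or not, is kept."""
    return (flags | FLAG_TRUNCATED) & 0xFFFF
```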
6.1.3. Path Element
A Path Element represents a hop in the path a message has taken
through the network. The wire format of a Path Element is
illustrated in Figure 5.
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE |
| (64 byte) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER ID |
| (32 byte) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 5: The Wire Format of a Path Element.
where:
SIGNATURE is a 64 byte EdDSA signature using the current hop's
private key affirming the previous and next hops.
PEER ID is the EdDSA public key of the peer on the path.
An ordered list of Path Elements may be appended to any routed
PutMessages or ResultMessages. The signature of a Path Element is
created by the current hop after it made its routing decision
identifying the successor peer.
Figure 6 shows the wire format of an example path from Peers A over B
and C as it would be received by D in the PUTPATH of a PutMessage or
the combined PUTPATH and GETPATH of a ResultMessage. The wire format
of the Path Elements allows a natural extension of the PUTPATH along
the route of the ResultMessage to the destination forming the
GETPATH. The PutMessage would indicate in the PATH_LEN field a
length of 3. The ResultMessage would indicate a path length of 3 as
the sum of the field values in PUTPATH_L and GETPATH_L.
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE A |
| (64 byte) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER A |
| (32 byte) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE B |
| (64 byte) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER B |
| (32 byte) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE C |
| (64 byte) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER C |
| (32 byte) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE D |
| (64 byte) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 6: Example of a path as found in PutMessages or
ResultMessages from A to D.
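A receiver could split such a path region into its Path Elements and
the trailing signature of the sender roughly as follows. The function
name is hypothetical, and the sketch covers only the untruncated
layout of Figure 6.

```python
SIG_LEN = 64
PEER_ID_LEN = 32
PATH_ELEMENT_LEN = SIG_LEN + PEER_ID_LEN


def split_path(data, path_len):
    """Return ([(signature, peer_id), ...], last_hop_signature)."""
    need = path_len * PATH_ELEMENT_LEN + SIG_LEN
    if len(data) < need:
        raise ValueError("path region too short")
    elements = []
    for i in range(path_len):
        off = i * PATH_ELEMENT_LEN
        signature = data[off:off + SIG_LEN]
        peer_id = data[off + SIG_LEN:off + PATH_ELEMENT_LEN]
        elements.append((signature, peer_id))
    # The sender's own signature follows the last complete Path Element.
    last_hop_signature = data[path_len * PATH_ELEMENT_LEN:need]
    return elements, last_hop_signature
```

For the example in Figure 6, path_len is 3 and the region is
3 * 96 + 64 = 352 bytes long.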
A path may be truncated, in which case the signature of the truncated
Path Element is omitted, leaving only the Peer ID required for the
verification of the subsequent Path Element signature. Such a
truncated path is indicated with the respective flag (Section 6.1.2).
The Peer ID of the last Path Element is omitted as it must be that of
the sender of the PutMessage or ResultMessage. The wire format of a
truncated example path from Peer B over C, as it would be received by
D in a PutMessage or ResultMessage, is illustrated in Figure 7. A
PutMessage would indicate in the PATH_LEN field a length of 1. A
ResultMessage would indicate a length of 1 as the sum of the
PUTPATH_L and GETPATH_L fields.
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER B |
| (32 byte) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE C |
| (64 byte) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER C |
| (32 byte) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE D |
| (64 byte) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 7: Example of a truncated path from Peer B to Peer D.
The SIGNATURE field in a Path Element covers a 64-bit
contextualization header, the block expiration, a hash of the
block payload, as well as the predecessor peer ID and the peer ID of
the successor that the peer making the signature is routing the
message to. Thus, the signature made by SELF basically says that
SELF received the block payload from PEER PREDECESSOR and has
forwarded it to PEER SUCCESSOR. The wire format is illustrated in
Figure 8.
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIZE | PURPOSE |
+-----+-----+-----+-----+-----+-----+-----+-----+
| EXPIRATION |
+-----+-----+-----+-----+-----+-----+-----+-----+
| BLOCK HASH |
| (64 byte) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER PREDECESSOR |
| (32 byte) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER SUCCESSOR |
| (32 byte) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 8: The Wire Format of the Path Element for Signing.
SIZE A 32-bit value containing the length of the signed data in
bytes in network byte order. The length of the signed data MUST
be 144 bytes.
PURPOSE A 32-bit signature purpose flag. This field MUST be 6 (in
network byte order).
EXPIRATION denotes the absolute 64-bit expiration date of the block.
In microseconds since midnight (0 hour), January 1, 1970 UTC in
network byte order.
BLOCK HASH a SHA-512 hash over the block payload.
PEER PREDECESSOR the Peer ID of the previous hop. If the signing
peer initiated the PUT, this field is set to all zeroes.
PEER SUCCESSOR the Peer ID of the next hop (not of the signer).
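The data covered by a Path Element signature can be assembled as in
the following sketch. The function name and argument conventions are
assumptions for illustration. The EdDSA signing step itself is left
out because it requires a cryptographic library outside the Python
standard library.

```python
import hashlib
import struct

SIGNED_SIZE = 144   # 4 + 4 + 8 + 64 + 32 + 32 bytes
PURPOSE_PATH = 6    # signature purpose value given above


def path_signature_payload(expiration_us, block, predecessor, successor):
    """Build the 144 bytes that the signing peer signs."""
    if predecessor is None:
        # The signing peer initiated the PUT: all-zero predecessor.
        predecessor = bytes(32)
    if len(predecessor) != 32 or len(successor) != 32:
        raise ValueError("peer IDs must be 32 bytes")
    block_hash = hashlib.sha512(block).digest()
    payload = (
        struct.pack("!IIQ", SIGNED_SIZE, PURPOSE_PATH, expiration_us)
        + block_hash
        + predecessor
        + successor
    )
    assert len(payload) == SIGNED_SIZE
    return payload
```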
6.2. HelloMessage
When the Underlay notifies the implementation of added or removed
addresses through ADDRESS_ADDED and ADDRESS_DELETED, it MAY
disseminate those changes to neighbors using HelloMessages.
Initiation of HelloMessages by the implementation itself is
RECOMMENDED. HelloMessages are used to inform neighbors of a peer
about the sender's available addresses. The recipients use these
messages to inform their respective Underlays about ways to sustain
the connections and to generate HELLO blocks (see Section 7.2) to
answer peer discovery queries from other peers.
6.2.1. Wire Format
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| HEADER | RESERVED | URL_CTR |
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE /
/ (64 byte) |
+-----+-----+-----+-----+-----+-----+-----+-----+
| EXPIRATION |
+-----+-----+-----+-----+-----+-----+-----+-----+
/ ADDRESSES (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 9: The HelloMessage Wire Format.
where:
HEADER the common message header. Its MTYPE field must be set to
the value 157 in network byte order.
RESERVED is a 16-bit field that must be zero.
URL_CTR is a 16-bit number that gives the total number of addresses
encoded in the ADDRESSES field. In network byte order.
SIGNATURE is a 64 byte EdDSA signature using the sender's private
key affirming the information contained in the message. The
signature is signing exactly the same data that is being signed in
a HELLO block as described in Section 7.2.
EXPIRATION denotes the absolute 64-bit expiration date of the
content. The value specified is microseconds since midnight (0
hour), January 1, 1970, but must be a multiple of one million (so
that it can be represented in seconds in a HELLO URL). Stored in
network byte order.
ADDRESSES A sequence of exactly URL_CTR addresses (Section 2) which
can be used to contact the peer. Each address MUST be
0-terminated. The set of addresses MAY be empty.
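A parser for this layout might look like the sketch below. The
function name is hypothetical, addresses are returned as raw bytes
without interpreting them, and signature verification is not shown.

```python
import struct

# HEADER (MSIZE, MTYPE), RESERVED, URL_CTR, SIGNATURE, EXPIRATION
HELLO_FIXED = struct.Struct("!HHHH64sQ")
MTYPE_HELLO = 157


def parse_hello(data):
    msize, mtype, reserved, url_ctr, signature, expiration = \
        HELLO_FIXED.unpack_from(data)
    if mtype != MTYPE_HELLO or msize != len(data) or reserved != 0:
        raise ValueError("malformed HelloMessage")
    if expiration % 1_000_000 != 0:
        raise ValueError("EXPIRATION is not a whole number of seconds")
    body = data[HELLO_FIXED.size:]
    # Every address is 0-terminated, so a non-empty body ends in a zero byte.
    if body and not body.endswith(b"\x00"):
        raise ValueError("unterminated address")
    addresses = body[:-1].split(b"\x00") if body else []
    if len(addresses) != url_ctr:
        raise ValueError("URL_CTR does not match ADDRESSES")
    return signature, expiration, addresses
```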
6.2.2. Processing
If the initiator of a HelloMessage is SELF, the message is simply
sent to all neighbors P currently in the routing table using SEND.
Otherwise, upon receiving a HelloMessage from a peer P an
implementation MUST process it step by step as follows:
1. If P is not in its routing table, the message is discarded.
2. The signature is verified, including a check that the expiration
time is in the future. If the signature is invalid, the message
is discarded.
3. The information contained in the HelloMessage can be used to
synthesize a block of type HELLO (Section 7.2). The block is
cached in the routing table until it expires, the peer is removed
from the routing table, or the information is replaced by another
message from the peer. The implementation SHOULD instruct the
Underlay to connect to all now available addresses using
TRY_CONNECT in order to make the underlay aware of alternative
addresses for this connection and to maintain optimal
connectivity.
4. Received HelloMessages MUST NOT be forwarded.
6.3. PutMessage
PutMessages are used to store information at other peers in the DHT.
Any API which allows applications to initiate PutMessages needs to
provide sufficient, implementation-specific information needed to
construct the initial PutMessage. For example, implementations
supporting multiple applications and blocks will have block type and
message flag parameters in addition to the actual data payload and
key.
6.3.1. Wire Format
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| HEADER | BTYPE |
+-----+-----+-----+-----+-----+-----+-----+-----+
| FLAGS | HOPCOUNT | REPL_LVL | PATH_LEN |
+-----+-----+-----+-----+-----+-----+-----+-----+
| EXPIRATION |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER_BF /
/ (128 byte) |
+-----+-----+-----+-----+-----+-----+-----+-----+
| BLOCK_KEY /
/ (64 byte) |
+-----+-----+-----+-----+-----+-----+-----+-----+
/ TRUNCATED ORIGIN (0 or 32 bytes) /
+-----+-----+-----+-----+-----+-----+-----+-----+
/ PUTPATH (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
/ LAST HOP SIGNATURE (0 or 64 bytes) /
+-----+-----+-----+-----+-----+-----+-----+-----+
/ BLOCK (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 10: The PutMessage Wire Format.
where:
HEADER is the common message header. Its MTYPE field is set by the
initiator to the value 146 in network byte order. Read-only.
BTYPE is a 32-bit block type. The block type indicates the content
type of the payload. Set by the initiator. Read-only. In
network byte order.
FLAGS is a 16-bit vector with binary options (see Section 6.1.2).
Set by the initiator. Read-only.
HOPCOUNT is a 16-bit number indicating how many hops this message
has traversed so far. Set by the initiator to 0. Incremented by
processing peers. In network byte order.
REPL_LVL is a 16-bit number indicating the desired replication level
of the data. Set by the initiator. Read-only. In network byte
order.
PATH_LEN is a 16-bit number indicating the number of Path Elements
recorded in PUTPATH. As PUTPATH is optional, this value may be
zero. If the PUTPATH is enabled, set initially to 0 by the
initiator. Incremented by processing peers. In network byte
order.
EXPIRATION denotes the absolute 64-bit expiration date of the
content. Set by the initiator. Read-only. In microseconds since
midnight (0 hour), January 1, 1970 in network byte order.
PEER_BF A peer Bloom filter to stop circular routes (see
Section 5.3). Set by the initiator to contain the local peer and
all neighbors it is forwarded to. Modified by processing peers to
include their own peer ID using BF-SET.
BLOCK_KEY The key under which the PutMessage wants to store content.
Set by the initiator. Read-only.
TRUNCATED ORIGIN is only provided if the Truncated flag is set in
FLAGS. If present, this is the public key of the peer just before
the first entry on the PUTPATH and the first peer on the PUTPATH
is not the actual origin of the message. Thus, to verify the
first signature on the PUTPATH, this public key must be used.
Note that due to the truncation, this last hop cannot be verified
to exist. Value is modified by processing peers.
PUTPATH the variable-length PUT path. The path consists of a list
of PATH_LEN Path Elements. Empty when set by the initiator.
Extended by processing peers.
LAST HOP SIGNATURE is only provided if the RecordRoute flag is set
in FLAGS. If present, this is an EdDSA signature of the sender of
this message (using the same format as the signatures in PUTPATH)
affirming that the sender forwarded the message from the
predecessor (all zeros if PATH_LEN is 0, otherwise the last peer
in PUTPATH) to the target peer. Modified by processing peers (if
flag is set).
BLOCK the variable-length block payload. The contents are
determined by the BTYPE field. The length is determined by MSIZE
minus the size of all of the other fields. Set by the initiator.
Read-only.
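The field list above suggests the following decoding order. This is a
sketch under the stated layout; the function and dictionary key names
are illustrative and no block or signature validation is performed.

```python
import struct

# HEADER, BTYPE, FLAGS, HOPCOUNT, REPL_LVL, PATH_LEN, EXPIRATION,
# PEER_BF, BLOCK_KEY: 216 bytes in total.
PUT_FIXED = struct.Struct("!HHIHHHHQ128s64s")
FLAG_RECORD_ROUTE = 1 << 1
FLAG_TRUNCATED = 1 << 3


def parse_put(data):
    (msize, mtype, btype, flags, hopcount, repl_lvl, path_len,
     expiration, peer_bf, block_key) = PUT_FIXED.unpack_from(data)
    if mtype != 146 or msize != len(data):
        raise ValueError("malformed PutMessage")
    off = PUT_FIXED.size

    def take(n):
        nonlocal off
        if off + n > len(data):
            raise ValueError("optional fields exceed MSIZE")
        chunk = data[off:off + n]
        off += n
        return chunk

    origin = take(32) if flags & FLAG_TRUNCATED else None
    path = [(take(64), take(32)) for _ in range(path_len)]
    last_sig = take(64) if flags & FLAG_RECORD_ROUTE else None
    return {
        "btype": btype, "flags": flags, "hopcount": hopcount,
        "repl_lvl": repl_lvl, "expiration": expiration,
        "peer_bf": peer_bf, "block_key": block_key,
        "truncated_origin": origin, "path": path,
        "last_hop_signature": last_sig,
        # BLOCK is whatever remains after all other fields.
        "block": data[off:],
    }
```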
6.3.2. Processing
Upon receiving a PutMessage from a peer P, or created through
initiation by an overlay API, an implementation MUST process it step
by step as follows:
1. The EXPIRATION field is evaluated. If the message is expired, it
MUST be discarded.
2. If the BTYPE is not supported by the implementation, no
validation of the block payload is performed and processing
continues at (5). If the BTYPE is ANY, then the message MUST be
discarded. Else, the block MUST be validated as defined in (3)
and (4).
3. The message is evaluated using the block validation functions
matching the BTYPE. First, the client attempts to derive the key
using the respective DeriveBlockKey procedure as described in
Section 7.1. If a key can be derived and does not match, the
message MUST be discarded.
4. Next, the ValidateBlockStoreRequest procedure for the BTYPE as
described in Section 7.1 is used to validate the block payload.
If the block payload is invalid, the message MUST be discarded.
5. The peer address of the sender peer P SHOULD be in PEER_BF. If
not, the implementation MAY log an error, but MUST continue.
6. If the RecordRoute flag is not set, the PATH_LEN MUST be set to
zero. If the flag is set and PATH_LEN is non-zero, the local
peer SHOULD verify the signatures from the PUTPATH. Verification
MAY involve checking all signatures or any random subset of the
signatures. It is RECOMMENDED that peers adapt their behavior to
available computational resources so as to not make signature
verification a bottleneck. If an invalid signature is found, the
PUTPATH MUST be truncated to only include the elements following
the invalid signature.
7. If the local peer is the closest peer (cf. IsClosestPeer(SELF,
BLOCK_KEY, PeerFilter)) or the DemultiplexEverywhere flag is
set, the message SHOULD be stored locally in the block storage if
possible. The implementation MAY choose not to store the block if
external factors or configurations prevent this, such as limited
(allotted) disk space.
8. If the BTYPE of the message indicates a HELLO block, the peer
MUST be considered for the local routing table by using the peer
address in BLOCK_KEY. If the peer is not already connected and
the respective k-bucket is not already full, the local peer
MUST try to establish a connection to the peer indicated in the
HELLO block using the address information from the HELLO block
and the Underlay function TRY_CONNECT. The implementation MUST
instruct the Underlay to try to connect to all provided addresses
using TRY_CONNECT in order to make the underlay aware of multiple
addresses for this connection. When a connection is established,
the signal PEER_CONNECTED will cause the peer to be added to the
respective k-bucket of the routing table (Section 5).
9. Given the value in REPL_LVL, HOPCOUNT and the result of
IsClosestPeer(SELF, BLOCK_KEY, PeerFilter) the number of peers to
forward to MUST be calculated using ComputeOutDegree(). The
implementation SHOULD select up to this number of peers to
forward the message to. The implementation MAY forward to fewer
or no peers in order to handle resource constraints such as
limited bandwidth. For each selected peer with peer address P a
dedicated PutMessageP is created containing the original (and
where applicable already updated) fields of the received
PutMessage. In each message, all selected peer addresses and the
local peer MUST be added to the PEER_BF and the HOPCOUNT is
incremented by 1. If the RecordRoute flag is set, a new Path
Element is created using the predecessor peer ID and the
signature of the current peer. The Path Element is added to the
PUTPATH fields and the PATH_LEN field is incremented by 1. When
creating the Path Element signature, the successor must be set to
the recipient peer P of the PutMessageP. Finally, the messages
are sent using SEND(P, PutMessageP) to each recipient.
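The path truncation required in step 6 can be sketched as follows.
The function name and the is_valid callback are hypothetical: the
callback stands in for the EdDSA verification of the signature at a
given index. The sketch assumes that the Peer ID of the element with
the invalid signature is retained as the new TRUNCATED ORIGIN, since
it is needed to verify the following element, and it does not cover an
invalid LAST HOP SIGNATURE.

```python
def truncate_after_invalid(path, is_valid):
    """Truncate a path after the last element with an invalid signature.

    path is a list of (signature, peer_id) tuples and is_valid(index)
    reports whether the signature at that index verified. Returns
    (truncated_origin, remaining_path); truncated_origin is None when
    nothing had to be removed.
    """
    bad = [i for i in range(len(path)) if not is_valid(i)]
    if not bad:
        return None, list(path)
    last_bad = bad[-1]
    # Keep only the Peer ID of the element whose signature is dropped.
    _, origin_peer = path[last_bad]
    return origin_peer, list(path[last_bad + 1:])
```

A caller that receives a non-None origin would also set the Truncated
flag and reduce PATH_LEN to the length of the remaining path.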
6.4. GetMessage
GetMessages are used to request information from other peers in the
DHT. Any overlay API which allows applications to initiate
GetMessages needs to provide sufficient, implementation-specific
information needed to construct the initial GetMessage. For example,
implementations supporting multiple applications and blocks will have
block type and message flag parameters.
6.4.1. Wire Format
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| HEADER | BTYPE |
+-----+-----+-----+-----+-----+-----+-----+-----+
| FLAGS | HOPCOUNT | REPL_LVL | RF_SIZE |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER_BF /
/ (128 byte) |
+-----+-----+-----+-----+-----+-----+-----+-----+
| QUERY_HASH /
/ (64 byte) |
+-----+-----+-----+-----+-----+-----+-----+-----+
| RESULT_FILTER /
/ (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
/ XQUERY (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 11: The GetMessage Wire Format.
where:
HEADER is the common message header. Its MTYPE field is set by the
initiator to the value 147 in network byte order. Read-only.
BTYPE is a 32-bit block type field. The block type indicates the
content type of the payload. Set by the initiator. Read-only.
In network byte order.
FLAGS is a 16-bit vector with binary options (see Section 6.1.2).
Set by the initiator. Read-only.
HOPCOUNT is a 16-bit number indicating how many hops this message
has traversed so far. Set by the initiator to 0. Incremented by
processing peers. In network byte order.
REPL_LVL is a 16-bit number indicating the desired replication level
of the data. Set by the initiator. Read-only. In network byte
order.
RF_SIZE is a 16-bit number indicating the length of the result
filter RESULT_FILTER. Set by the initiator. Read-only. In
network byte order.
PEER_BF A peer Bloom filter to stop circular routes (see
Section 5.3). Set by the initiator to include itself and all
connected neighbors in the routing table. Modified by processing
peers to include their own peer address.
QUERY_HASH The query used to indicate what the key is under which
the initiator is looking for blocks with this request. The block
type may use a different evaluation logic to determine applicable
result blocks. Set by the initiator. Read-only.
RESULT_FILTER the variable-length result filter, described in
Section 6.4.2. Set by the initiator. Modified by processing
peers.
XQUERY the variable-length extended query. Optional. Set by the
initiator. Read-only.
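Decoding a GetMessage follows directly from the field list: a fixed
part of 208 bytes, then RF_SIZE bytes of result filter, and the
remainder as the extended query. The names below are illustrative and
the sketch performs no query validation.

```python
import struct

# HEADER, BTYPE, FLAGS, HOPCOUNT, REPL_LVL, RF_SIZE, PEER_BF, QUERY_HASH
GET_FIXED = struct.Struct("!HHIHHHH128s64s")


def parse_get(data):
    (msize, mtype, btype, flags, hopcount, repl_lvl, rf_size,
     peer_bf, query_hash) = GET_FIXED.unpack_from(data)
    if mtype != 147 or msize != len(data):
        raise ValueError("malformed GetMessage")
    off = GET_FIXED.size
    if off + rf_size > len(data):
        raise ValueError("RF_SIZE exceeds MSIZE")
    return {
        "btype": btype, "flags": flags, "hopcount": hopcount,
        "repl_lvl": repl_lvl, "peer_bf": peer_bf,
        "query_hash": query_hash,
        "result_filter": data[off:off + rf_size],
        # XQUERY is optional and takes up the rest of the message.
        "xquery": data[off + rf_size:],
    }
```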
6.4.2. Result Filter
The result filter is used to indicate to other peers which results
are not of interest when processing a GetMessage (Section 6.4). Any
peer which is processing GetMessages and has a result which matches
the query key MUST check the result filter and only send a reply
message if the result does not test positive under the result filter.
Before forwarding the GetMessage, the result filter MUST be updated
using the result of the BTYPE-specific FilterResult (see Section 7.1)
function to filter out all results already returned by the local
peer.
How a result filter is implemented depends on the block type as
described in Section 7.1. Result filters may be probabilistic data
structures. Thus, it is possible that a desirable result is
filtered by a result filter because of a false-positive test.
How exactly a block result is added to a result filter is specified
as part of the definition of a block type (cf. Section 7.2).
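To illustrate the test-then-update behaviour, the sketch below uses an
exact filter built from a set of SHA-512 block hashes. This is an
assumption made for the example only: actual result filters are
defined per block type and may be probabilistic, and the class and
function names are hypothetical.

```python
import hashlib


class ExactResultFilter:
    """Remembers blocks that were already returned."""

    def __init__(self):
        self.seen = set()

    def test(self, block):
        """True if the block is filtered (already known)."""
        return hashlib.sha512(block).digest() in self.seen

    def add(self, block):
        self.seen.add(hashlib.sha512(block).digest())


def reply_candidates(blocks, result_filter):
    """Return the blocks to reply with and record them in the filter."""
    replies = []
    for block in blocks:
        if not result_filter.test(block):
            replies.append(block)
            # Updating the filter before forwarding prevents other
            # peers from returning the same result again.
            result_filter.add(block)
    return replies
```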
6.4.3. Processing
Upon receiving a GetMessage from a peer P, or created through
initiation by the overlay API, an implementation MUST process it step
by step as follows:
1. If the BTYPE is supported, the QUERY_HASH and XQUERY fields are
validated as defined by the respective ValidateBlockQuery
procedure for this type. If the result yields REQUEST_INVALID,
the message MUST be discarded and processing ends. If the BTYPE
is not supported, the message MUST be forwarded (Skip to step 4).
If the BTYPE is ANY, the message is processed further without
validation.
2. The peer address of the sender peer P SHOULD be in the PEER_BF
Bloom filter. If not, the implementation MAY log an error, but
MUST continue.
3. The local peer SHOULD try to produce a reply in any of the
following cases: (1) if the local peer is the closest peer (cf.
IsClosestPeer(SELF, QUERY_HASH, PeerFilter)), or (2) if the
DemultiplexEverywhere flag is set, or (3) if the local peer is
not the closest and a previously cached ResultMessage also
matches this request (Section 6.5.2).
The reply is produced (if one is available) using the following
steps:
a) If the BTYPE is HELLO, the implementation MUST only consider
synthesizing its own addresses and the addresses it has
cached for the peers in its routing table as HELLO block
replies. Otherwise, if the BTYPE does not indicate a request
for a HELLO block or ANY, the implementation MUST only
consider blocks in the local block storage and previously
cached ResultMessages.
b) If the FLAGS field includes the flag FindApproximate, the
peer SHOULD respond with the closest block (smallest value of
GetDistance(QUERY_HASH, BLOCK_KEY)) it can find that is not
filtered by the RESULT_FILTER. Otherwise, the peer MUST respond
with the block with a BLOCK_KEY that matches the QUERY_HASH
exactly and that is not filtered by the RESULT_FILTER.
c) Any resulting (synthesized) block is encapsulated in a
ResultMessage. The ResultMessage SHOULD be transmitted to
the neighbor from which the request was received.
Implementations MAY choose not to reply if they are resource-constrained.
However, ResultMessages MUST be given the highest priority among
competing transmissions.
If the BTYPE is supported and ValidateBlockReply for the given
query has yielded a status of FILTER_LAST, processing MUST end
and not continue with forwarding of the request to other peers.
4. The implementation SHOULD create (or merge) an entry in the
pending table (Section 5.5) for the query represented by this
GetMessage. If the peer is unable to handle an additional entry
in the table, the message MUST be discarded and processing ends.
5. Using the value in REPL_LVL, the number of peers to forward to
MUST be calculated using ComputeOutDegree(). If there is at
least one peer to forward to, the implementation SHOULD select up
to this number of peers to forward the message to. The
implementation MAY forward to fewer or no peers in order to
handle resource constraints such as bandwidth. The peer Bloom
filter PEER_BF MUST be updated with the local peer address SELF
for any forwarded message. For all peers with peer address P
chosen to forward the message to, SEND(P, GetMessageP) is called.
Here, GetMessageP is the original message with the updated fields
for HOPCOUNT (incremented by 1), PEER_BF and RESULT_FILTER.
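The block selection in step 3 b) can be sketched as follows. Several
things here are assumptions for illustration: GetDistance is modelled
as the XOR of the two keys interpreted as big-endian integers, the
local storage is a dictionary holding one block per key, and the
is_filtered callback stands in for the result filter test.

```python
def xor_distance(a, b):
    # Assumed stand-in for GetDistance: XOR metric over the key bytes.
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def select_block(storage, query_hash, approximate, is_filtered):
    """Pick the block to reply with, or None if there is none.

    storage maps BLOCK_KEY to block payload.
    """
    candidates = {k: v for k, v in storage.items() if not is_filtered(v)}
    if not candidates:
        return None
    if approximate:
        # FindApproximate: the closest unfiltered block wins.
        key = min(candidates, key=lambda k: xor_distance(query_hash, k))
        return key, candidates[key]
    if query_hash in candidates:
        return query_hash, candidates[query_hash]
    return None
```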
6.5. ResultMessage
ResultMessages are used to return information to other peers in the
DHT or to applications using the overlay API that previously
initiated a GetMessage. The initiator of a ResultMessage is a peer
triggered through the processing of a GetMessage.
6.5.1. Wire Format
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| HEADER | BTYPE |
+-----+-----+-----+-----+-----+-----+-----+-----+
| RESERVED | FLAGS | PUTPATH_L | GETPATH_L |
+-----+-----+-----+-----+-----+-----+-----+-----+
| EXPIRATION |
+-----+-----+-----+-----+-----+-----+-----+-----+
| QUERY_HASH /
/ (64 byte) |
+-----+-----+-----+-----+-----+-----+-----+-----+
/ TRUNCATED ORIGIN (0 or 32 bytes) /
+-----+-----+-----+-----+-----+-----+-----+-----+
/ PUTPATH /
/ (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
/ GETPATH /
/ (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
/ LAST HOP SIGNATURE (0 or 64 bytes) /
+-----+-----+-----+-----+-----+-----+-----+-----+
/ BLOCK /
/ (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 12: The ResultMessage Wire Format
where:
HEADER is the common message header. Its MTYPE field must be set to
the value 148 in network byte order. Set by the initiator. Read-
only.
BTYPE is a 32-bit block type field. The block type indicates the
content type of the payload. Set by the initiator. Read-only.
In network byte order.
RESERVED is a 16-bit value. Implementations MUST set this value to
zero when originating a result message. Implementations MUST
forward this value unchanged even if it is non-zero.
FLAGS is a 16-bit vector with binary options (see Section 6.1.2).
Set by the initiator.
PUTPATH_L is a 16-bit number indicating the number of Path Elements
recorded in PUTPATH. As PUTPATH is optional, this value may be
zero even if the message has traversed several peers. Set by the
initiator to the PATH_LEN of the PutMessage from which the block
originated. Modified by processing peers in case of path
truncation. In network byte order.
GETPATH_L is a 16-bit number indicating the number of Path Elements
recorded in GETPATH. As GETPATH is optional, this value may be
zero even if the message has traversed several peers. Set by the
initiator to 0. Modified by processing peers. In network byte
order.
EXPIRATION denotes the absolute 64-bit expiration date of the
content. In microseconds since midnight (0 hour), January 1, 1970
in network byte order. Set by the initiator to the expiration
value as recorded from the PutMessage from which the block
originated. Read-only.
QUERY_HASH the query hash corresponding to the GetMessage which
caused this reply message to be sent. Set by the initiator using
the value of the GetMessage. Read-only.
TRUNCATED ORIGIN is only provided if the Truncated flag is set in
FLAGS. If present, this is the public key of the peer just before
the first entry on the PUTPATH and the first peer on the PUTPATH
is not the actual origin of the message. Thus, to verify the
first signature on the PUTPATH, this public key must be used.
Note that due to the truncation, this last hop cannot be verified
to exist. Set by processing peers.
PUTPATH the variable-length PUT path. The path consists of a list
of PUTPATH_L Path Elements. Set by the initiator to the
PUTPATH of the PutMessage from which the block originated.
Modified by processing peers in case of path truncation.
GETPATH the variable-length GET path. The path consists of a list
of GETPATH_L Path Elements. Set by processing peers.
LAST HOP SIGNATURE is only provided if the RecordRoute flag is set
in FLAGS. If present, this is an EdDSA signature of the sender of
this message (using the same format as the signatures in PUTPATH)
affirming that the sender forwarded the message from the
predecessor (all zeros if PATH_LEN is 0, otherwise the last peer
in PUTPATH) to the target peer.
BLOCK the variable-length block payload. The contents are
determined by the BTYPE field. Set by the initiator. Read-only.
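A decoder for the ResultMessage could walk the fields in the order
listed above. This is a sketch with illustrative names. It only
splits the message into its parts and does not verify signatures or
validate the block.

```python
import struct

# HEADER, BTYPE, RESERVED, FLAGS, PUTPATH_L, GETPATH_L, EXPIRATION,
# QUERY_HASH: 88 bytes in total.
RESULT_FIXED = struct.Struct("!HHIHHHHQ64s")
FLAG_RECORD_ROUTE = 1 << 1
FLAG_TRUNCATED = 1 << 3


def parse_result(data):
    (msize, mtype, btype, reserved, flags, putpath_l, getpath_l,
     expiration, query_hash) = RESULT_FIXED.unpack_from(data)
    if mtype != 148 or msize != len(data):
        raise ValueError("malformed ResultMessage")
    off = RESULT_FIXED.size

    def take(n):
        nonlocal off
        if off + n > len(data):
            raise ValueError("optional fields exceed MSIZE")
        chunk = data[off:off + n]
        off += n
        return chunk

    origin = take(32) if flags & FLAG_TRUNCATED else None
    putpath = [(take(64), take(32)) for _ in range(putpath_l)]
    getpath = [(take(64), take(32)) for _ in range(getpath_l)]
    last_sig = take(64) if flags & FLAG_RECORD_ROUTE else None
    return {
        "btype": btype, "reserved": reserved, "flags": flags,
        "expiration": expiration, "query_hash": query_hash,
        "truncated_origin": origin, "putpath": putpath,
        "getpath": getpath, "last_hop_signature": last_sig,
        "block": data[off:],
    }
```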
6.5.2. Processing
Upon receiving a ResultMessage from a connected peer or triggered by
the processing of a GetMessage, an implementation MUST process it
step by step as follows:
1. First, the EXPIRATION field is evaluated. If the message is
expired, it MUST be discarded.
2. If the BTYPE is supported, then the BLOCK MUST be validated
against the requested BTYPE. To do this, the peer checks that
the block is valid using ValidateBlockStoreRequest. If the
result is BLOCK_INVALID, the message MUST be discarded.
3. If the PUTPATH_L or the GETPATH_L are non-zero, the local peer
SHOULD verify the signatures from the PUTPATH and the GETPATH.
Verification MAY involve checking all signatures or any random
subset of the signatures. It is RECOMMENDED that peers adapt
their behavior to available computational resources so as to not
make signature verification a bottleneck. If an invalid
signature is found, the path MUST be truncated to only include
the elements following the invalid signature. In particular, any
invalid signature on the GETPATH will cause PUTPATH_L to be set
to 0.
4. The peer also attempts to compute the key using DeriveBlockKey.
This may result in NONE. The result is used later. Note that
even if a key was computed, it does not have to match the
QUERY_HASH.
5. If the BTYPE of the message indicates a HELLO block, the peer
SHOULD be considered for the local routing table by using the
peer address computed from the block using DeriveBlockKey. An
implementation MAY choose to ignore the HELLO, for example
because the routing table or the respective k-bucket is already
full. If the peer is a suitable candidate for insertion, the
local peer MUST try to establish a connection to the peer
indicated in the HELLO block using the address information from
the HELLO block and the Underlay function TRY_CONNECT. The
implementation MUST instruct the Underlay to connect to all
provided addresses using TRY_CONNECT in order to make the
underlay aware of multiple addresses for this connection. When a
connection is established, the signal PEER_CONNECTED will cause
the peer to be added to the respective k-bucket of the routing
table (Section 5).
6. If the QUERY_HASH of this ResultMessage does not match an entry
in the pending table (Section 5.5), then the message is discarded
and processing ends. Otherwise, processing continues for each
entry in the table as follows.
a) If the FindApproximate flag was not set in the query and the
BTYPE allowed the implementation to compute the key from the
block, the computed key must exactly match the QUERY_HASH,
otherwise the result does not match the pending query and
processing continues with the next pending query.
b) If the BTYPE is supported, the result block MUST be validated
against the specific query using the respective
FilterResult function. This function MUST update the
result filter if a result is returned to the originator of
the query.
c) If the BTYPE is not supported, filtering of exact duplicate
replies MUST still be performed before forwarding the reply.
Such duplicate filtering MAY be implemented
probabilistically, for example using a Bloom filter. The
result of this duplicate filtering is always either
FILTER_MORE or FILTER_DUPLICATE.
d) If the RecordRoute flag is set in FLAGS, the local peer
address MUST be appended to the GETPATH of the message and
the respective signature MUST be set using the query origin
as the PEER SUCCESSOR and the response origin as the PEER
PREDECESSOR. If the flag is not set, the GETPATH_L and
PUTPATH_L MUST be set to zero when forwarding the result.
e) If the result filter result is either FILTER_MORE or
FILTER_LAST, the message is forwarded to the origin of the
query as defined in the entry which may either be the local
peer or a remote peer. In case this is a query of the local
peer the result may have to be provided to applications
through the overlay API. Otherwise, the result is forwarded
using SEND(P, ResultMessage') where ResultMessage' is the now
modified message. If the result was FILTER_LAST, the query
is removed from the pending table.
7. Finally, the implementation SHOULD cache ResultMessages in order
to provide already seen replies to future GetMessages. The
implementation MAY choose not to cache any or a limited number of
ResultMessages for reasons such as resource limitations.
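The path truncation rule from step 3 can be sketched as follows. This is a minimal illustration and not the reference behaviour: the PathElement type and the verify callback are hypothetical names, and the sketch assumes that when several signatures are invalid, only the elements after the last invalid one are kept. A verify callback that checks only a random subset can simply return True for the elements it skips.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class PathElement:
    """Hypothetical stand-in for one signed hop of a PUTPATH or GETPATH."""
    peer_id: bytes
    signature: bytes


def truncate_path(path: List[PathElement],
                  verify: Callable[[int, PathElement], bool]) -> List[PathElement]:
    """Keep only the elements that follow the last invalid signature.

    `verify(i, element)` is an assumed callback that returns True when the
    signature of the element at index i checks out (or was not checked).
    """
    last_invalid = -1
    for i, element in enumerate(path):
        if not verify(i, element):
            last_invalid = i
    return path[last_invalid + 1:]
```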
7. Blocks
This section describes considerations that R^5N implementations must
take into account with respect to blocks. Specifically, implementations
SHOULD be able to validate and persist blocks. Implementations MAY
omit validation support for some types of blocks. On some devices,
storing blocks may also be impossible due to lack of storage
capacity.
Applications can and should define their own block types. The block
type determines the format and handling of the block payload by peers
in PutMessages and ResultMessages. Block types MUST be registered
with GANA (see Section 10.1).
7.1. Block Operations
Block validation may be necessary for all types of DHT messages. To
enable these validations, any block type specification MUST define
the following functions:
ValidateBlockQuery(Key, XQuery) -> RequestEvaluationResult is used
to evaluate the request for a block as part of GetMessage
processing. Here, the block payload is unknown, but if possible
the XQuery and Key SHOULD be verified. Possible values for the
RequestEvaluationResult are:
REQUEST_VALID Query is valid.
REQUEST_INVALID Query format does not match block type. For
example, a mandatory XQuery was not provided, or the size of
the XQuery is not appropriate for the block type.
DeriveBlockKey(Block) -> Key | NONE is used to synthesize the block
key from the block payload as part of PutMessage and ResultMessage
processing. The special return value of NONE implies that this
block type does not permit deriving the key from the block. A Key
may be returned for a block that is ill-formed.
ValidateBlockStoreRequest(Block) -> BlockEvaluationResult is used to
evaluate a block payload as part of PutMessage and ResultMessage
processing. Possible values for the BlockEvaluationResult are:
BLOCK_VALID Block is valid.
BLOCK_INVALID Block payload does not match the block type.
SetupResultFilter(FilterSize, Mutator) -> RF is used to set up an
empty result filter. The arguments are the set of results that
must be filtered at the initiator, and a MUTATOR value which MAY
be used to deterministically re-randomize probabilistic data
structures. The specification MUST also include the wire format
for RF.
FilterResult(Block, Key, RF, XQuery) -> (FilterEvaluationResult,
RF') is used to filter results against specific queries. This
function does not check the validity of Block itself or that it
matches the given key, as this must have been checked earlier.
Thus, locally stored blocks from previously observed
ResultMessages and PutMessages use this function to perform
filtering based on the request parameters of a particular GET
operation. Possible values for the FilterEvaluationResult are:
FILTER_MORE Valid result, and there may be more.
FILTER_LAST Last possible valid result.
FILTER_DUPLICATE Valid result, but duplicate (was filtered by the
result filter).
FILTER_IRRELEVANT Block does not satisfy the constraints imposed
by the XQuery.
If the main evaluation result is FILTER_MORE, the function also
returns an updated result filter where the block is added to the
set of filtered replies. An implementation is not expected to
actually differentiate between the FILTER_DUPLICATE and
FILTER_IRRELEVANT return values: in both cases the block is
ignored for this query.
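One way to picture the five block functions is as an abstract interface that every block type fills in. The sketch below is illustrative only: the class and method names are hypothetical Python spellings of the functions above, and the byte-string representation of keys, blocks and result filters is an assumption.

```python
from abc import ABC, abstractmethod
from enum import Enum, auto
from typing import Optional, Tuple


class RequestEvaluationResult(Enum):
    REQUEST_VALID = auto()
    REQUEST_INVALID = auto()


class BlockEvaluationResult(Enum):
    BLOCK_VALID = auto()
    BLOCK_INVALID = auto()


class FilterEvaluationResult(Enum):
    FILTER_MORE = auto()
    FILTER_LAST = auto()
    FILTER_DUPLICATE = auto()
    FILTER_IRRELEVANT = auto()


class BlockType(ABC):
    """One subclass per registered block type (hypothetical layout)."""

    @abstractmethod
    def validate_block_query(self, key: bytes,
                             xquery: bytes) -> RequestEvaluationResult: ...

    @abstractmethod
    def derive_block_key(self, block: bytes) -> Optional[bytes]:
        """Return the key, or None where the specification says NONE."""

    @abstractmethod
    def validate_block_store_request(self, block: bytes) -> BlockEvaluationResult: ...

    @abstractmethod
    def setup_result_filter(self, filter_size: int, mutator: int) -> bytes: ...

    @abstractmethod
    def filter_result(self, block: bytes, key: bytes, rf: bytes,
                      xquery: bytes) -> Tuple[FilterEvaluationResult, bytes]:
        """Return the evaluation result and the (possibly updated) filter."""
```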
7.2. HELLO Blocks
For bootstrapping and peer discovery, the DHT implementation uses its
own block type called "HELLO". HELLO blocks are the only type of
block that MUST be supported by every R^5N implementation. A block
with this block type contains the peer ID of the peer that published
the HELLO together with a set of addresses of this peer. The key of
a HELLO block is the SHA-512 of the peer ID and thus the peer's
address in the DHT.
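As a small illustration of the key derivation just described, the following sketch hashes a peer ID with SHA-512. The function name is hypothetical, and the sketch assumes the peer ID is hashed as its raw 32-byte representation.

```python
import hashlib


def hello_block_key(peer_id: bytes) -> bytes:
    """Return the 512-bit DHT key of a HELLO: SHA-512 over the peer ID."""
    if len(peer_id) != 32:
        raise ValueError("peer ID must be 32 bytes")
    return hashlib.sha512(peer_id).digest()
```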
The HELLO block type wire format is illustrated in Figure 13. A
query for a block of type HELLO MUST NOT include extended query data
(XQuery). Any implementation encountering a request for a HELLO with
non-empty XQuery data MUST consider the request invalid and ignore
it.
0     8     16    24    32    40    48    56
+-----+-----+-----+-----+-----+-----+-----+-----+
|                    PEER-ID                    |
|                   (32 byte)                   |
|                                               |
|                                               |
+-----+-----+-----+-----+-----+-----+-----+-----+
|                   SIGNATURE                   |
|                   (64 byte)                   |
|                                               |
|                                               |
|                                               |
|                                               |
|                                               |
|                                               |
+-----+-----+-----+-----+-----+-----+-----+-----+
|                  EXPIRATION                   |
+-----+-----+-----+-----+-----+-----+-----+-----+
/                   ADDRESSES                   /
/               (variable length)               /
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 13: The HELLO Block Format.
PEER-ID is the Peer-ID of the peer which has generated this HELLO.
EXPIRATION denotes the absolute 64-bit expiration date of the
content. The value specified is microseconds since midnight (0
hour), January 1, 1970, but must be a multiple of one million (so
that it can be represented in seconds in a HELLO URL). Stored in
network byte order.
ADDRESSES is a list of UTF-8 addresses (Section 2) which can be used
to contact the peer. Each address MUST be 0-terminated. The set
of addresses MAY be empty.
SIGNATURE is the signature of the HELLO. It covers an 80-byte pseudo
header derived from the information in the HELLO block. The
pseudo header includes the expiration time, a constant that
uniquely identifies the purpose of the signature, and a hash over
the addresses. The wire format is illustrated in Figure 14.
0     8     16    24    32    40    48    56
+-----+-----+-----+-----+-----+-----+-----+-----+
|         SIZE          |        PURPOSE        |
+-----+-----+-----+-----+-----+-----+-----+-----+
|                  EXPIRATION                   |
+-----+-----+-----+-----+-----+-----+-----+-----+
|                    H_ADDRS                    |
|                   (64 byte)                   |
|                                               |
|                                               |
|                                               |
|                                               |
|                                               |
|                                               |
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 14: The Wire Format of the HELLO for Signing.
SIZE A 32-bit value containing the length of the signed data in
bytes in network byte order. The length of the signed data
MUST be 80 bytes.
PURPOSE A 32-bit signature purpose flag. This field MUST be 7
(in network byte order).
EXPIRATION denotes the absolute 64-bit expiration date of the
HELLO. In microseconds since midnight (0 hour), January 1,
1970 in network byte order.
H_ADDRS a SHA-512 hash over the addresses in the HELLO. H_ADDRS
is generated over the ADDRESSES field as provided in the HELLO
block using SHA-512 [RFC4634].
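The two layouts above can be exercised with a short sketch that splits a HELLO block into its fields and rebuilds the 80 bytes of signed data. The names Hello, parse_hello, split_addresses and hello_signed_payload are hypothetical. The sketch assumes that H_ADDRS is computed over the raw ADDRESSES bytes including the 0-terminators, and it does not verify the signature itself.

```python
import hashlib
import struct
from typing import List, NamedTuple

HELLO_SIGNATURE_PURPOSE = 7   # PURPOSE value stated above
SIGNED_SIZE = 80              # 4 + 4 + 8 + 64 bytes


class Hello(NamedTuple):
    peer_id: bytes        # 32 bytes
    signature: bytes      # 64 bytes
    expiration_us: int    # microseconds since the UNIX epoch
    addresses: bytes      # raw ADDRESSES field (0-terminated UTF-8 strings)


def parse_hello(block: bytes) -> Hello:
    """Split a HELLO block into its fields following Figure 13."""
    if len(block) < 32 + 64 + 8:
        raise ValueError("HELLO block too short")
    (expiration_us,) = struct.unpack(">Q", block[96:104])
    return Hello(block[:32], block[32:96], expiration_us, block[104:])


def split_addresses(addresses: bytes) -> List[str]:
    """Decode the 0-terminated UTF-8 address list; the list may be empty."""
    if addresses and not addresses.endswith(b"\x00"):
        raise ValueError("each address must be 0-terminated")
    return [a.decode("utf-8") for a in addresses.split(b"\x00")[:-1]]


def hello_signed_payload(hello: Hello) -> bytes:
    """Build the pseudo header of Figure 14 that the signature covers."""
    h_addrs = hashlib.sha512(hello.addresses).digest()
    header = struct.pack(">IIQ", SIGNED_SIZE, HELLO_SIGNATURE_PURPOSE,
                         hello.expiration_us)
    return header + h_addrs
```

The resulting byte string would then be handed, together with SIGNATURE and PEER-ID, to an EdDSA verification routine of the implementer's choice.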
The HELLO block functions MUST be implemented as follows:
ValidateBlockQuery(Key, XQuery) -> RequestEvaluationResult To
validate a block query for a HELLO is to simply check that the
XQuery is empty. If it is empty, REQUEST_VALID is returned.
Otherwise, REQUEST_INVALID is returned.
DeriveBlockKey(Block) -> Key | NONE To derive a block key for a
HELLO is to simply hash the peer ID from the HELLO. The result of
this function is always the SHA-512 hash over the PEER-ID.
ValidateBlockStoreRequest(Block) -> BlockEvaluationResult To
validate a block store request is to verify the EdDSA SIGNATURE
over the hashed ADDRESSES against the public key from the peer ID
field. If the signature is valid, BLOCK_VALID is returned.
Otherwise, BLOCK_INVALID is returned.
SetupResultFilter(FilterSize, Mutator) -> RF The RESULT_FILTER for
HELLO blocks is implemented using a Bloom filter following the
definition from Appendix A and consists of a variable number of
buckets L. L depends on the number of connected peers |E| known
to the peer creating a HELLO block from its own addresses: L is
set to the minimum of 2^18 bits (2^15 bytes) and the lowest power
of 2 that is strictly larger than 2*K*|E| bits (K*|E|/4 bytes).
The k-value for the Bloom filter is 16. The elements used in the
Bloom filter consist of an XOR between the H_ADDRS field (as
computed using SHA-512 over the ADDRESSES) and the SHA-512 hash of
the MUTATOR field from a given HELLO block. The mapping function
M(H_ADDRS XOR MUTATOR) is defined as follows:
M(e = H_ADDRS XOR MUTATOR) -> e as uint32[]
M is an identity function and returns the 512-bit XOR result
unmodified. This resulting byte string is interpreted as k=16
32-bit integers in network byte order which are used to set and
check the bucket bits in B using BF-SET and BF-TEST. The 32-bit
Mutator is prepended to the L-bit Bloom filter bucket field
HELLO_BF containing B to create the result filter for a HELLO
block:
0     8     16    24    32    40    48    56
+-----+-----+-----+-----+-----+-----+-----+-----+
|        MUTATOR        |  HELLO_BF             /
+-----+-----+-----+-----+  (variable length)    /
/                                               /
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 15: The HELLO Block Result Filter.
where:
MUTATOR The 32-bit mutator for the result filter.
HELLO_BF The L-bit Bloom filter buckets byte array.
The MUTATOR value is used to additionally "randomize" the
computation of the Bloom filter while remaining deterministic
across peers. It is only ever set by the peer initiating the GET
request, and changed every time the GET request is repeated.
Peers forwarding GET requests MUST NOT change the mutator value
included in the RESULT_FILTER as they might not be able to
recalculate the result filter with a different MUTATOR value.
Consequently, repeated requests have statistically independent
probabilities of creating false-positives in a result filter.
Thus, even if for one request a result filter may exclude a result
as a false-positive match, subsequent requests are likely to not
have the same false-positives.
HELLO result filters can be merged if the Bloom filters have the
same size and MUTATOR by setting all bits to 1 that are set in
either Bloom filter. This is done whenever a peer receives a
query with the same MUTATOR, predecessor and Bloom filter size.
FilterResult(Block, Key, RF, XQuery) -> (FilterEvaluationResult,
RF') The H_ADDRS field is XORed with the SHA-512 hash of the MUTATOR
field from the HELLO block and the resulting value is checked
against the Bloom filter in RF. Consequently, HELLOs with
completely identical sets of addresses will be filtered and
FILTER_DUPLICATE is returned. Any small variation in the set of
addresses will cause the block to no longer be filtered (with high
probability) and FILTER_MORE is returned.
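The sizing, element mapping and merge rules of the HELLO result filter can be sketched as below. This is a hedged illustration: the function names are hypothetical, the K in the size formula is assumed to be the k-value of 16, and the MUTATOR is assumed to be hashed as its 4-byte network byte order encoding. Setting and testing the resulting bucket offsets is left to BF-SET and BF-TEST.

```python
import hashlib
import struct
from typing import List

K = 16                 # k-value of the HELLO Bloom filter
MAX_BITS = 1 << 18     # upper bound of 2^18 bits


def hello_filter_bits(connected_peers: int) -> int:
    """L: lowest power of two strictly larger than 2*K*|E|, capped at 2^18."""
    threshold = 2 * K * connected_peers
    return min(MAX_BITS, 1 << threshold.bit_length())


def hello_filter_indexes(h_addrs: bytes, mutator: int, l_bits: int) -> List[int]:
    """Map H_ADDRS XOR SHA-512(MUTATOR) to k=16 bucket offsets (n mod L)."""
    if len(h_addrs) != 64:
        raise ValueError("H_ADDRS must be a 64-byte SHA-512 digest")
    # Assumption: the mutator is hashed as 4 bytes in network byte order.
    h_mut = hashlib.sha512(struct.pack(">I", mutator)).digest()
    element = bytes(a ^ b for a, b in zip(h_addrs, h_mut))
    return [n % l_bits for n in struct.unpack(">16I", element)]


def merge_filters(bf_a: bytes, bf_b: bytes) -> bytes:
    """Merge two filters of equal size and MUTATOR by OR-ing their bits."""
    if len(bf_a) != len(bf_b):
        raise ValueError("filters must have the same size")
    return bytes(a | b for a, b in zip(bf_a, bf_b))
```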
7.3. Persistence
An implementation SHOULD provide a local persistence mechanism for
blocks. Embedded systems that lack storage capability MAY use
connection-level signalling to indicate that they are merely a client
utilizing a DHT and are not able to participate with storage. The
local storage MUST provide the following functionality:
Store(Key, Block) Stores a block under the specified key. If a
block with identical payload already exists under the same key,
the metadata should be set to the maximum expiration time of both
blocks and use the corresponding PUTPATH (and if applicable
TRUNCATED ORIGIN) of that version of the block.
Lookup(Key) -> List of Blocks Retrieves blocks stored under the
specified key.
LookupApproximate(Key) -> List of Blocks Retrieves the blocks stored
under the specified key and any blocks under keys close to the
specified key, in order of decreasing proximity.
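A minimal in-memory sketch of the Store and Lookup functions is given below. The StoredBlock and BlockStore names are hypothetical, the PUTPATH is kept as an opaque tuple, and LookupApproximate is left out. The sketch reads the duplicate rule as "keep the version with the later expiration time together with its path metadata".

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StoredBlock:
    payload: bytes
    expiration_us: int
    put_path: tuple = ()   # opaque here; path metadata of the kept version


class BlockStore:
    """In-memory stand-in for the persistence layer (illustrative only)."""

    def __init__(self) -> None:
        self._blocks: Dict[bytes, List[StoredBlock]] = {}

    def store(self, key: bytes, block: StoredBlock) -> None:
        entries = self._blocks.setdefault(key, [])
        for i, existing in enumerate(entries):
            if existing.payload == block.payload:
                # Same payload under the same key: keep the version with
                # the later expiration, together with its PUTPATH.
                if block.expiration_us > existing.expiration_us:
                    entries[i] = block
                return
        entries.append(block)

    def lookup(self, key: bytes) -> List[StoredBlock]:
        return list(self._blocks.get(key, []))
```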
7.3.1. Approximate Search Considerations
Over time a peer may accumulate a significant number of blocks which
are stored locally in the persistence layer. Due to the expected
high number of blocks, the method to retrieve blocks close to the
specified lookup key in the LookupApproximate API must be implemented
with care with respect to efficiency.
It is RECOMMENDED to limit the number of results from the
LookupApproximate procedure to a result size which is easily
manageable by the local system.
In order to efficiently find a suitable result set, the
implementation SHOULD follow the following procedure:
1. Sort all blocks by the block key in ascending (descending) order.
The block keys are interpreted as integers.
2. Alternatingly select a block with a key larger and smaller from
the sortings. The resulting set is sorted by XOR distance. The
selection process continues until the upper bound for the result
set is reached and both sortings do not yield any closer blocks.
An implementation MAY decide to use a custom algorithm in order to
find the closest blocks in the local storage. But, especially for
more primitive approaches, such as only comparing XOR distances for
all blocks in the storage, the procedure may become inefficient for
large storages.
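A simplified variant of the procedure above might look as follows. It is a heuristic sketch and not the normative algorithm: keys are plain Python integers, the function name is hypothetical, and the candidate window of 2*limit entries is an assumption made to bound the work per query. Because neighbours in integer order are not always the closest keys by XOR distance, only the collected candidates are ranked, so the result is approximate.

```python
import bisect
from typing import List


def lookup_approximate(sorted_keys: List[int], target: int, limit: int) -> List[int]:
    """Return up to `limit` stored keys, ordered by XOR distance to target.

    `sorted_keys` holds the block keys as integers in ascending order.
    Candidates are taken alternately from above and below the position
    of the target in that ordering.
    """
    hi = bisect.bisect_left(sorted_keys, target)   # first key >= target
    lo = hi - 1                                     # last key < target
    candidates: List[int] = []
    while len(candidates) < 2 * limit and (lo >= 0 or hi < len(sorted_keys)):
        if hi < len(sorted_keys):
            candidates.append(sorted_keys[hi])
            hi += 1
        if lo >= 0:
            candidates.append(sorted_keys[lo])
            lo -= 1
    candidates.sort(key=lambda k: k ^ target)
    return candidates[:limit]
```

An exact match, if stored, has distance zero and is therefore always returned first.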
7.3.2. Caching Strategy Considerations
An implementation MUST implement an eviction strategy for blocks
stored in the block storage layer.
In order to ensure the freshness of blocks, an implementation MUST
evict expired blocks in favor of new blocks.
An implementation MAY preserve blocks which are often requested.
This approach can be expensive as it requires the implementation to
keep track of how often a block is requested.
An implementation MAY preserve blocks which are close to the local
peer ID.
An implementation MAY provide configurable storage quotas and adapt
its eviction strategy based on the current storage size or other
constrained resources.
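One possible eviction policy that combines these considerations is sketched below. The cache layout and the function name are hypothetical. The sketch assumes that "close to the local peer ID" means XOR closeness between the block key and the local peer's DHT address, and it ignores request frequency.

```python
from typing import Dict, Optional, Tuple

# key -> (expiration in microseconds, payload); a hypothetical cache layout
Cache = Dict[bytes, Tuple[int, bytes]]


def pick_eviction_victim(cache: Cache, local_key: bytes,
                         now_us: int) -> Optional[bytes]:
    """Choose the key to evict when room is needed for a new block."""
    if not cache:
        return None
    expired = [k for k, (exp, _) in cache.items() if exp <= now_us]
    if expired:
        # Expired blocks always go first, oldest expiration first.
        return min(expired, key=lambda k: cache[k][0])
    # Otherwise drop the block farthest (by XOR) from the local peer's key.
    local = int.from_bytes(local_key, "big")
    return max(cache, key=lambda k: int.from_bytes(k, "big") ^ local)
```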
8. Security Considerations
If an upper bound to the maximum number of neighbors in a k-bucket is
reached, the implementation MUST prefer to preserve the oldest
working connections instead of new connections. This makes Sybil
attacks less effective as an adversary would have to invest more
resources over time to mount an effective attack.
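The preference for long-lived connections can be illustrated with a bucket that never displaces a working peer. The KBucket class below is a hypothetical sketch; the capacity value and the method names are assumptions, not part of the protocol.

```python
from collections import OrderedDict


class KBucket:
    """Bucket that never displaces an existing, working connection."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        # peer ID -> connect time, in order of connection establishment
        self.peers: "OrderedDict[bytes, float]" = OrderedDict()

    def on_connected(self, peer_id: bytes, now: float) -> bool:
        """Add a peer if there is room; return False when it is rejected."""
        if peer_id in self.peers:
            return True
        if len(self.peers) >= self.capacity:
            return False            # full: older connections are preserved
        self.peers[peer_id] = now
        return True

    def on_disconnected(self, peer_id: bytes) -> None:
        """A failed connection frees its slot for a newcomer."""
        self.peers.pop(peer_id, None)
```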
The ComputeOutDegree function limits the REPL_LVL to a maximum of 16.
This imposes an upper limit on bandwidth amplification an attacker
may achieve for a given network size and topology.
8.1. Approximate Result Filtering
When a FindApproximate request is encountered, a peer will try to
respond with the closest block it has that is not filtered by the
result Bloom filter. Implementations MUST ensure that the cost of
evaluating any such query is reasonably small. For example,
implementations MAY avoid an exhaustive search of their
database. Not doing so can lead to denial of service attacks as
there could be cases where too many local results are filtered by the
result filter.
8.2. Access control
By design R^5N does not rely on strict admission control through the
use of either centralized enrollment servers or pre-shared keys.
This is a key distinction from protocols that do rely on this kind of
access control such as [RFC6940] which, like R^5N, provides a peer-
to-peer (P2P) signaling protocol with extensible routing and topology
mechanisms. Some decentralized applications such as the GNU Name
System ([I-D.schanzen-gns]) require a more open system that enables
ad-hoc participation and other means to prevent common attacks on P2P
overlays. If it relied on such admission control, GNS, for example,
would be in conflict with its goals of
providing a solution to the issues of a "Single Hierarchy with a
Centrally Controlled Root" and "Distribution and Management of Root
Servers" in DNS as raised in [RFC8324].
9. IANA Considerations
IANA maintains a registry called the "Uniform Resource Identifier
(URI) Schemes" registry.
9.1. GNUnet URI Scheme Registration
IANA maintains the "Uniform Resource Identifier (URI) Schemes"
registry. The registry should be updated to include an entry for the
'gnunet' URI scheme. IANA is requested to update that entry to
reference this document when published as an RFC.
9.2. R5N URI Scheme Registration
IANA maintains the "Uniform Resource Identifier (URI) Schemes"
registry. The registry should be updated to include an entry for the
'r5n+udp+ip' URI scheme. IANA is requested to update that entry to
reference this document when published as an RFC.
10. GANA Considerations
10.1. Block Type Registry
GANA [GANA] is requested to create a "DHT Block Types" registry. The
registry shall record for each entry:
* Name: The name of the block type (case-insensitive ASCII string,
restricted to alphanumeric characters)
* Number: 32-bit
* Comment: Optionally, a brief English text describing the purpose
of the block type (in UTF-8)
* Contact: Optionally, the contact information of a person to
contact for further information
* References: Required, references (such as an RFC) specifying the
block type and its block functions
The registration policy for this sub-registry is "First Come First
Served", as described in [RFC8126]. GANA created the registry as
follows:
Number| Name           | References | Description
------+----------------+------------+-------------------------
0       ANY              [This.I-D]   Reserved
13      DHT_HELLO        [This.I-D]   Address data for a peer
Contact: r5n-registry@gnunet.org
Figure 16: The Block Type Registry.
10.2. GNUnet URI schema Subregistry
GANA [GANA] is requested to create a "gnunet://" sub-registry. The
registry shall record for each entry:
* Name: The name of the subsystem (case-insensitive ASCII string,
restricted to alphanumeric characters)
* Comment: Optionally, a brief English text describing the purpose
of the subsystem (in UTF-8)
* Contact: Optionally, the contact information of a person to
contact for further information
* References: Optionally, references describing the syntax of the
URL (such as an RFC or LSD)
The registration policy for this sub-registry is "First Come First
Served", as described in [RFC8126]. GANA created this registry as
follows:
Name           | References | Description
---------------+------------+-------------------------
HELLO            [This.I-D]   How to contact a peer.
ADDRESS          N/A          Network address.
Contact: gnunet-registry@gnunet.org
Figure 17: GNUnet scheme Subregistry.
10.3. GNUnet Signature Purpose Registry
GANA amended the "GNUnet Signature Purpose" registry as follows:
Purpose | Name            | References | Description
--------+-----------------+------------+---------------
6         DHT PATH Element  [This.I-D]   DHT message routing data
7         HELLO Payload     [This.I-D]   Peer contact information
Figure 18: The Signature Purpose Registry Entries.
10.4. GNUnet Message Type Registry
GANA is requested to amend the "GNUnet Message Type" registry as
follows:
Type    | Name            | References | Description
--------+-----------------+------------+---------------
146       DHT PUT           [This.I-D]   Store information in DHT
147       DHT GET           [This.I-D]   Request information from DHT
148       DHT RESULT        [This.I-D]   Return information from DHT
157       HELLO Message     [This.I-D]   Peer contact information
Figure 19: The Message Type Registry Entries.
11. Test Vectors
12. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO
10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
2003, <https://www.rfc-editor.org/info/rfc3629>.
[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
Resource Identifier (URI): Generic Syntax", STD 66,
RFC 3986, DOI 10.17487/RFC3986, January 2005,
<https://www.rfc-editor.org/info/rfc3986>.
[RFC4634] Eastlake 3rd, D. and T. Hansen, "US Secure Hash Algorithms
(SHA and HMAC-SHA)", RFC 4634, DOI 10.17487/RFC4634, July
2006, <https://www.rfc-editor.org/info/rfc4634>.
[RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", STD 68, RFC 5234,
DOI 10.17487/RFC5234, January 2008,
<https://www.rfc-editor.org/info/rfc5234>.
[RFC6940] Jennings, C., Lowekamp, B., Ed., Rescorla, E., Baset, S.,
and H. Schulzrinne, "REsource LOcation And Discovery
(RELOAD) Base Protocol", RFC 6940, DOI 10.17487/RFC6940,
January 2014, <https://www.rfc-editor.org/info/rfc6940>.
[RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for
Writing an IANA Considerations Section in RFCs", BCP 26,
RFC 8126, DOI 10.17487/RFC8126, June 2017,
<https://www.rfc-editor.org/info/rfc8126>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8324] Klensin, J. and RFC Publisher, "DNS Privacy,
Authorization, Special Uses, Encoding, Characters,
Matching, and Root Structure: Time for Another Look?",
RFC 8324, DOI 10.17487/RFC8324, February 2018,
<https://www.rfc-editor.org/info/rfc8324>.
[I-D.schanzen-gns]
Schanzenbach, M., Grothoff, C., and B. Fix, "The GNU Name
System", Work in Progress, Internet-Draft, draft-schanzen-
gns-21, 7 August 2022, <https://www.ietf.org/archive/id/
draft-schanzen-gns-21.txt>.
[ed25519] Bernstein, D., Duif, N., Lange, T., Schwabe, P., and B.
Yang, "High-Speed High-Security Signatures", 2011,
<http://link.springer.com/
chapter/10.1007/978-3-642-23951-9_9>.
[GANA] GNUnet e.V., "GNUnet Assigned Numbers Authority (GANA)",
April 2020, <https://gana.gnunet.org/>.
13. Informative References
[R5N] Evans, N. S. and C. Grothoff, "R5N: Randomized recursive
routing for restricted-route networks", 2011,
<https://doi.org/10.1109/ICNSS.2011.6060022>.
[Kademlia] Maymounkov, P. and D. Mazieres, "Kademlia: A peer-to-peer
information system based on the xor metric.", 2002,
<http://css.csail.mit.edu/6.824/2014/papers/kademlia.pdf>.
[cadet] Polot, B. and C. Grothoff, "CADET: Confidential ad-hoc
decentralized end-to-end transport", 2014,
<https://doi.org/10.1109/MedHocNet.2014.6849107>.
Appendix A. Bloom filters in R^5N
R^5N uses Bloom filters in several places. This section gives some
general background on Bloom filters and defines functions on this
data structure shared by the various use-cases in R^5N.
A Bloom filter (BF) is a space-efficient probabilistic data structure
to test if an element is part of a set of elements. Elements are
identified by an element ID. Since a BF is a probabilistic
data structure, it is possible to have false-positives: when asked if
an element is in the set, the answer from a BF is either "no" or
"maybe".
Bloom filters are defined as a string of L bits called "buckets".
The buckets are initially always empty, meaning that the bits are set
to zero. There are two functions which can be invoked on the Bloom
filter "bf": BF-SET(bf, e) and BF-TEST(bf, e) where "e" is an element
that is to be added to the Bloom filter or queried against the set.
A mapping function M is used to map each ID of each element from the
set to a subset of k buckets. In the original proposal by Bloom, M
is non-injective and can thus map the same element multiple times to
the same bucket. The type of the mapping function can thus be
described by the following mathematical notation:
------------------------------------
# M: E->B^k
------------------------------------
# L = Number of buckets
# B = 0,1,2,3,4,...L-1 (the buckets)
# k = Number of buckets per element
# E = Set of elements
------------------------------------
Example: L=256, k=3
M('element-data') = {4,6,255}
Figure 20: Bloom filter mapping function.
When adding an element to the Bloom filter bf using BF-SET(bf,e),
each integer n of the mapping M(e) is interpreted as a bit offset n
mod L within bf and set to 1.
When testing if an element may be in the Bloom filter bf using BF-
TEST(bf,e), each bit offset n mod L within bf MUST have been set to
1. Otherwise, the element is not considered to be in the Bloom
filter.
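BF-SET and BF-TEST as described above can be sketched as a small class that operates on an already computed mapping M(e). The class name is hypothetical, and the placement of bits within bytes (least significant bit first) is an assumption of this sketch rather than something this appendix specifies.

```python
from typing import Iterable


class BloomFilter:
    """L-bit Bloom filter with BF-SET and BF-TEST over precomputed mappings."""

    def __init__(self, l_bits: int) -> None:
        self.l_bits = l_bits
        self.buckets = bytearray((l_bits + 7) // 8)   # all bits start at zero

    def bf_set(self, mapping: Iterable[int]) -> None:
        """BF-SET: set bit (n mod L) for every integer n in M(e)."""
        for n in mapping:
            offset = n % self.l_bits
            # Assumption: bit i lives in byte i // 8, least significant first.
            self.buckets[offset // 8] |= 1 << (offset % 8)

    def bf_test(self, mapping: Iterable[int]) -> bool:
        """BF-TEST: "maybe" (True) only if every mapped bit is set."""
        return all(
            self.buckets[(n % self.l_bits) // 8] & (1 << ((n % self.l_bits) % 8))
            for n in mapping
        )
```

With L=256 and the example mapping {4,6,255} from Figure 20, a test before insertion answers "no" and a test after insertion answers "maybe".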
Appendix B. Overlay Operations
An implementation of this specification commonly exposes the two
overlay operations "GET" and "PUT". The following are non-normative
examples of APIs for those operations. Their behaviour is described
prosaically in order to give implementers a fuller picture of the
protocol.
B.1. GET operation
A basic GET operation interface may be exposed as:
GET(Query-Key, Block-Type) -> Results as List
The procedure typically takes at least two arguments to initiate a
lookup:
Query-Key: is the 512-bit key to look for in the DHT.
Block-Type: is the type of block to look for, possibly "any".
The GET procedure may allow a set of optional parameters in order to
control or modify the query:
Replication-Level: is an integer which controls how many nearest
peers the request should reach.
Flags: is a 16-bit vector which indicates certain processing
requirements for messages. Any combination of flags as defined in
Section 6.1.2 may be specified.
eXtended-Query (XQuery): is metadata which may be required
depending on the respective Block-Type. A Block-Type must define
if the XQuery can or must be used and what the specific format of
its contents should be. Extended queries are in general used to
implement domain-specific filters. These might be particularly
useful in combination with FindApproximate to add a well-defined
filter by an application-specific distance. Regardless, the DHT
does not define any particular semantics for an XQuery. See also
Section 7.
Result-Filter: is data for a Block-type-specific filter which allows
applications to indicate results which are not relevant anymore to
the caller (see Section 6.4.2).
The GET procedure should be implemented as an asynchronous operation
that returns individual results as they are found in the DHT. It
should terminate only once the application explicitly cancels the
operation. A single result commonly consists of:
Block-Type: is the desired type of block in the result.
Block-Data: is the application-specific block payload. Contents are
specific to the Block-Type.
Block-Expiration: is the expiration time of the block. After this
time, the result should no longer be used.
Key: is the key under which the block was stored. This may be
different from the key that was queried if the flag
FindApproximate was set.
GET-Path: is a signed path of the IDs of peers which the query
traversed through the network. The DHT will try to make the path
available if the RecordRoute flag was set by the application
calling the PUT procedure. The reported path may have been
silently truncated from the beginning.
PUT-Path: is a signed path of the IDs of peers which the result
message traversed. The DHT will try to make the path available if
the RecordRoute flag was set for the GET procedure. The reported
path may have been silently truncated from the beginning. As the
block was cached by the node at the end of this path, this path is
more likely to be stale compared to the GET-Path.
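As a non-normative illustration of the asynchronous behaviour described above, a GET could hand each result to an application callback until the application cancels. The GetResult and GetHandle names and the callback style are hypothetical choices for this sketch. Paths are shown as plain lists of peer IDs without signatures.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class GetResult:
    block_type: int
    block_data: bytes
    block_expiration_us: int
    key: bytes                       # may differ from the query key
    get_path: List[bytes] = field(default_factory=list)
    put_path: List[bytes] = field(default_factory=list)


class GetHandle:
    """Handle for a running GET; results arrive until cancel() is called."""

    def __init__(self, on_result: Callable[[GetResult], None]) -> None:
        self._on_result: Optional[Callable[[GetResult], None]] = on_result

    def deliver(self, result: GetResult) -> None:
        """Called by the DHT layer for each result that passes filtering."""
        if self._on_result is not None:
            self._on_result(result)

    def cancel(self) -> None:
        """Stop the operation; later results are dropped."""
        self._on_result = None
```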
B.2. PUT operation
A PUT operation interface may be exposed as:
PUT(Key, Block-Type, Block-Expiration, Block-Data)
The procedure typically takes at least four parameters:
Key: is the key under which to store the block.
Block-Type: is the type of the block to store.
Block-Expiration: specifies when the block should expire.
Block-Data: is the application-specific payload of the block to
store.
The PUT procedure may allow a set of optional parameters in order to
control or modify the query:
Replication-Level: is an integer which controls how many nearest
peers the request should reach.
Flags: is a bit-vector which indicates certain processing
requirements for messages. Any combination of flags as defined in
Section 6.1.2 may be specified.
The PUT procedure does not necessarily yield any information.
Appendix C. HELLO URLs
The general format of a HELLO URL uses "gnunet://" as the scheme,
followed by "hello/" for the name of the GNUnet subsystem, followed
by "/"-separated values with the GNS Base32 encoding
([I-D.schanzen-gns]) of the Peer ID, a Base32-encoded EdDSA
signature, and an expiration time in seconds since the UNIX Epoch in
decimal format. After this a "?" begins a list of key-value pairs
where the key is the URI scheme of one of the peer's addresses and
the value is the URL-escaped payload of the address URI without the
"://".
For example, consider the following URL:
gnunet://hello/RH1M20EPK834M6MHZ72\
G3CMBSF3ECKNY4W0T9VAQP9Z7SZEM6Y3G/\
NGRTAH6RA04X467CGCH7M7CEXR5F9CV5HT\
ZFK0G9BWETY3CCE2QWGVT4WA7JN5M9HMWG\
60A00R71F1PJP8N5628EKGHHBAGA7M8JW3\
0/1647134480?udp=127.0.0.1%3A2086
FIXME: signature is invalid, should
maybe generate proper test vector.
Figure 21
It specifies that the peer with the ID "RH1M...6Y3G" is reachable via
"udp" at 127.0.0.1 on port 2086 until 1647134480 seconds after the
Epoch. Note that "udp" here is underspecified and just used as a
simple example. In practice, the key (addr-name) refers to a scheme
supported by a DHT Underlay.
The general syntax of HELLO URLs specified using Augmented Backus-
Naur Form (ABNF) of [RFC5234] is:
hello-URL = "gnunet://hello/" meta [ "?" addrs ]
meta = pid "/" sig "/" exp
pid = *bchar
sig = *bchar
exp = *DIGIT
addrs = addr *( "&" addr )
addr = addr-name "=" addr-value
addr-name = scheme
addr-value = *pchar
bchar = ALPHA / DIGIT
Figure 22
'scheme' is defined in [RFC3986] in Section 3.1. 'pchar' is defined
in [RFC3986], Appendix A.
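A parser for this syntax might be sketched as follows. The HelloURL type and the parse_hello_url function are hypothetical names. The sketch leaves the Base32 fields undecoded, does not verify the signature, and reassembles each address by joining the key and the unescaped value with "://".

```python
from typing import List, NamedTuple
from urllib.parse import unquote

PREFIX = "gnunet://hello/"


class HelloURL(NamedTuple):
    pid: str              # Base32 text of the Peer ID (not decoded here)
    sig: str              # Base32 text of the signature (not decoded here)
    expiration_s: int     # seconds since the UNIX Epoch
    addresses: List[str]  # reassembled address URIs


def parse_hello_url(url: str) -> HelloURL:
    if not url.startswith(PREFIX):
        raise ValueError("not a HELLO URL")
    meta, _, query = url[len(PREFIX):].partition("?")
    parts = meta.split("/")
    if len(parts) != 3 or not (parts[2].isascii() and parts[2].isdigit()):
        raise ValueError("expected pid/sig/exp")
    pid, sig, exp = parts
    addresses = []
    for pair in filter(None, query.split("&")):
        name, sep, value = pair.partition("=")
        if not sep:
            raise ValueError("address must be addr-name=addr-value")
        # The value is the URL-escaped address payload without the "://".
        addresses.append(f"{name}://{unquote(value)}")
    return HelloURL(pid, sig, int(exp), addresses)
```

Multiplying expiration_s by one million yields the microsecond EXPIRATION value used in the HELLO block.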
Authors' Addresses
Martin Schanzenbach
Fraunhofer AISEC
Lichtenbergstrasse 11
85748 Garching
Germany
Email: martin.schanzenbach@aisec.fraunhofer.de
Christian Grothoff
Berner Fachhochschule
Hoeheweg 80
CH-2501 Biel/Bienne
Switzerland
Email: grothoff@gnunet.org
Bernd Fix
GNUnet e.V.
Boltzmannstrasse 3
85748 Garching
Germany
Email: fix@gnunet.org