-
Notifications
You must be signed in to change notification settings - Fork 173
What are Distributed Hash Tables ?
OpenDHT implements a Distributed Hash Table (DHT) of the Kademlia kind. A DHT can be viewed as a dictionary distributed over a network: it provides access to a common shared key->value data-store, distributed over participating nodes with great performance and scalability.
From a user perspective, a DHT essentially provides a map interface, with two main operations: put(key, value)
and get(key)
. Get will retrieve values stored at a certain key while put (often called announce) will store a value on the network. Note that many values can be stored under the same key.
The Kademlia DHT algorithm allows to contact only O(log(n))
nodes for a get operation, n being the number of nodes in the network. This property makes DHTs very scalable as demonstrated, for instance, by the mainline BitTorrent DHT running with tens of millions of nodes.
Every node in the network has a random ID (of 160 bits for OpenDHT). Every value stored on the network has a randomly distributed ID of the same length.
Kademlia DHTs define XOR as the distance between keys or between keys and node IDs, and use an iterative search algorithm working as follow.
Every node maintains a routing table of known node IDs and their IP.
To find the nodes globally responsible for the storage of key H, a node S will contact the nodes in his table that are the closest to H according to the XOR metric. Requests include the key H and answers include values for H (if any) and IDs and IPs of new nodes closest to H. Those new closest nodes are then contacted until there is no new closest node.
The same iterative process is used when a new node enter the network : the node performs the algorithm with its own node ID to find neighbors. By doing so, contacted neighbors will then also be aware of the new node.