Peer-to-peer: how does it work?

Napster, Gnutella, Kazaa, BitTorrent: we have been sharing files with tools like these on the net for a while. Did you know that they rely on peer-to-peer technology?
In a network organized as a peer-to-peer infrastructure, users share resources through direct exchange between computers, which are called “nodes”. The data is distributed among the nodes instead of being sent to central servers for processing. Unlike in the client-server model, each node plays a symmetric and autonomous role in delivering the expected service to the end user.
Peer-to-peer systems have a number of characteristics:
  • Symmetric, distributed & decentralized: all the nodes play a similar role, acting as both client and server. They fetch, distribute and process content.
  • Dynamic participants: peer-to-peer systems must be resilient to nodes joining and leaving, whereas a centralized system expects its servers to remain up at all times.
  • Resource localization: one of the key challenges in peer-to-peer filesystems is to find the peer hosting the requested data. A well-known technique is to use a Distributed Hash Table (DHT), such as "Kademlia", which is itself distributed among the peers. A lookup in such a table typically contacts a number of nodes that grows only logarithmically with the size of the network, so it stays efficient and scalable even with a large number of nodes and resources.
  • Rebalancing and replication: as nodes become overloaded or leave the network, the peer-to-peer system must ensure that services remain accessible, available and performant, and that data remains persistent.
  • Scalability and security: in a peer-to-peer network there could be millions of nodes. Skype at its peak had over 300M users. Such networks need robust security mechanisms that do not degrade as the size of the network grows.
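To make the resource localization idea more concrete, here is a minimal Python sketch of the distance metric Kademlia is built on: node identifiers and content keys live in the same 160-bit space, and the "distance" between two of them is their bitwise XOR. The peer names, the file name and the use of SHA-1 to derive identifiers are assumptions chosen for illustration. The sketch sorts a locally known list of peers, whereas a real DHT reaches the closest peers through an iterative lookup across the network.

```python
import hashlib


def node_id(name: str) -> int:
    """Derive a 160-bit identifier from a name (SHA-1 is an assumption here)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: the bitwise XOR of two identifiers."""
    return a ^ b


def closest_peers(key: int, peers: dict[str, int], k: int = 3) -> list[str]:
    """Return the names of the k known peers whose IDs are closest to the key."""
    return sorted(peers, key=lambda name: xor_distance(peers[name], key))[:k]


# Hypothetical peers and content key, for illustration only.
peers = {name: node_id(name) for name in ("alice", "bob", "carol", "dave", "erin")}
key = node_id("holiday-photos.zip")

# The peers "responsible" for the key are those with the smallest XOR distance.
responsible = closest_peers(key, peers, k=2)
print(responsible)
```

Because the XOR distance is symmetric and is zero only between identical identifiers, every node ranks peers for a given key in the same way, which is what lets independent nodes agree on where a piece of data should live.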
Hive's peer-to-peer storage relies on these core principles. It is based on the open source IPFS protocol for the core filesystem layer, on top of which we have built additional services and features to provide:
  • End-to-end encryption: no private data leaves the end user's device in clear form. The encryption model enables the sharing of data across multiple participants without replicating the content or sharing keys.
  • Proof of storage: participants in the Hive network who store others' data are incentivized to do so for as long as they continue to provide proof of storage.
  • Location awareness: Hive's peer-to-peer placement algorithms take into account the user's privacy requirements and preferred locations for both data storage and processing.
  • Error correction: when nodes suddenly go offline, the data they hold is no longer available. When such an event occurs, Hive recreates the lost data and distributes it to other nodes to ensure the durability of the stored files.
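The error correction principle can be illustrated with the simplest possible redundancy scheme, a single XOR parity shard. This is a sketch under stated assumptions, not a description of Hive's actual algorithm, which the text above does not specify: the function names are hypothetical, and production systems generally use stronger erasure codes that tolerate several simultaneous losses.

```python
from functools import reduce
from typing import Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes, n: int) -> list[bytes]:
    """Split data into n equal-size shards (zero-padded) plus one XOR parity shard."""
    size = -(-len(data) // n)  # ceiling division
    padded = data.ljust(size * n, b"\0")
    shards = [padded[i * size:(i + 1) * size] for i in range(n)]
    shards.append(reduce(xor_bytes, shards))  # parity = XOR of all data shards
    return shards


def recover(shards: list[Optional[bytes]]) -> list[bytes]:
    """Rebuild at most one missing shard (marked None) by XOR-ing the survivors."""
    shards = list(shards)  # work on a copy
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can only repair a single lost shard")
    if missing:
        survivors = [s for s in shards if s is not None]
        shards[missing[0]] = reduce(xor_bytes, survivors)
    return shards


original = b"peer-to-peer storage"
shards = split_with_parity(original, n=4)  # 4 data shards + 1 parity shard

shards[2] = None  # simulate the node holding shard 2 going offline
repaired = recover(shards)

restored = b"".join(repaired[:4])[:len(original)]
print(restored == original)  # prints True
```

Each shard would be placed on a different node. When one node disappears, the surviving shards are enough to recompute the missing one, which can then be handed to a new node.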
One may wonder why peer-to-peer technologies, which have been mature since the mid-2000s, are not more widely used. In fact, they are already omnipresent in gaming, the crypto world, and content distribution.
But only recently has the technology landscape evolved to the point where the stars are aligned for peer-to-peer to reach its full potential:
1. Fiber is now more common than DSL in many countries, which has brought symmetry between upload and download speeds.
2. Data is now produced at the edge more than ever; IoT devices now outnumber non-IoT devices.
3. There is more computing power and storage capacity than ever in edge devices, whose number is growing into the billions.
In the years to come, it is inevitable that computation and storage will move away from centralized, distant servers to distributed systems closer to the end users. The amount of data produced and stored on the Internet is massive and growing by approximately 20% every year. The world's storage capacity is expected to reach 13 ZB by 2024, vs. 6.8 ZB today. As an alternative to huge data centers for storing all this data, Hive's peer-to-peer storage system will rely on the substantial spare capacity sitting in our personal devices at the edge of the network.