Lesser-Known Facts About BitTorrent Architecture
I will talk about lesser-known facts and internals of the BitTorrent architecture, with an overview of peers, seeders, leechers, the choke algorithm, free riders, and much more.
Overview
BitTorrent is a peer-to-peer (P2P) file sharing protocol for distributing large files over the Internet. It is designed to be scalable, efficient, and resilient, allowing millions of users to share files with one another without depending on a central server.
The bottleneck of a central server's bandwidth and availability is eliminated by relying on the distributed network of computers (peers) that make up the P2P network.
Several components make up the architecture:
- Peers: A peer is a node in the system that can act as a seeder or a leecher. Any internet-connected computer running a BitTorrent client such as uTorrent can act as a peer.
- Seeders: Peers that hold a complete copy of the file and upload its pieces to other peers in the network.
- Leechers: Peers that are still downloading pieces of the file and do not yet have the complete file.
- Trackers: A tracker is a central entity that holds information about all of the peers participating in a torrent.
Lifecycle of a Torrent File
As long as there is at least one seeder sharing the pieces in the network, the torrent remains active. However, if all the nodes capable of sharing the pieces leave the network, the download process comes to a halt and the torrent dies. It only resumes when a new seeder appears.
Components of a Torrent File
Any “.torrent” file holds static information about the content such as:
1. Announce: The tracker URL.
2. Created by: The name and version of the program that created the torrent file.
3. Creation date: The date and time the torrent file was created.
4. Encoding: The character encoding of the strings that the ‘info’ dictionary contains.
5. Info: A dictionary that describes the files of the torrent. It also holds the SHA-1 hash of every piece of the shared content and the piece length, which is the number of bytes in each piece.
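To make the role of the per-piece SHA-1 hashes concrete, here is a minimal sketch of how a client might check a downloaded piece. It assumes the hashes are stored as one concatenated byte string of 20-byte digests (the `pieces` value of the info dictionary); the function names and the toy 4-byte piece length are illustrative assumptions, not taken from any particular client.

```python
import hashlib
from typing import List

SHA1_LEN = 20  # a SHA-1 digest is always 20 bytes


def split_piece_hashes(pieces: bytes) -> List[bytes]:
    """Split the concatenated hash string into one 20-byte digest per piece."""
    if len(pieces) % SHA1_LEN != 0:
        raise ValueError("hash string length must be a multiple of 20")
    return [pieces[i:i + SHA1_LEN] for i in range(0, len(pieces), SHA1_LEN)]


def verify_piece(index: int, data: bytes, piece_hashes: List[bytes]) -> bool:
    """Return True if the downloaded data matches the expected hash."""
    return hashlib.sha1(data).digest() == piece_hashes[index]


# Toy content with an unrealistically small piece length (assumption).
piece_length = 4
content = b"hello world!"
chunks = [content[i:i + piece_length]
          for i in range(0, len(content), piece_length)]

# What a torrent creator would store: all piece digests back to back.
pieces_field = b"".join(hashlib.sha1(c).digest() for c in chunks)

hashes = split_piece_hashes(pieces_field)
print(len(hashes), "pieces")
print(verify_piece(0, b"hell", hashes))   # matching data
print(verify_piece(1, b"xxxx", hashes))   # corrupted data
```

A piece that fails this check would simply be discarded and requested again, possibly from a different peer.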
Processing of Torrent File
Each torrent file is encoded before it is shared over the torrent network. The encoding is a custom format called bencode, which supports four data types: strings, integers, lists, and dictionaries.
Each data type has its own encoding rule, and a bencode parser is used to decode the bencoded file back into these values.
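The encoding rules are small enough to sketch in a few lines. The following is a minimal, illustrative bencode encoder and decoder, not the code of any real client: strings become `<length>:<bytes>`, integers become `i<number>e`, lists become `l...e`, and dictionaries become `d...e` with keys in sorted order. The sample tracker URL and file name are made up.

```python
def bencode(value) -> bytes:
    """Encode a Python value using the four bencode types."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must appear in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def bdecode(data: bytes, pos: int = 0):
    """Decode one value starting at pos; return (value, next_position)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            result[key], pos = bdecode(data, pos)
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"invalid bencode at position {pos}")


# Hypothetical, heavily simplified torrent metadata.
sample = {
    "announce": "http://tracker.example/announce",
    "info": {"name": "a.txt", "length": 12, "piece length": 4},
}
encoded = bencode(sample)
decoded, _ = bdecode(encoded)
print(encoded)
print(decoded[b"info"][b"name"])
```

Note that the decoder returns byte strings rather than text, since bencode strings are raw bytes and only some of them (such as file names) are meant to be decoded as text.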
Kademlia DHT for Communication
Kademlia is a distributed hash table (DHT) design that BitTorrent clients use to locate and communicate with each other without the need for a central server or tracker.
In a Kademlia DHT network, each peer has a unique identifier called a “node ID”. In BitTorrent's DHT this is a 160-bit value that is normally chosen at random, not derived from the file being shared. When a peer joins the network, it contacts a node it already knows and asks it for nodes close to its own ID. The nodes that reply include their own node IDs, and the new peer updates its routing table accordingly. This process continues, and each peer builds a routing table of other peers in the network based on their node IDs.
When a peer wants to find a file, it needs the file's “infohash”, which is the SHA-1 hash of the bencoded ‘info’ dictionary (covering the file name, length, piece hashes, and other metadata). The peer then queries the Kademlia DHT network with the infohash, and the network responds with a list of peers that are currently sharing the file. The peer can then connect to those peers and start downloading the file.
The Kademlia DHT network allows for efficient and reliable peer discovery, even in the absence of a tracker or central server. Because it is decentralized, it is more resistant to censorship and network failures. The DHT is an important component of modern BitTorrent clients and helps peers find each other quickly.
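Routing tables in Kademlia are organized around a notion of distance between IDs, which is simply the XOR of the two IDs interpreted as an integer. The sketch below shows that metric and how a node might pick the closest known nodes to a target. The helper names and the tiny integer-based IDs are assumptions made for readability, and `k=8` is only an illustrative default.

```python
from typing import List

ID_BYTES = 20  # 160-bit identifiers


def node_id(n: int) -> bytes:
    """Build a 160-bit ID from a small integer (for demonstration only)."""
    return n.to_bytes(ID_BYTES, "big")


def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia distance: XOR of the two IDs, read as an unsigned integer."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def closest_nodes(target: bytes, known: List[bytes], k: int = 8) -> List[bytes]:
    """Return up to k known node IDs, nearest to the target first."""
    return sorted(known, key=lambda n: xor_distance(n, target))[:k]


known = [node_id(n) for n in (1, 5, 6, 12)]
target = node_id(4)

for nid in closest_nodes(target, known, k=2):
    print(int.from_bytes(nid, "big"), xor_distance(nid, target))
```

A lookup repeatedly asks the closest nodes found so far for nodes that are even closer, so each round roughly halves the remaining distance to the target.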
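As a rough illustration, the infohash is the SHA-1 digest of the bencoded ‘info’ dictionary from the torrent file. The sketch below hand-writes a tiny bencoded dictionary to stay self-contained; the file name, sizes, and the all-zero piece hash are placeholder assumptions, so the resulting hash does not correspond to any real torrent.

```python
import hashlib

# A hand-written bencoded 'info' dictionary (keys in sorted order).
# bytes(20) is a placeholder for the single 20-byte piece hash.
info_bencoded = (
    b"d"
    b"6:lengthi12e"
    b"4:name5:a.txt"
    b"12:piece lengthi16384e"
    b"6:pieces20:" + bytes(20) +
    b"e"
)

info_hash = hashlib.sha1(info_bencoded).digest()   # 20 raw bytes
info_hash_hex = info_hash.hex()                    # 40 hex characters

# Magnet links carry the same hash, so a client can look it up in the DHT.
magnet = "magnet:?xt=urn:btih:" + info_hash_hex

print(info_hash_hex)
print(magnet)
```

Because the hash covers the piece hashes as well as the name and length, two torrents with the same file name but different content end up with different infohashes.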
Piece Selection Algorithms
A piece is a unit of transmission, and when a peer has all the pieces it concatenates them to re-create the entire file.
It is crucial to have a piece selection strategy in place because if all peers start by requesting the first piece of a file, it could lead to an unbalanced distribution of load on the peers and seeders who possess that piece. This could result in a slower overall speed of distribution as those peers may become overwhelmed with requests, while others who have less in-demand pieces may not be utilized effectively.
I will discuss two of the most widely used piece selection algorithms.
1. Rarest-First Piece Selection
Idea: Prioritize downloading the piece that is rarest among the peers in the network.
Advantages:
1. Spreads the seeder's pieces across many peers quickly
2. Increases download speed, since more peers hold pieces that others still need
3. Prevents rare pieces from going missing when the few peers holding them leave
2. Random First Policy
Idea: When a new peer joins the network, it has nothing to share yet, so it requests its first pieces at random. This gets it a complete piece quickly so that it can start participating in the network.
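The two policies can be combined in one small selection function. The following is a simplified sketch rather than the logic of a real client: `pick_piece`, the shape of `peer_pieces`, and the `random_first_threshold` value of 4 are all illustrative assumptions. It picks randomly while the peer holds only a few pieces and switches to rarest-first afterwards.

```python
import random
from collections import Counter
from typing import Dict, Optional, Set


def pick_piece(have: Set[int],
               peer_pieces: Dict[str, Set[int]],
               rng=random,
               random_first_threshold: int = 4) -> Optional[int]:
    """Choose the next piece index to request, or None if nothing is needed.

    have        -- piece indices this peer already holds
    peer_pieces -- for each connected peer, the piece indices it advertises
    """
    # Count how many connected peers hold each piece.
    availability = Counter()
    for pieces in peer_pieces.values():
        availability.update(pieces)

    candidates = [p for p in availability if p not in have]
    if not candidates:
        return None

    # Random first: a newcomer just wants any complete piece quickly.
    if len(have) < random_first_threshold:
        return rng.choice(candidates)

    # Rarest first: pick among the pieces held by the fewest peers.
    rarest = min(availability[p] for p in candidates)
    return rng.choice([p for p in candidates if availability[p] == rarest])


peers = {"p1": {0, 1, 2}, "p2": {0, 1}, "p3": {0}}

print(pick_piece(set(), peers))              # newcomer: any of 0, 1, 2
print(pick_piece({10, 11, 12, 13}, peers))   # established peer: rarest piece
```

In the second call, piece 2 is held by only one peer, so it is the one an established peer would request first.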
Issues Associated with the BitTorrent File Sharing System
One issue is that it can be difficult to ensure that all peers are sharing the file fairly. Because peers are free to choose which pieces of the file they download and share, it is possible for some peers to download more than their fair share while others are left with less.
BitTorrent tackles this problem elegantly using the “choke algorithm”.
The choke algorithm aims to ensure:
- maximum download speed for genuine peers
- minimal network abuse by free riders
The choke algorithm in BitTorrent regulates which peers a given peer uploads to at any time. A peer unchokes only a limited number of other peers, chosen by how fast they upload to it. Peers with a higher upload rate are prioritized, while the rest stay choked and receive no data. Periodically, peers are reevaluated and may be unchoked or choked depending on their current upload rate. This algorithm helps to optimize the use of available bandwidth and ensures that each peer contributes to the overall download process.
How to find peers to unchoke?
If a peer unchokes every other peer, then everyone would want to download the file from it, and hence the peer would be overwhelmed. Thus, a peer always prioritizes unchoking a peer that offers the best download speed.
This prioritization is based on the principle of reciprocity, wherein a peer sends data to the peers from which it receives data fastest. This encourages peers to let others download from them and prevents free riders, who never upload, from abusing the network.
Choke Algorithm for a Leecher
Leecher “A” uses a regular unchoke algorithm: every 10 seconds, it sorts the interested peers by the rate at which “A” downloads from them and unchokes the fastest 3. The unchoking is temporary, and after some time an unchoked peer may get choked again by “A”, with a different peer unchoked in its place. This helps to ensure that “A” is always exchanging data with the fastest peers and makes the most efficient use of available bandwidth. It also gives other peers an opportunity to contribute to the overall download process.
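One round of this regular unchoke can be sketched as a sort followed by a cut-off. The `PeerStats` class, its field names, and the sample rates below are hypothetical, chosen only to illustrate the idea. A real client would measure the rates over a recent time window and rerun the function every 10 seconds.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass
class PeerStats:
    peer_id: str
    download_rate: float  # bytes/s that "A" currently receives from this peer
    interested: bool      # whether this peer wants pieces that "A" holds


def regular_unchoke(peers: Iterable[PeerStats], slots: int = 3) -> Set[str]:
    """Return the IDs of the peers to unchoke for the next round."""
    interested = [p for p in peers if p.interested]
    fastest = sorted(interested, key=lambda p: p.download_rate, reverse=True)
    return {p.peer_id for p in fastest[:slots]}


peers = [
    PeerStats("a", 50_000, True),
    PeerStats("b", 120_000, True),
    PeerStats("c", 10_000, True),
    PeerStats("d", 80_000, True),
    PeerStats("e", 200_000, False),  # fast, but not interested in our pieces
]

unchoked = regular_unchoke(peers)
print(sorted(unchoked))
```

Peer "c" uploads the least to "A" and stays choked, which is how a free rider that never uploads ends up at the back of the queue. Real clients typically also keep one extra slot for an optimistic unchoke so that new peers get a chance, which this sketch leaves out.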
Conclusion
In conclusion, the BitTorrent architecture is a powerful and efficient method of file sharing that has changed the way large files are distributed over the Internet. By using a decentralized peer-to-peer network, BitTorrent is able to distribute files quickly and efficiently while also being resilient to network disruptions and failures. While there are some challenges associated with the system, such as free riders, mechanisms like the choke algorithm help keep sharing fair.