File sync between data nodes - radumarias/rfs GitHub Wiki
When a user uploads a file, the master splits it into shards and distributes the shards and their replicas to data nodes. The data nodes then use DHT and BitTorrent to sync the shards, saving the status to TiKV.
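As a rough illustration of the split-and-distribute step, here is a minimal sketch that plans fixed-size shards and assigns each one to a set of distinct nodes in round-robin order. The `ShardPlacement` type, the `plan_shards` function, the fixed shard size and the round-robin policy are all assumptions made for this example; the actual placement strategy of the master is not specified here.

```rust
/// Hypothetical description of one shard and the nodes holding its replicas.
#[derive(Debug, Clone, PartialEq)]
pub struct ShardPlacement {
    pub shard_index: usize,
    /// Byte offset of the shard inside the uploaded file.
    pub offset: usize,
    /// Shard length in bytes (the last shard may be shorter).
    pub len: usize,
    /// Nodes that should store this shard (primary first, then replicas).
    pub nodes: Vec<String>,
}

/// Plan how `file_len` bytes are cut into shards of at most `shard_size`
/// bytes, placing each shard on `replicas` distinct nodes (round-robin).
/// This is an assumed policy, not the project's confirmed one.
pub fn plan_shards(
    file_len: usize,
    shard_size: usize,
    nodes: &[String],
    replicas: usize,
) -> Result<Vec<ShardPlacement>, String> {
    if shard_size == 0 {
        return Err("shard_size must be greater than zero".into());
    }
    if replicas == 0 || replicas > nodes.len() {
        return Err("replicas must be between 1 and the number of nodes".into());
    }
    // Ceiling division: a trailing partial shard still counts as a shard.
    let shard_count = (file_len + shard_size - 1) / shard_size;
    let plan = (0..shard_count)
        .map(|i| {
            let offset = i * shard_size;
            ShardPlacement {
                shard_index: i,
                offset,
                len: shard_size.min(file_len - offset),
                // Consecutive nodes, wrapping around; distinct because
                // replicas <= nodes.len().
                nodes: (0..replicas)
                    .map(|r| nodes[(i + r) % nodes.len()].clone())
                    .collect(),
            }
        })
        .collect();
    Ok(plan)
}
```

With three nodes `a`, `b`, `c`, a 10-byte file, 4-byte shards and two copies per shard, this yields three shards placed on `[a, b]`, `[b, c]` and `[c, a]`, the last one being 2 bytes long.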
So that the coordinator can catch cases where a data node goes down, we keep a queue of ongoing sync tasks and periodically check their status; if a data node is down, the coordinator allocates the replica to another node.
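One pass of that periodic check could look like the sketch below. It assumes an in-memory queue and a list of currently live nodes supplied by the caller; `SyncTask`, `reassign_down_targets` and the "first live node that is not already a target for this shard" rule are hypothetical names and choices for illustration. In the real system the task status would live in TiKV rather than in a `VecDeque`.

```rust
use std::collections::VecDeque;

/// Hypothetical record of one ongoing shard sync.
#[derive(Debug, Clone, PartialEq)]
pub struct SyncTask {
    pub shard_id: u64,
    /// Data node that is supposed to receive the replica.
    pub target_node: String,
}

/// Walk the queue once and move every task whose target node is down to a
/// live node. Returns how many tasks were reassigned. Tasks stay untouched
/// when no suitable live node exists.
pub fn reassign_down_targets(queue: &mut VecDeque<SyncTask>, live_nodes: &[String]) -> usize {
    let mut moved = 0;
    for i in 0..queue.len() {
        if live_nodes.contains(&queue[i].target_node) {
            continue; // target is still alive, nothing to do
        }
        let shard_id = queue[i].shard_id;
        // Pick the first live node that is not already the target of
        // another task for the same shard, so replicas stay on distinct nodes.
        let candidate = live_nodes.iter().find(|node| {
            !queue
                .iter()
                .any(|t| t.shard_id == shard_id && &t.target_node == *node)
        });
        if let Some(node) = candidate {
            queue[i].target_node = node.clone();
            moved += 1;
        }
    }
    moved
}
```

A coordinator would call this on a timer, passing the node list from whatever health check it uses. A production version would also need to handle sync timeouts and persist the reassignment before acting on it.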
BitTorrent over μTP
To optimize transfer, we can implement the transport layer of the BitTorrent client using μTP and zero-copy (https://chatgpt.com/share/42962c3a-992f-49d6-a774-478be7faff4a). Zero-copy lets the kernel move data from disk to the network layer without copying it through user-space buffers, which minimizes CPU usage.
https://lib.rs/crates/async-utp
QUIC and zero-copy with sendfile().
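To make the zero-copy idea concrete, here is a minimal sketch that streams a shard file into a socket with `std::io::copy`. On Linux the standard library documents that `io::copy` may use `copy_file_range(2)`, `sendfile(2)` or `splice(2)` to move data directly between file descriptors when possible, so this is the simplest way to get the `sendfile()` path without calling libc directly. The function name `send_shard` is hypothetical, and the example assumes a TCP stream: μTP and QUIC run over UDP, where `sendfile()` does not apply in the same way, so this only illustrates the technique rather than the final transport.

```rust
use std::fs::File;
use std::io;
use std::net::TcpStream;
use std::path::Path;

/// Send the whole file at `path` over `stream` and return the number of
/// bytes written. On Linux, `io::copy` can hand the transfer to the kernel
/// (for example via sendfile) instead of looping over a user-space buffer;
/// on other platforms it falls back to a regular buffered copy.
pub fn send_shard(path: &Path, stream: &mut TcpStream) -> io::Result<u64> {
    let mut file = File::open(path)?;
    io::copy(&mut file, stream)
}
```

Whether the kernel fast path is actually taken depends on the platform and on the types involved, so it is worth confirming with `strace` or a profiler before relying on it.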
Take advantage of:
- https://yusufonlinux.blogspot.com/2010/11/data-link-access-and-zero-copy.html?m=1
- https://lwn.net/Articles/655299/
- https://lib.rs/crates/rqbit
Check whether this can be used for DHT. It is interesting because it implements the UDP tracker protocol: https://lib.rs/crates/aquatic_udp_protocol
Encryption
- https://en.wikipedia.org/wiki/BitTorrent_protocol_encryption
- https://forum.utorrent.com/topic/58845-utp-encryption/
- https://chatgpt.com/c/673b88e1-e124-8003-bf65-63e29e0a8ea9
Authentication
We need some kind of authentication with Keycloak, JWT, or any other token-based solution. If the BitTorrent standard provides nothing for this, we implement it in our custom version.
Implementing a BitTorrent in Rust
- https://www.youtube.com/watch?v=r0srf3kfZbs
- https://www.youtube.com/watch?v=jf_ddGnum_4