Hash Caching - RetroShare/documentation GitHub Wiki
(text written in 2010)
It's a pain that RS throws away the already computed hash of a file when it can no longer find that file. This happens, for example, when you move a file from one directory to another without modifying it, or when you share an external HDD: if you start RS without the HDD, all the hashes are lost and you have to re-hash from scratch the next time you plug the HDD in.
Hash caching is the solution, and it's a simple one: I plan, in fimonitor.cc, to maintain an unstructured list of quadruples (filename, size, hash, timestamp) for recently hashed files. This list will be cleared regularly so that it only keeps recent info, with the retention period set in the GUI. We could have a box named "Keep memory of recently hashed files for [to be set by user] days".
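A minimal sketch of what such a list and its periodic cleanup could look like. All names here (`HashCacheEntry`, `pruneHashCache`, `keep_days`) are hypothetical and not taken from fimonitor.cc. The note does not say whether the timestamp is the file's modification time or the time the hash was cached, so this sketch assumes both are stored, as two separate fields, and uses only the second one for expiry.

```cpp
#include <algorithm>
#include <cstdint>
#include <ctime>
#include <string>
#include <vector>

// Hypothetical entry layout (names are illustrative, not RetroShare's).
struct HashCacheEntry {
    std::string   filename;  // name of the hashed file
    std::uint64_t size;      // file size in bytes
    std::string   hash;      // hash computed earlier
    std::time_t   modified;  // assumed: file modification time when it was hashed
    std::time_t   recorded;  // assumed: when this entry was put in the cache
};

// "Without structure": a plain list, scanned linearly.
typedef std::vector<HashCacheEntry> HashCache;

// Remove every entry that was recorded more than keep_days days before now.
// keep_days stands for the value the user would set in the GUI box.
void pruneHashCache(HashCache& cache, std::time_t now, int keep_days) {
    const std::time_t max_age = static_cast<std::time_t>(keep_days) * 24 * 60 * 60;
    cache.erase(std::remove_if(cache.begin(), cache.end(),
                               [now, max_age](const HashCacheEntry& e) {
                                   return now - e.recorded > max_age;
                               }),
                cache.end());
}
```

Calling `pruneHashCache(cache, time(NULL), keep_days)` from whatever periodic tick the monitor already has would be enough; a plain vector keeps things simple as long as the list stays small.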
When hashing new files, fimonitor will first look into this list to check whether the hash is already known. If it is, fimonitor won't hash the file again, but will instead copy the info from the hash cache list.
This looks like it would leave us with the problem of not re-hashing files that have changed, but in fact it won't, because the info used to decide whether a hash can be re-used is the same as the info that fimonitor already checks on existing files to decide whether to re-hash them.
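The lookup could be sketched as follows. This assumes that the check fimonitor does on existing files compares name, size and modification time; the note does not spell out the exact fields, so treat that as an assumption. `HashCacheEntry` and `findCachedHash` are hypothetical names, and whether `filename` is a bare name or a full path is left open here.

```cpp
#include <cstdint>
#include <ctime>
#include <string>
#include <vector>

// Hypothetical entry layout (names are illustrative, not RetroShare's).
struct HashCacheEntry {
    std::string   filename;  // name of the hashed file
    std::uint64_t size;      // file size in bytes
    std::string   hash;      // hash computed earlier
    std::time_t   modified;  // assumed: file modification time when it was hashed
    std::time_t   recorded;  // assumed: when this entry was put in the cache
};

// Look for an entry whose name, size and modification time all match the
// file found on disk. On success, copy the cached hash into hash_out and
// return true; otherwise return false and leave hash_out untouched, in
// which case the caller hashes the file as usual.
bool findCachedHash(const std::vector<HashCacheEntry>& cache,
                    const std::string& filename,
                    std::uint64_t size,
                    std::time_t modified,
                    std::string& hash_out) {
    for (std::size_t i = 0; i < cache.size(); ++i) {
        const HashCacheEntry& e = cache[i];
        if (e.filename == filename && e.size == size && e.modified == modified) {
            hash_out = e.hash;
            return true;
        }
    }
    return false;
}
```

A file whose size or modification time differs from the cached entry simply misses the cache and gets hashed again, which is why a modified file does not end up with a stale hash under these assumptions.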
Possible improvements:
- only put files that are not in the shared file list into the cache, to avoid duplicates, although the duplicated info is probably not that large.