StorageProcessingUnit - LeFreq/Singularity GitHub Wiki

Disk accesses cost on the order of 1000x more than memory accesses, so offload storage to a dedicated processor that maintains a journaled file system. If a typical user session touches, say, 100,000 [mostly small] objects, then we'll probably use hashes to store these, and flat, monolithic storage for large data sets.

Objects must have unique names, so we can hash on the name. For structured data, instead of building trees and splitting the data, we can write it all at once.
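A minimal sketch of hashing a unique object name to a storage bucket; the hash function and bucket count here are assumptions for illustration, not part of the design:

```python
import hashlib

NUM_BUCKETS = 4096  # assumed bucket count, a tuning parameter


def bucket_for(name: str) -> int:
    """Hash an object's unique name to a fixed storage bucket."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    # Take the first 4 bytes of the digest as an integer, then fold into range.
    return int.from_bytes(digest[:4], "big") % NUM_BUCKETS
```

Since names are unique, the same name always maps to the same bucket, so lookup needs no index structure.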


A Storage Processing Unit could buffer an incoming file, check whether it is a diff or a flat file to store, and then run various sanity checks on the storage hardware (e.g. verify that the file currently on disk is compatible with the diff, in case the disk changed state somehow since its last known state).

If there aren't many changes, an algorithm could be made (or found) to decide whether to write the journaled differences or to rewrite the file flat.
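One possible shape for that decision, sketched as a simple size-ratio heuristic; the threshold value is an assumed tuning parameter, not something the design specifies:

```python
def should_journal(diff_len: int, file_len: int, threshold: float = 0.25) -> bool:
    """Decide whether to append a journaled diff or rewrite the file flat.

    If the diff is small relative to the file, appending to the journal
    is cheap; past the threshold, a flat rewrite is cheaper to read back.
    """
    if file_len == 0:
        return False  # nothing on disk to diff against; write flat
    return diff_len / file_len < threshold
```

A real heuristic might also weigh how many diffs are already queued against the file, since read cost grows with journal depth.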

Queue up files to be refactored (defragmented from journal writes), etc.
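Refactoring a file means replaying its queued journal writes into a single flat image. A sketch, assuming a simple `(offset, data)` journal-entry format (an assumption for illustration):

```python
def compact(flat: bytes, journal: list[tuple[int, bytes]]) -> bytes:
    """Replay journaled (offset, data) writes into one flat image.

    After compaction the journal entries can be discarded and the file
    served as a single contiguous read.
    """
    buf = bytearray(flat)
    for offset, data in journal:
        end = offset + len(data)
        if end > len(buf):
            buf.extend(b"\x00" * (end - len(buf)))  # grow if the write extends the file
        buf[offset:end] = data
    return bytes(buf)
```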

Commands/OpCodes:

  • WRITEDIFF filename diffdata
  • WRITEFLAT filename offset data
  • READ filename offset x (words)
  • DEL filename
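A minimal in-memory sketch of the opcode set above; the exact semantics (WRITEFLAT growing the file, READ returning x words, the 4-byte word size) are assumptions, not a spec:

```python
WORD = 4  # assumed word size in bytes


class StorageProcessingUnit:
    def __init__(self) -> None:
        self.files: dict[str, bytearray] = {}       # flat storage per filename
        self.journals: dict[str, list[bytes]] = {}  # queued diffs per filename

    def writediff(self, name: str, diffdata: bytes) -> None:
        """WRITEDIFF: queue an opaque diff against the named file."""
        self.journals.setdefault(name, []).append(diffdata)

    def writeflat(self, name: str, offset: int, data: bytes) -> None:
        """WRITEFLAT: write data at offset, growing the file if needed."""
        buf = self.files.setdefault(name, bytearray())
        end = offset + len(data)
        if end > len(buf):
            buf.extend(b"\x00" * (end - len(buf)))
        buf[offset:end] = data

    def read(self, name: str, offset: int, x: int) -> bytes:
        """READ: return x words starting at byte offset."""
        return bytes(self.files[name][offset:offset + x * WORD])

    def delete(self, name: str) -> None:
        """DEL: drop the file and any queued diffs."""
        self.files.pop(name, None)
        self.journals.pop(name, None)
```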