The continuous evolution of the project - documenti-aperti/documenti_aperti GitHub Wiki

During our internship we changed many things about this project, and what we ended up with was something we hadn't considered at the beginning.

The beginning of the project

At the start of the project we were given the following equipment:

  • A portable scanner (IRIScan book 5 wifi)
  • A Raspberry Pi 3 model B
  • A 4G TP-Link portable router
  • One USB wifi adapter
  • 550 pages of old texts to scan

The initial idea

What we had in mind was a decentralized system made up of several Raspberry Pis. Each one would identify a single library (the project was originally intended only for libraries), and each library could store and modify its documents on its own Raspberry Pi. The documents could be accessed through a Gitea-based web interface at a domain like 999.documentiaperti.org (where 999 is the Raspberry Pi's unique ID). In this design the Raspberry Pi had to manage not only web storage but also the processing of the data. To add another feature, we decided that the Raspberry Pi would also serve as a tool for extracting images from the portable scanner, so we fitted it with an LCD screen from which users could manage their projects (or repositories) and add the extracted data to them without having to open the web interface.
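The subdomain scheme described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes the unique ID is a plain non-negative integer, which the original design does not spell out.

```python
# Hypothetical helpers for the "<ID>.documentiaperti.org" naming scheme.
# Assumption: the Raspberry Pi's unique ID is a non-negative integer.
BASE_DOMAIN = "documentiaperti.org"


def hostname_for(pi_id: int) -> str:
    """Build the hostname a Raspberry Pi with this ID would be served at."""
    if pi_id < 0:
        raise ValueError("the Raspberry Pi ID must be non-negative")
    return f"{pi_id}.{BASE_DOMAIN}"


def pi_id_from(hostname: str) -> int:
    """Recover the Raspberry Pi ID from a hostname such as 999.documentiaperti.org."""
    label, _, rest = hostname.partition(".")
    if rest != BASE_DOMAIN or not (label.isascii() and label.isdigit()):
        raise ValueError(f"not a Raspberry Pi hostname: {hostname!r}")
    return int(label)
```

For example, `hostname_for(999)` gives the domain used as an example in the text, and `pi_id_from` reverses the mapping.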

The first change

While we were developing our software, we were informed that we also had to manage a server, so we adjusted our decentralized design and considered merging the two system models. In this version the Raspberry Pis would work as before, and the server would act as a backup system mirroring the projects stored on them, making those projects available at any time without being limited by the upload speed of the libraries' Internet connections. The server also needed to manage the assignment of subdomains to the Raspberry Pis. In this scenario every system (the Raspberry Pis and the server) would still have run as an independent Gitea-based system.

The second change

After we had nearly finished developing the Raspberry Pi system, we moved on to the server, but we noticed that Gitea had problems with repository mirroring, so the project needed to change again. We had also overlooked a consequence of every system acting independently: if someone opened an issue or made a pull request, for example after correcting a bad processing result, it affected only the system where the user made the request, because those features weren't synchronized between systems. So we arrived at the final idea, which is completely different from the initial one.

The final idea

The final idea consists of a single centralized Gitea system, the server, which contains all the repositories, so that everyone can access it at any time and share their issues and pull requests without having to manage synchronization. In addition, because processing the data on the Raspberry Pi took about 2 minutes whereas the server needed only about 20-30 seconds, we decided to move the processing to the server side. This change not only brought a significant improvement in processing time but also changed a crucial aspect of the project, because by "Documenti Aperti" we mean not only open documents but also an open community where everyone can help each other by adding new documents. Indeed, now that processing has moved to the server, everyone, not only libraries, can share their own documents, process them and keep them readable whenever needed. Finally, we added a connection to archive.org, the world's largest archive of the Internet, and added to the web interface a simple tool for editing .hocr files (the processing data). As a result, the Raspberry Pi became an optional tool that complements portable scanners: it extracts the data and sends it to the server with a processing request whenever an Internet connection is available.
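The Raspberry Pi's final role, holding extracted scans until a connection is available and then sending them to the server, can be sketched as a small store-and-forward queue. This is a minimal illustration, not the project's actual code: the `UploadQueue` class and the `is_online` and `send` callables are hypothetical, and the real connectivity check and upload request are deliberately left to the caller.

```python
from collections import deque
from typing import Callable, Deque


class UploadQueue:
    """Hold extracted scans until they can be sent to the server (hypothetical sketch)."""

    def __init__(self, is_online: Callable[[], bool], send: Callable[[str], bool]) -> None:
        # Assumption: is_online() reports connectivity, and send(path) uploads
        # one scan with a processing request, returning True on success.
        self._is_online = is_online
        self._send = send
        self._pending: Deque[str] = deque()

    def add(self, scan_path: str) -> None:
        """Queue a scan extracted from the portable scanner."""
        self._pending.append(scan_path)

    def flush(self) -> int:
        """Send queued scans in order while online; return how many were sent."""
        sent = 0
        while self._pending and self._is_online():
            if not self._send(self._pending[0]):
                break  # keep the scan queued and retry on a later flush
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

Calling `flush()` periodically means scans taken offline are never lost: a scan leaves the queue only after `send` reports success.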

The ending of the project

We ended up using the following equipment:

  • The portable IRIScan book 5 wifi model
  • A Raspberry Pi 3 model B (it now has an accessory role and is no longer required)
  • Two USB wifi adapters
  • The documentiaperti server offered by Open Sensor Data