Understanding The Data Model - internetarchive/openlibrary GitHub Wiki
Each Infogami page (i.e. something with a URL) has an associated type. Each type contains a schema that states what fields can be used with it and what format those fields are in. Those are used to generate view and edit templates which can then be further customized as a particular type requires.
Infogami provides a generic way through its wiki to create new types as needed.
Aside from the additional tables listed here, Open Library essentially has only two database tables. By default, they have the same fairly basic functionality through Infogami.
The thing table defines types such as editions, works, authors, users, and languages. It also keeps track of instances of things by their identifiers: each instance's ID is registered as a row in the table.
Entries in a sample thing table
id | key | type | latest_revision | created | last_modified |
---|---|---|---|---|---|
2 | /type/key | 1 | 1 | 2013-03-20 10:27:01.322813 | 2013-03-20 10:27:01.322813 |
3 | /type/string | 1 | 1 | 2013-03-20 10:27:01.322813 | 2013-03-20 10:27:01.322813 |
4 | /type/text | 1 | 1 | 2013-03-20 10:27:01.322813 | 2013-03-20 10:27:01.322813 |
5 | /type/int | 1 | 1 | 2013-03-20 10:27:01.322813 | 2013-03-20 10:27:01.322813 |
The data table, on the other hand, maps each of these things to all of the data associated with it, stored as JSON with one row per revision.
Entry in a sample data table
thing_id | revision | data |
---|---|---|
1 | 1 | {"created": {"type": "/type/datetime", "value": "2013-03-20T10:27:01.223351"}, "last_modified": {"type": "/type/datetime", "value": "2013-03-20T10:27:01.223351"}, "latest_revision": 1, "key": "/type/type", "type": {"key": "/type/type"}, "id": 1, "revision": 1} |
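As a rough illustration of how a row in the data table can be read, the sketch below parses a JSON payload based on the sample entry above using Python's standard json module. The variable names are illustrative only and are not part of Infogami.

```python
import json

# JSON payload based on the sample data row above (thing_id 1, revision 1).
raw = (
    '{"created": {"type": "/type/datetime", "value": "2013-03-20T10:27:01.223351"}, '
    '"last_modified": {"type": "/type/datetime", "value": "2013-03-20T10:27:01.223351"}, '
    '"latest_revision": 1, "key": "/type/type", "type": {"key": "/type/type"}, '
    '"id": 1, "revision": 1}'
)

doc = json.loads(raw)

# The type is stored as a reference: a dict holding the key of another thing.
type_key = doc["type"]["key"]

# Typed values such as datetimes are wrapped as {"type": ..., "value": ...}.
created = doc["created"]["value"]

print(doc["key"], type_key, created)
```

Note that the record repeats the id, key, and revision that also appear as columns, so a single data row is enough to reconstruct the whole document.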
Read more about Infogami and types at: https://openlibrary.org/dev/docs/infogami
Open Library has a number of additional tables that are used to support a variety of features. The DDL for these tables can be found here.
These tables are used to store the books that patrons have on their "Want to Read", "Currently Reading", and "Already Read" reading log shelves. The bookshelves_books table holds most of this data, with bookshelves acting as a look-up table for shelf names. bookshelves.py provides functions which interact with the reading log tables.
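The relationship between the two reading log tables can be sketched roughly as follows. This is a minimal illustration, not the code in bookshelves.py: it uses an in-memory SQLite database in place of PostgreSQL, and the column names (username, work_id, bookshelf_id), the sample rows, and the shelves_for helper are all assumptions made for the example.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE bookshelves (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE bookshelves_books (
        username TEXT NOT NULL,
        work_id INTEGER NOT NULL,
        bookshelf_id INTEGER NOT NULL REFERENCES bookshelves(id)
    );
    INSERT INTO bookshelves (id, name) VALUES
        (1, 'Want to Read'), (2, 'Currently Reading'), (3, 'Already Read');
    INSERT INTO bookshelves_books (username, work_id, bookshelf_id) VALUES
        ('example_patron', 101, 1),
        ('example_patron', 102, 3);
    """
)


def shelves_for(username):
    """Hypothetical helper: (work_id, shelf name) pairs for one patron."""
    rows = conn.execute(
        """
        SELECT bb.work_id, b.name
        FROM bookshelves_books AS bb
        JOIN bookshelves AS b ON b.id = bb.bookshelf_id
        WHERE bb.username = ?
        ORDER BY bb.work_id
        """,
        (username,),
    )
    return rows.fetchall()


print(shelves_for("example_patron"))
```

Keeping shelf names in a separate look-up table means each reading log row only stores a small shelf ID, and a shelf can be renamed in one place.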
The yearly_reading_goals table stores the target number of books that a patron commits to reading in a given year. Functions which interact with this table can be found in yearly_reading_goals.py.
A patron can track the last date that they have finished any book that is on their "Already Read" shelf. The bookshelves_events table stores these dates, and may later be used to store other dates that a patron may want to track (date they started reading the book, start and finish dates of other times that they have read a book, etc.). Related code can be found in bookshelves_events.py.
Patrons can give structured reviews of books by attaching any number of pre-defined tags to a work. These are stored in the observations table. The code that interacts with this table, as well as the definitions for the tags, can be found in observations.py.
A patron can add private notes that only they can read to any work. The booknotes table stores these notes. booknotes.py contains the code that interacts with this table.
Patrons can submit a star rating for a work. The ratings table holds these star ratings. Consult ratings.py for related code.
This table holds librarian requests, which in turn are used to populate the librarian request table at https://openlibrary.org/merges. Code which interacts directly with this table can be found in edits.py.
web.py (the Python micro web framework we use, similar to Flask) maintains a ctx variable that holds the context of the system during a request. web.py also has a web.db connection to our PostgreSQL database.
Infogami sits on top of web.py, like a battery pack. One piece of Infogami is called Infobase, which behaves like an ORM (database wrapper) and allows us to define arbitrary data types such as works, editions, and authors.
At the simplest level, Infobase works by relying on two tables: thing and data.
thing gives every object in our system an ID, a type, and a reference to its data in the data table.
data is just a massive catalog of JSON data that can be referenced by querying and joining against thing.
Infogami injects a utility called site into web.py's ctx variable (https://webpy.org/cookbook/ctx); ctx maintains information and connections specific to the current client. The site utility handles all the joins for you, so you can request any key from the thing table, fetch all of its corresponding data, and also leverage any models and functions we have defined for that thing's type.
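To make the two-table lookup concrete, here is a minimal sketch of the kind of join that a site-like helper hides. It is not Infogami's implementation: the ToySite class and its get method are hypothetical stand-ins, an in-memory SQLite database replaces PostgreSQL, the table layout is simplified from the sample tables on this page, and the /authors/OL1A record is made up for the example.

```python
import json
import sqlite3

# In-memory SQLite stands in for PostgreSQL; the schema is a simplified
# version of the sample thing and data tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE thing (
        id INTEGER PRIMARY KEY,
        "key" TEXT UNIQUE,
        type INTEGER,
        latest_revision INTEGER
    );
    CREATE TABLE data (thing_id INTEGER, revision INTEGER, data TEXT);
    """
)

# Made-up rows: a type, plus one author that has two revisions.
conn.execute("INSERT INTO thing VALUES (10, '/type/author', 1, 1)")
conn.execute("INSERT INTO thing VALUES (20, '/authors/OL1A', 10, 2)")
for revision, name in [(1, "Old Name"), (2, "Example Author")]:
    doc = {"key": "/authors/OL1A", "name": name, "revision": revision}
    conn.execute("INSERT INTO data VALUES (20, ?, ?)", (revision, json.dumps(doc)))


class ToySite:
    """Hypothetical stand-in for the site utility (not the real class)."""

    def __init__(self, connection):
        self.conn = connection

    def get(self, key):
        """Join thing and data on the latest revision and decode the JSON."""
        row = self.conn.execute(
            """
            SELECT d.data
            FROM thing AS t
            JOIN data AS d
              ON d.thing_id = t.id AND d.revision = t.latest_revision
            WHERE t."key" = ?
            """,
            (key,),
        ).fetchone()
        return json.loads(row[0]) if row else None


site = ToySite(conn)
author = site.get("/authors/OL1A")
print(author["name"])
```

The caller only supplies a key; the join on latest_revision picks the current version of the record, which is the sort of detail the site utility takes care of.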