Data Model

Adding new Tables

Interested in adding a new table to our schema? Check out this reference PR: https://github.com/internetarchive/openlibrary/pull/7928/files

Querying for Data

The bookshelves core model shows how we can use a database connection on the backend to query for data:

from openlibrary.core import db
oldb = db.get_db() # i.e. web.database(**web.config.db_parameters)
query = "SELECT count(*) from bookshelves_books"
oldb.query(query)
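
When a query depends on user input, it is generally safer to bind values as parameters than to format them into the SQL string. The sketch below uses Python's built-in sqlite3 module as a stand-in for the real PostgreSQL connection, and the column names (username, work_id, bookshelf_id) are assumptions made for illustration rather than the actual schema.

```python
import sqlite3

# In-memory stand-in for the Open Library database (assumption: the real
# connection comes from db.get_db() and talks to PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookshelves_books "
    "(username TEXT, work_id INTEGER, bookshelf_id INTEGER)"
)
conn.executemany(
    "INSERT INTO bookshelves_books VALUES (?, ?, ?)",
    [("alice", 5285479, 1), ("alice", 257943, 3), ("bob", 27448, 1)],
)


def count_books_for(username):
    """Count reading-log rows for one patron using a bound parameter."""
    row = conn.execute(
        "SELECT count(*) FROM bookshelves_books WHERE username = ?",
        (username,),
    ).fetchone()
    return row[0]


# Same unfiltered count as the query in the snippet above.
total = conn.execute("SELECT count(*) FROM bookshelves_books").fetchone()[0]
```

With web.py's database object, the comparable pattern should be to pass a vars dictionary and refer to it with $name placeholders, for example oldb.query("... WHERE username=$username", vars={"username": username}).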

Fetching Things Individually or in Bulk

From within routers/controllers, it's much more common to use the web.ctx.site object to fetch individual or multiple records.

doc = web.ctx.site.get("/works/OL5285479W")
keys = ["/works/OL5285479W", "/works/OL257943W", "/works/OL27448W"]
docs = web.ctx.site.get_many(keys)
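
To try out code that calls get and get_many without a running Open Library instance (for example in a unit test), a small in-memory stand-in can help. FakeSite below is hypothetical and is not part of Infogami; how the real site object treats missing keys is an assumption here, and the titles are made-up sample data.

```python
class FakeSite:
    """Hypothetical in-memory stand-in for web.ctx.site (illustration only)."""

    def __init__(self, docs):
        # Index the sample documents by their key for fast lookup.
        self._docs = {doc["key"]: doc for doc in docs}

    def get(self, key):
        # Return one document, or None when the key is unknown (assumption).
        return self._docs.get(key)

    def get_many(self, keys):
        # Return the documents that exist, in the order the keys were given.
        return [self._docs[k] for k in keys if k in self._docs]


site = FakeSite([
    {"key": "/works/OL5285479W", "title": "Example work A"},
    {"key": "/works/OL257943W", "title": "Example work B"},
])

doc = site.get("/works/OL5285479W")
docs = site.get_many(["/works/OL5285479W", "/works/OL257943W", "/works/OL27448W"])
titles = [d["title"] for d in docs]
```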

Understanding Infogami, Infobase, and Web.py

Open Library is built using a wiki engine called Infogami, which sits on top of web.py, a Python micro web framework (comparable to Flask). Web.py uses a variable called web.ctx to maintain the context of the application during an HTTP request. Web.py also maintains a PostgreSQL database connection using web.db. Infogami extends and wraps the web.db controller by offering a system called Infobase, which behaves like an ORM (a database wrapper) and allows us to define arbitrary data types like works, editions, authors, etc.

At the simplest level, Infobase relies on two tables, thing and data:

thing gives every object in our system an ID, a type, and a reference to its data in the data table.
data is just a massive catalog of JSON data that can be referenced by querying and joining thing.
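
The relationship between the two tables can be sketched with a small in-memory database. This is a simplified illustration using sqlite3, not the actual Infobase schema or code: the reduced column set, the get helper, and the key /works/OL1W with its titles are all made up for the example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE thing (
        id INTEGER PRIMARY KEY, key TEXT UNIQUE,
        type INTEGER, latest_revision INTEGER
    );
    CREATE TABLE data (thing_id INTEGER, revision INTEGER, data TEXT);
""")

# Every object, including each type, gets a row with an ID and a type.
conn.executemany("INSERT INTO thing VALUES (?, ?, ?, ?)", [
    (1, "/type/type", 1, 1),
    (2, "/type/work", 1, 1),
    (3, "/works/OL1W", 2, 2),  # hypothetical work, currently at revision 2
])

# The JSON for each revision of a thing lives in the data table.
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", [
    (3, 1, json.dumps({"key": "/works/OL1W", "title": "Draft title"})),
    (3, 2, json.dumps({"key": "/works/OL1W", "title": "Final title"})),
])


def get(key):
    """Join thing to data and return the latest revision's JSON as a dict."""
    row = conn.execute(
        """
        SELECT d.data FROM thing t
        JOIN data d ON d.thing_id = t.id AND d.revision = t.latest_revision
        WHERE t.key = ?
        """,
        (key,),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Calling get("/works/OL1W") here returns the revision 2 document, which is roughly the kind of lookup that a higher-level helper can hide from calling code.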

Infogami injects a utility called site into web.py's web.ctx variable (https://webpy.org/cookbook/ctx); ctx holds information and connections specific to the current client. The web.ctx.site utility handles queries and joins for you, so you can request any key from the thing table, fetch all of its corresponding data, and also leverage any models and functions we have defined for that thing's type.

Infogami Database Schema

Every Infogami page on Open Library (i.e. something with a URL) has an associated type. Each type contains a schema that states which fields can be used with it and what format those fields are in. These schemas are used to generate view and edit templates, which can then be customized further as a particular type requires. Infogami provides a generic way, through its wiki, to create new types as needed.
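
As a rough illustration of what "a schema that states which fields can be used and in what format" means, the sketch below checks a document against a field-to-format mapping. WORK_SCHEMA and validate are hypothetical names, and this dictionary is a deliberate simplification, not Infogami's actual schema representation.

```python
# Hypothetical, simplified schema: field name -> expected Python type.
WORK_SCHEMA = {"title": str, "subjects": list}


def validate(doc, schema):
    """Return a list of problems; an empty list means the doc fits the schema."""
    problems = []
    for field, value in doc.items():
        if field not in schema:
            problems.append(f"unknown field: {field}")
        elif not isinstance(value, schema[field]):
            problems.append(f"{field}: expected {schema[field].__name__}")
    return problems


ok = validate({"title": "Example", "subjects": ["Fiction"]}, WORK_SCHEMA)
bad = validate({"title": 3, "pages": 120}, WORK_SCHEMA)
```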

Aside from the feature tables described in this document, Open Library in essence has only two database tables. By default, they have the same fairly basic functionality through Infogami.

Thing table

The thing table defines types such as editions, works, authors, users, and languages. It also keeps track of instances of those types by their identifiers: each instance's ID is registered as a row in the table.

Entries in a sample thing table

id	key	type	latest_revision	created	last_modified
2	/type/key	1	1	2013-03-20 10:27:01.322813	2013-03-20 10:27:01.322813
3	/type/string	1	1	2013-03-20 10:27:01.322813	2013-03-20 10:27:01.322813
4	/type/text	1	1	2013-03-20 10:27:01.322813	2013-03-20 10:27:01.322813
5	/type/int	1	1	2013-03-20 10:27:01.322813	2013-03-20 10:27:01.322813

Data table

The data table, on the other hand, maps each of these things to all of the data associated with it, stored as JSON and keyed by thing_id and revision.

Entry in a sample data table

thing_id	revision	data
1	1	{"created": {"type": "/type/datetime", "value": "2013-03-20T10:27:01.223351"}, "last_modified": {"type": "/type/datetime", "value": "2013-03-20T10:27:01.223351"}, "latest_revision": 1, "key": "/type/type", "type": {"key": "/type/type"}, "id": 1, "revision": 1}
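
Because the data column holds plain JSON, a row like the sample one can be read with the standard json module. The sketch below parses that row and unwraps its typed datetime values; the unwrap helper is a hypothetical name introduced for this example.

```python
import json
from datetime import datetime

# The data column from the sample row, with its "value" keys normalized.
raw = (
    '{"created": {"type": "/type/datetime", "value": "2013-03-20T10:27:01.223351"}, '
    '"last_modified": {"type": "/type/datetime", "value": "2013-03-20T10:27:01.223351"}, '
    '"latest_revision": 1, "key": "/type/type", "type": {"key": "/type/type"}, '
    '"id": 1, "revision": 1}'
)

record = json.loads(raw)


def unwrap(value):
    """Convert a {"type": "/type/datetime", "value": ...} wrapper to a datetime."""
    if isinstance(value, dict) and value.get("type") == "/type/datetime":
        return datetime.fromisoformat(value["value"])
    return value


created = unwrap(record["created"])
type_key = record["type"]["key"]
```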

Read more about Infogami and types at: https://openlibrary.org/dev/docs/infogami

Open Library Feature Tables

Open Library has a number of additional tables that are used to support a variety of features. The DDL for these tables can be found here.

`bookshelves` and `bookshelves_books`

These tables are used to store the books that patrons have on their "Want to Read", "Currently Reading", and "Already Read" reading log shelves. The bookshelves_books table holds most of this data, with bookshelves acting as a look-up table for shelf names.

bookshelves.py provides functions which interact with the reading log tables.
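
The look-up relationship between the two tables can be sketched as a join. This example uses sqlite3 as a stand-in; the column names and the shelf IDs (1, 2, 3) are assumptions for illustration, and shelf_counts is a hypothetical helper, not a function from bookshelves.py.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bookshelves (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bookshelves_books (
        username TEXT, work_id INTEGER,
        bookshelf_id INTEGER REFERENCES bookshelves(id)
    );
    INSERT INTO bookshelves VALUES
        (1, 'Want to Read'), (2, 'Currently Reading'), (3, 'Already Read');
    INSERT INTO bookshelves_books VALUES
        ('alice', 5285479, 1), ('alice', 257943, 3), ('alice', 27448, 3);
""")


def shelf_counts(username):
    """Return {shelf name: number of books} for one patron, including empty shelves."""
    rows = conn.execute(
        """
        SELECT s.name, count(b.work_id)
        FROM bookshelves s
        LEFT JOIN bookshelves_books b
            ON b.bookshelf_id = s.id AND b.username = ?
        GROUP BY s.id, s.name
        ORDER BY s.id
        """,
        (username,),
    ).fetchall()
    return dict(rows)


counts = shelf_counts("alice")
```

The LEFT JOIN keeps shelves with no books in the result, so every shelf name appears with a count of zero or more.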

`yearly_reading_goals`

This table stores the target number of books that a patron commits to reading in a given year. Functions which interact with the yearly_reading_goals table can be found in yearly_reading_goals.py.

`bookshelves_events`

A patron can track the last date that they have finished any book that is on their "Already Read" shelf. The bookshelves_events table stores these dates, and may later be used to store other dates that a patron may want to track (date they started reading the book, start and finish dates of other times that they have read a book, etc.).

Related code can be found in bookshelves_events.py.

`observations`

Patrons can give structured reviews of books by attaching any number of pre-defined tags to a work. These are stored in the observations table.

The code that interacts with this table, as well as the definitions for the tags, is found in observations.py.

`booknotes`

A patron can add private notes that only they can read to any work. The booknotes table stores these notes. booknotes.py contains the code that interacts with this table.

`ratings`

Patrons can submit a star rating for a work. The ratings table holds these star ratings. Consult ratings.py for related code.
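
A minimal sketch of how one rating per patron and work might be stored and summarized follows. It uses sqlite3 as a stand-in, and the column names, the 1 to 5 star range, and the rate and average_rating helpers are all assumptions for illustration rather than the contents of ratings.py.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ratings (
        username TEXT, work_id INTEGER,
        rating INTEGER CHECK (rating BETWEEN 1 AND 5),
        PRIMARY KEY (username, work_id)
    )
""")


def rate(username, work_id, stars):
    # INSERT OR REPLACE keeps a single rating per patron and work.
    conn.execute(
        "INSERT OR REPLACE INTO ratings VALUES (?, ?, ?)",
        (username, work_id, stars),
    )


def average_rating(work_id):
    """Return (average stars or None, number of ratings) for a work."""
    return conn.execute(
        "SELECT avg(rating), count(*) FROM ratings WHERE work_id = ?",
        (work_id,),
    ).fetchone()


rate("alice", 27448, 4)
rate("bob", 27448, 5)
rate("alice", 27448, 3)  # alice changes her earlier rating
summary = average_rating(27448)
```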

`community_edits_queue`

This table holds librarian requests, which in turn are used to populate the librarian request table at https://openlibrary.org/merges. Code which interacts directly with this table can be found in edits.py.

Understanding The Data Model - internetarchive/openlibrary GitHub Wiki
