Developer's Starter Guide - internetarchive/openlibrary GitHub Wiki
This guide is intended for first-time contributors who are trying to understand the Open Library codebase.
Before starting this guide, it is recommended that you have:
- cloned or forked and pulled an up-to-date version of the Open Library repository
- successfully launched your local environment using the docker guide
- completed the git cheat sheet tutorial
- found a good first issue to work on
Now you're ready to understand how the code is structured within Open Library.
Working backwards from what you see on the site
In this guide, we're going to work backwards from what visitors see on the Open Library website and then explore the sequence of actions (the "lifecycle") that takes place in order for the html to be served.
When a patron navigates through the Open Library website and requests pages (like the screenshot pictured below), the patron will be served "rendered" html content from the openlibrary/templates directory.
Here are a few examples of common templates:
- Homepage: home/index.html, which gets rendered here
- Books Page(s): type/edition/view.html for both editions (e.g. https://openlibrary.org/books/OL30162974M) and works (e.g. https://openlibrary.org/works/OL257943W)
- Subjects Page(s): subjects.html (e.g. openlibrary.org/subjects/climbing)
- Search: work_search.html (e.g. openlibrary.org/search)
- Login & Register: login.html and account/create.html
Note
Some pages, like the /books page, are first-class "registered" Infogami types and are powered through the Infogami wiki framework that runs Open Library. This is true for books, authors, lists, and a few other types. Because these are first-class registered types handled by infogami, you won't find plugin controllers for them in Open Library. Their html templates can be found in the special templates/type directory. When Open Library notices a url containing a key for one of these special types, infogami fetches the corresponding object from the database and passes it as page straight into the template. So for the books page, which is templates/type/edition/view.html, the page value defined in its $def with (page, ...) header is actually a book edition or work object whose properties are defined here.
You may notice that most of the html pages within the templates folder don't include the <html>, <head>, and <footer> sections of the website, just the contents of the body. This is because Open Library is set up such that nearly all page templates get automatically wrapped by the site template.
While the body of each Open Library page may present differently, most pages share similar scaffolding, look, and feel. For instance, nearly every page will have:
- A top black bar (topNotice) with an Internet Archive logo
- A header (header#header-bar) section with the Open Library logo, a search box, a hamburger menu, and an account dropper for logged in patrons
- A footer menu
The top-level html page that is used to define and render the overall structure of every Open Library page is templates/site.html. In its definition, this main site template calls out to other modular, specific "sub-templates" (such as the three described above) defined within the templates/site and templates/lib directories, which combine to form the "site". You can read more about this "layout template" philosophy in the official documentation for web.py, the micro web framework used by Open Library.
The variable section of the site that changes depending on the requested page is the body. This page is an html template that gets passed in to the site. You may wish to review examples in the Page Templates section.
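The wrapping idea can be sketched in plain Python. This is not Open Library's or web.py's code: the standard library's string.Template stands in for the real templating system, and the names SITE and render_site, along with the placeholder markup, are hypothetical. The sketch only shows a page body being injected into shared scaffolding.

```python
from string import Template

# Hypothetical stand-in for templates/site.html: shared scaffolding
# with a single slot ($body) where the page-specific html is injected.
SITE = Template(
    "<html><head><title>$title</title></head>"
    "<body><header>Open Library</header>"
    "$body"
    "<footer>footer menu</footer></body></html>"
)


def render_site(title: str, body: str) -> str:
    """Wrap an already-rendered page body in the shared site scaffolding."""
    return SITE.substitute(title=title, body=body)


# Two different pages share the same header and footer; only the body varies.
home = render_site("Home", "<div>Welcome</div>")
search = render_site("Search", "<div>Results</div>")
```

Because every page passes through the same wrapper, a change to the header or footer in one place shows up on every page.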
When we want to serve a rendered template to a visitor, we typically use the render_template() function. The first argument of render_template will be the name of the page-specific html template we want to be rendered and injected into the site body. The rest of the parameters to render_template, after the name of the template, are used to pass variables from python into the corresponding html template file, as defined in the template's header.
For instance, imagine we have a simple template called templates/book.html which requires a book_title and a bookcover_url, and optionally accepts an author. Such a template might look like:
$def with (book_title, bookcover_url, author=None)
<div>
  <h1>$book_title</h1>
  <img src="$bookcover_url"/>
  $if author:
    <h2>$author</h2>
</div>
The $def with (...) line is how the template declares what variables it needs to be passed in when it is being called and rendered. Elsewhere in the template, the $ symbol acts as an instruction that a variable should be replaced by its value when it's being rendered.
Check the official web.py templator documentation if you'd like a deep dive into all the capabilities of Open Library's python-powered templating system.
Caution
The $: combination instructs the template to render this variable as html, as opposed to plaintext. This is sometimes necessary when rendering one html template from another, but should be done with care when the variable's value may be user-specified. Otherwise it may expose the site to XSS attacks, where a bad actor intentionally saves javascript code into a variable, hoping that the website will render and execute it as html rather than display it as safe plaintext.
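The difference between escaped and raw output can be illustrated with Python's standard html.escape. This is only an analogy for what escaping does to a value; it is not the code web.py runs internally, and the variable names are hypothetical.

```python
from html import escape

# A value a patron could have saved, containing a script tag.
user_value = '<script>alert("xss")</script>'

# Escaped output: the browser displays this as text and does not execute it.
safe = escape(user_value)

# Raw output (analogous to what `$:` emits): the markup is kept intact,
# so a browser would treat it as a real script element.
unsafe = user_value

print(safe)    # &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
print(unsafe)  # <script>alert("xss")</script>
```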
Note
Notice that author in this example is provided as a keyword argument with a default value of None and thus is optional when the template is being called. As when defining a python function, once a keyword argument is defined, all variables after it must also be keyword arguments.
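The same rule can be seen in an ordinary Python function. The function below is a hypothetical analogue that mirrors the example template's header; it is not part of Open Library.

```python
def book(book_title, bookcover_url, author=None):
    """Hypothetical function with the same signature as the template header."""
    parts = [book_title, bookcover_url]
    if author:  # mirrors the template's `$if author:` check
        parts.append(author)
    return parts


cover = "https://covers.openlibrary.org/b/id/14624642-L.jpg"

# author may be omitted because it has a default value...
without_author = book("The Hobbit", cover)
# ...or supplied by keyword.
with_author = book("The Hobbit", cover, author="J.R.R. Tolkien")

# A required parameter placed after a defaulted one is rejected by Python
# before the function can even be defined.
try:
    compile("def bad(title, author=None, cover_url): pass", "<example>", "exec")
except SyntaxError:
    rejected = True
else:
    rejected = False
```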
From python, the corresponding code to render this templates/book.html template would be:
render_template(
    'book',  # name of the template in the templates/ directory, without .html
    book_title="The Hobbit",
    bookcover_url="https://covers.openlibrary.org/b/id/14624642-L.jpg",
    author="J.R.R. Tolkien",
)
Tip
When we call render_template, it knows to look in the templates/ directory and so we don't need to specify this. Also, we don't need to include the .html extension.
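As a rough illustration of this naming convention, the mapping from a template name to a file can be sketched as below. The helper template_path and the constant TEMPLATE_ROOT are hypothetical and are not how web.py or Open Library actually locate templates; the sketch also assumes that nested template names are written with forward slashes.

```python
from pathlib import PurePosixPath

# Assumed root directory that template names are resolved against.
TEMPLATE_ROOT = PurePosixPath("templates")


def template_path(name: str) -> PurePosixPath:
    """Map a template name such as 'book' to its file under templates/."""
    return TEMPLATE_ROOT / f"{name}.html"


print(template_path("book"))               # templates/book.html
print(template_path("type/edition/view"))  # templates/type/edition/view.html
```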
When we call render_template('book', ...) we are fetching the templates/book.html template and passing in the values book_title, bookcover_url, and author from python into the template's $def with (...) section, where they can be substituted as needed into the $variables in the template. render_template then takes this prepared book template and passes it into site.html, where it is injected into the body to be rendered.
When a single template becomes too large and unmanageable, or when the programmer recognizes an opportunity to clean up a template by factoring out common or shared pieces of logic, they may choose to break a template into smaller logical components and move them into their own separate micro-templates. These micro-templates can live beside other templates in the templates/ directory. If these micro-templates are very self-contained and make sense to be rendered as their own widget, such as the QueryCarousel, you might consider saving your template in the macros directory, which is a folder of templates that have a few additional special properties.
When refactoring, a programmer may decide it makes sense to move the code to the macros directory. A macro is simply a special type of template that can be accessed using {{}} syntax by the page editor in Open Library. You can read more about macros in our very stale/outdated infogami documentation.
Here's what the infogami edit UI looks like for a collection:
Here's what the page looks like when it's rendered with a macro:
So far, by extrapolating from what's written in this guide, we hope you've learned how you may be able to:
- Find the corresponding html template file for an Open Library webpage somewhere in the templates/ directory
- Make a change to the site wrapper, should you need to e.g. change the header or footer
- In theory, add a new template and render it... But from where?
The next step is learning where to add our render_template(...) code and how to connect a url that a patron visits with logic to respond to this request.
To answer this question, you'll want to proceed to the Plugins Guide.