Web Books - internetarchive/openlibrary GitHub Wiki

This is an outline for #9625

First, read http://openlibrary.org/trusted-book-providers.

Table of Contents

  • What is (and isn't) a Web Book
  • How to import a single Web Book (as a librarian)
  • How to use the bookmarklet
  • Using the Import Queue to import Web Books
  • The JSON schema (providers) for Web Books

Open Library Web Book Guidelines

Current scope of web books

  • A book available in full to read online in your web browser. We prefer books that don't require downloading, but PDFs/EPUBs may be considered on a case-by-case basis, if they look like valuable, high-quality resources.
  • Non-fiction books, educational books, or textbooks for now
  • Well-edited books: a brief scan should not turn up any typos or grammatical errors
  • Long books: Currently we will only consider books longer than ~20 pages
  • Is an official publication by or with permission from the author

When to not link a web book

  • Unofficial copies of a book
  • Incomplete or in-progress books
  • Copies available through existing Trusted Book Providers (e.g. Project Gutenberg, LibriVox, etc.). These should be imported programmatically.
  • Partially completed translations (e.g. https://git-scm.com/book/en/v2)
  • Unofficial fan fiction
  • Books shorter than ~20 pages
  • Web books with many ads
  • Web books which require a subscription or submitting an email before getting access to the book

Some of these rules might be loosened in the future, but while this project is in pilot, these are what we are looking for.
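The scope and exclusion rules above can be read as a checklist. The following is a minimal sketch of that checklist, not an Open Library tool: the `WebBookCandidate` record, its field names, and `reasons_not_to_link` are all hypothetical, and the 20-page cutoff is treated as a hard threshold even though the guideline says "~20".

```python
from dataclasses import dataclass

MIN_PAGES = 20  # guideline says "~20 pages"; a hard cutoff here is an assumption


@dataclass
class WebBookCandidate:
    """Hypothetical record of what a librarian observed about a candidate."""
    readable_in_full_online: bool
    is_nonfiction_or_educational: bool
    is_well_edited: bool
    approx_pages: int
    is_official_publication: bool
    is_complete: bool = True  # False for in-progress books or partial translations
    on_trusted_provider: bool = False
    has_many_ads: bool = False
    requires_signup_or_subscription: bool = False


def reasons_not_to_link(book: WebBookCandidate) -> list[str]:
    """Return the guideline violations found; an empty list means 'in scope'."""
    checks = [
        (not book.readable_in_full_online, "not available in full to read online"),
        (not book.is_nonfiction_or_educational, "not non-fiction, educational, or a textbook"),
        (not book.is_well_edited, "typos or grammatical errors in a brief scan"),
        (book.approx_pages < MIN_PAGES, "shorter than ~20 pages"),
        (not book.is_official_publication, "unofficial copy"),
        (not book.is_complete, "incomplete or in-progress"),
        (book.on_trusted_provider, "already covered by a Trusted Book Provider"),
        (book.has_many_ads, "too many ads"),
        (book.requires_signup_or_subscription, "requires a subscription or email"),
    ]
    return [reason for failed, reason in checks if failed]
```

A candidate that returns an empty list still needs a human judgment call, especially for PDFs/EPUBs, which are considered case by case.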

Examples

FAQ

When should I create a separate edition?

A separate edition should be created when a meaningful change has been made, such as a translation or a major content change. This is the same standard you would apply to print books. It can be a bit tricky with web books, since it's not always clear when changes have been made.

If the primary author has changed, this should be added as a new work.

What should I do about covers?

Covers can be a bit tricky, since many web books don't have a fixed cover image. You might have to do something creative, like taking a clever screenshot, as was done for https://openlibrary.org/books/OL40220570M/Deep_Learning . The web book here doesn't have a cover, but by resizing the window we can capture a screenshot of the graphic used at a nice aspect ratio. Web book covers often won't have the portrait aspect ratio we're used to with print books; they might be square or even landscape!

DO NOT create/design entirely new covers for web books that are missing covers. Only small tweaks/variations of text/images already on the web book should be used as covers.

Which link should I link to?

In general, we link directly to the book, not to an "about page" for the book. This makes it easier for users and saves them from having to hunt around for the "Read Online" button, which can appear anywhere on a third-party site.

E.g. for http://openlibrary.org/books/OL35722376M , we link to the first page, https://poignant.guide/book/chapter-1.html , instead of the book's homepage, https://poignant.guide/book .

Exception: If linking directly to the book contents does not provide a way to view other chapters, the contents, etc, then link to the book's homepage.

E.g. the first page of https://www.deeplearningbook.org/ is arguably https://www.deeplearningbook.org/contents/TOC.html , but from that page there is no way to get to any other page! So the read link is the book's homepage instead.
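The rule and its exception can be sketched as a small decision function. This is an illustration only: `choose_read_link` and its parameters are hypothetical names, and whether the first page offers navigation is something a librarian judges by looking at the page.

```python
def choose_read_link(first_page_url: str, homepage_url: str,
                     first_page_has_navigation: bool) -> str:
    """Prefer the first page of the book; fall back to the homepage
    when the first page gives no way to reach other chapters."""
    return first_page_url if first_page_has_navigation else homepage_url


# The two examples from this FAQ:
poignant = choose_read_link(
    "https://poignant.guide/book/chapter-1.html",
    "https://poignant.guide/book",
    first_page_has_navigation=True,
)
deep_learning = choose_read_link(
    "https://www.deeplearningbook.org/contents/TOC.html",
    "https://www.deeplearningbook.org/",
    first_page_has_navigation=False,
)
```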

Do web books need to have official identifiers like ISBN to be included?

No, books can still qualify as web books even if they do not have an ISBN. In fact, most web books will likely not have an ISBN.

How can I find the publication date if one isn't listed?

If one isn't listed, you can use the Wayback Machine to try to find the year when the book was first made available. Be sure to find the year when it was first made available in full; sometimes web books are put online while they are still in development, but we consider them "published" only once they have been completed.

For example, for the web book https://selectstarsql.com/ , you can prefix the URL with waybackmachine.org/ to see it in the Wayback Machine: waybackmachine.org/https://selectstarsql.com/ .
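The prefixing step can be sketched as follows. The helper names are hypothetical, and the `https://` scheme in front of `waybackmachine.org/` is an assumption. The second helper uses the `web.archive.org/web/*/` calendar form, which these guidelines do not mention; it lists captures by year, which helps when looking for the first complete snapshot.

```python
WAYBACK_PREFIX = "https://waybackmachine.org/"        # prefix named in the guidelines (scheme assumed)
CALENDAR_PREFIX = "https://web.archive.org/web/*/"    # calendar view; not part of the guidelines


def wayback_lookup_url(book_url: str) -> str:
    """Build the Wayback Machine lookup URL by prefixing the book's URL."""
    return WAYBACK_PREFIX + book_url.strip()


def wayback_calendar_url(book_url: str) -> str:
    """Build the URL of the per-year capture calendar for the book's URL."""
    return CALENDAR_PREFIX + book_url.strip()


lookup = wayback_lookup_url("https://selectstarsql.com/")
calendar = wayback_calendar_url("https://selectstarsql.com/")
```

Neither helper makes a network request; open the resulting URL in a browser and check each early capture to confirm the book was already complete at that point.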

What should I do about books in multiple languages?

If a web book is available in multiple languages, create a separate edition for each language. Create editions only for fully completed translations.

For example: https://git-scm.com/book/en/v2 should have editions for each of the languages listed in the "Full translation available in" section.