Template README - pglevy/github-for-creators GitHub Wiki
ebook-template
Overview
This repository is a template for a project that'll build an eBook (in ePub, PDF, Microsoft Word and HTML form) from Markdown input files.
tl;dr: You write your book as a series of Markdown files, adhering to some
file naming conventions, and you run the ./build
command (see Building your book) to build your book.
There are sample files in this repository, so you can build a (completely pointless and utterly useless) eBook right away.
This tooling has been tested with Pandoc versions 2.0.4 and 2.0.5.
If you're impatient, jump to Getting Started.
What's where
- Your book's Markdown sources, cover image, and some metadata go in the book subdirectory. This is where you'll be doing your editing.
- The files subdirectory contains files used by the build. For instance, the HTML and ePub style sheets are there, as are LaTeX templates (used for PDF output) and a Microsoft Word style reference document. You shouldn't need to touch anything in files.
- The scripts subdirectory currently just contains a Pandoc filter used to provide enhanced markup. You shouldn't need to touch anything in scripts.
- The lib directory contains some additional Python code used by the build. Ignore it.
- Your book output files (book.docx, book.epub, book.pdf and book.html) are generated in the topmost directory.
- The build will also generate a subdirectory called tmp to hold some temporary files. Git is configured to ignore that directory.
Supported output formats
This tooling will generate your book in the following formats:
ePub
book.epub
ePub is the format used by Apple's iBooks and various free readers, including Calibre.
PDF
book.pdf is a single PDF document, generated by LaTeX or Weasy Print.
Issues:
- LaTeX PDF generation uses the LaTeX "article" document class, rather than the seemingly more suitable "book" class, because the "book" class, combined with Pandoc's LaTeX generation, is just a little too funky.
- Weasy Print-generated PDF has no table of contents.
HTML
book.html is a single-page HTML, styled in a pleasant format.
Microsoft Word
book.docx is a Microsoft Word version of your book.
Issues:
- There's no table of contents.
- The cover image is not included in the Word document.
Unsupported formats
Kindle (MOBI)
Pandoc can't generate books in Kindle format. However, there are several options for generating Kindle content:
- Haul the Microsoft Word version into Kindle Create.
- Use the free and open source Calibre suite to convert the ePub format to Kindle format.
Getting started
Start by downloading and unpacking the latest release of this repository. (By downloading a release, instead of cloning the repository, you can more easily create your own Git repository from the results.)
Then, install the required software and update the configuration files.
Can I use Docker? Why, yes!
If you don't want to install the dependencies on your machine, you can create
a Docker image to isolate them. Originally courtesy of
@szaffarano, and modified more recently,
there's a ./build-docker script in the top-level directory.
Instead of running ./build to build your book, simply run ./build-docker.
When you run it, the script will pull the latest Docker image from
bclapper/ebook-template (on Docker Hub), and it will use Docker to build
your book.
Using this approach guarantees a consistent environment that has the right versions of Python, Pandoc, and the other tools.
Upgrading
If you're already using this tooling for one of your books, and you want to upgrade to a newer version, the process (currently) is straightforward:
- Download and unpack the new version, as described above. Don't unpack it over your project!
- Run the new version's upgrade.py file from your project's top-level directory, passing it the path to the unpacked new release.
For example:
cd /tmp
tar xf /path/to/downloaded/ebook-template-X.Y.Z.tgz
cd /path/to/your/ebook
/tmp/ebook-template-X.Y.Z/upgrade.py /tmp/ebook-template-X.Y.Z
Note that this copies files, removing ones that aren't necessary any more.
If there are metadata changes, however, upgrade.py won't apply them.
Be sure to read the change log for the new release.
Required software
- Install pandoc.
- Install a Python distribution, version 3.6 or better.
  - On Mac OS, brew install python3 will suffice.
  - On Ubuntu/Debian, this article might help.
  - On Windows, see https://www.python.org/downloads/windows/.
- I recommend creating and activating a Python virtual environment, to keep the installed version of Python 3 more or less pristine.
- Once you have your Python 3 environment set up (and activated, if you're using a virtual environment), install the required Python packages with pip install -r requirements.txt
- You can generate PDF via either LaTeX or Weasy Print. There are advantages and disadvantages to each; see Generating PDF, below.
  - If you'll be using LaTeX, install a TexLive distribution, as detailed below.
  - If you'll be using Weasy Print, make sure your Python 3 environment is activated, and follow the directions at http://weasyprint.readthedocs.io/en/latest/install.html
Installing TexLive
- On Mac OS, use MacTex, and ensure that /Library/TeX/texbin is in your path.
- On Ubuntu/Debian, install texlive, texlive-latex-recommended and texlive-latex-extra.
- On Windows, this might work: https://www.tug.org/texlive/windows.html.
WARNING: I avoid Windows as much as possible. I do not (and, likely, never will) test this stuff on Windows. If you insist on using that platform, you're more or less on your own.
Initial configuration
Create your cover image
In your book directory, create a cover image, as a PNG. If you haven't
settled on a cover image yet, you can use the dummy image that's already
there. Currently, the cover image is not optional.
Fill in the metadata
Edit book/metadata.yaml, and fill in the relevant pieces. Both Pandoc
and the build tooling use this metadata.
Note: This file contains Pandoc YAML Metadata, with some additional fields used by this build tooling.
The following elements are recognized. Each is marked as required or optional.
- title (Required): The book title.
- subtitle (Optional): Subtitle, if any.
- author (Required): A YAML list of authors. If there is only one author, use a single-element YAML list. For example:
  author:
  - Joe Horrid
  or, with more than one author:
  author:
  - Joe Horrid
  - Frances Horrid
- copyright (Required): A block with two required fields, owner and year. See the existing sample metadata.yaml for an example.
- publisher (Required): The publisher of the book.
- language (Required): The language in which the book is written. The value can be a 2-letter ISO 639-1 code, such as "en" or "fr". It can also be a 2-part string consisting of the ISO 639-1 language code and the 2-letter ISO 3166 country code, such as "en-US", "en-GB", "fr-CA", "fr-FR", etc.
- genre (Required): The book's genre. See https://wiki.mobileread.com/wiki/Genre for a list of genres.
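As a rough illustration of the rules above, here is a minimal sketch of a metadata check. It is not part of the build tooling: the check_metadata function, the sample values, and the regular expression for the language field are all assumptions made for this example. It takes a plain dictionary, such as the one you would get after loading book/metadata.yaml with a YAML parser.

```python
import re

# Field names come from the list above; the checker itself is hypothetical.
REQUIRED_FIELDS = ("title", "author", "copyright", "publisher", "language", "genre")

# Assumed shape: "en" or "en-US" (2-letter language, optional 2-letter country).
LANGUAGE_RE = re.compile(r"^[a-z]{2}(-[A-Z]{2})?$")


def check_metadata(metadata):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not metadata.get(field):
            problems.append(f"missing required field: {field}")

    # author must be a list, even when there is only one author.
    author = metadata.get("author")
    if author and not isinstance(author, list):
        problems.append("author must be a YAML list, even for one author")

    # copyright is a block with two required fields, owner and year.
    copyright_block = metadata.get("copyright")
    if copyright_block:
        if not isinstance(copyright_block, dict):
            problems.append("copyright must be a block with owner and year")
        else:
            for key in ("owner", "year"):
                if not copyright_block.get(key):
                    problems.append(f"copyright is missing: {key}")

    language = metadata.get("language")
    if language and not LANGUAGE_RE.match(str(language)):
        problems.append(f"language looks malformed: {language}")

    return problems


# Made-up sample values, for illustration only.
sample = {
    "title": "My Book",
    "author": ["Joe Horrid"],
    "copyright": {"owner": "Joe Horrid", "year": 2017},
    "publisher": "Horrid Press",
    "language": "en-US",
    "genre": "Fiction",
}
print(check_metadata(sample))
```

Returning a list of problems, rather than stopping at the first one, lets you fix everything in a single editing pass.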
Edit the copyright information
Edit the book/copyright.md file. You can leave % tokens in there; they'll
be substituted as described, below, in Additional markup.
The meaning of the {<} is also explained in that section.
Markup notes
Your book will use Markdown, as interpreted by Pandoc. The following Pandoc extensions are enabled. See the Pandoc User's Guide for full details.
- line_blocks: Use vertical bars to create lines that are formatted as is. See http://pandoc.org/MANUAL.html#line-blocks for details.
- escaped_line_breaks: A backslash followed by a newline is also a hard line break. See http://pandoc.org/MANUAL.html#extension-escaped_line_breaks for details.
- yaml_metadata_block: Allows metadata in the Markdown. See http://pandoc.org/MANUAL.html#extension-yaml_metadata_block for details.
- smart: Interprets straight quotes as curly quotes, "---" as em-dashes, "--" as en-dashes, and "..." as ellipses. Nonbreaking spaces are inserted after certain abbreviations, such as "Mr.". See http://pandoc.org/MANUAL.html#extension-smart for details.
- backtick_code_blocks, fenced_code_blocks and fenced_code_attributes: Allows fenced code blocks, using backticks (GitHub Flavored Markdown-style) and tildes (~~~). You can also supply attributes (classes, for instance). See http://pandoc.org/MANUAL.html#extension-fenced_code_blocks, http://pandoc.org/MANUAL.html#extension-fenced_code_attributes and http://pandoc.org/MANUAL.html#extension-backtick_code_blocks for details.
Additional markup
The build tool uses a Pandoc filter
(in scripts/pandoc-filter.py) to enrich the Markdown slightly:
- Level 1 headings denote new chapters and force a new page.
- If you want to force a new page without starting a new chapter, just include an empty level-1 header (#). See book/copyright-template.md for an example.
- A paragraph containing just the line +++ is replaced by a centered line containing "• • •". This is a useful separator.
- A paragraph that starts with {<} followed by at least one space is left-justified. See book/copyright-template.md for an example.
- A paragraph that starts with {>} followed by at least one space is right-justified.
- A paragraph that starts with {|} followed by at least one space is centered.
Note, too, that Pandoc automatically converts your quotation marks into
smart quotes, triple dots (...) into an ellipsis, two dashes (--)
into an en-dash, and three dashes (---) into an em-dash.
(The filter is written in Python, using the Panflute package.)
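To make the paragraph markers concrete, here is a minimal sketch of how they could be recognized in plain text. It is not the actual filter in scripts/pandoc-filter.py, which works on Pandoc's document tree through Panflute; the classify_paragraph function and its return values are assumptions made for this illustration.

```python
import re

# Maps the marker character to the alignment described above.
ALIGNMENTS = {"<": "left", ">": "right", "|": "center"}

# A marker, then at least one whitespace character, then the paragraph text.
MARKER_RE = re.compile(r"^\{([<>|])\}\s+(.*)$", re.DOTALL)


def classify_paragraph(text):
    """Return a (kind, content) pair for one paragraph of Markdown text."""
    if text.strip() == "+++":
        # The separator paragraph becomes a centered row of bullets.
        return ("separator", "• • •")
    match = MARKER_RE.match(text)
    if match:
        return (ALIGNMENTS[match.group(1)], match.group(2))
    return ("plain", text)


print(classify_paragraph("{<} Copyright notice"))
print(classify_paragraph("{|} The End"))
print(classify_paragraph("+++"))
```

Note that a marker with no space after it, such as "{<}text", is treated as an ordinary paragraph in this sketch, matching the "followed by at least one space" wording above.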
Support for PlantUML
If you set use_plantuml to true in your metadata, you can use
PlantUML diagrams in your book, using special
fenced code blocks. For instance:
~~~plantuml
@startuml
client->server: SYN
server->client: SYN+ACK
client->server: ACK
@enduml
~~~
You can use either tildes or backticks, and you can also use Pandoc-style fenced code blocks to supply attributes. If you specify an "alt" attribute or a "title" attribute, it will be used as the title for the image (for readers that display the title). If you specify both, "title" is preferred. For example:
~~~ {.plantuml title="4-way handshake"}
@startuml
client->server: FIN
server->client: ACK
server->client: FIN
client->server: ACK
@enduml
~~~
Chapter 1 of the sample book contains these two examples.
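The "title is preferred over alt" rule can be sketched in a couple of lines. This is only an illustration of the precedence; the image_title function is hypothetical, and the attributes are assumed to arrive as a plain dictionary.

```python
def image_title(attributes):
    """Pick the image title from a code block's attributes, or None."""
    # "title" wins when both are present; "alt" is the fallback.
    return attributes.get("title") or attributes.get("alt")


print(image_title({"title": "4-way handshake", "alt": "TCP teardown"}))
print(image_title({"alt": "TCP teardown"}))
```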
Book source file names
The tooling expects your book's Markdown sources to be in the book
subdirectory and to adhere to the following conventions:
- All files must have the extension .md.
- If you create a file called dedication.md, it'll be placed right after the copyright page in the generated output. See dedication.md for an example. If you don't want a dedication, simply delete the provided dedication.md.
- If your book has a foreword, just create file foreward.md, and it'll be inserted right after the dedication. Otherwise, just delete the supplied sample foreward.md.
- If your book has a preface, just create file preface.md, and it'll be inserted right after the foreword. Otherwise, just delete the supplied sample preface.md.
- If the book has a prologue, put it in file prologue.md. It'll appear before the first chapter. If you don't want a prologue, simply delete the provided prologue.md.
- Keep each chapter in a separate file. (This is easier for editing, source control, etc.) Name the files chapter-NN.md. For instance, chapter-01.md, chapter-02.md, etc. The chapter files are sorted lexically, so the leading zeros are necessary if you have more than 9 chapters. If you have more than 99 chapters (seriously?), just add another leading zero (e.g., chapter-001.md). If you must put the entire content in one file, the file's name must start with chapter- and end in .md.
- If the book has an epilogue, put it in file epilogue.md. It'll follow the last chapter. If you don't want an epilogue, simply delete the provided epilogue.md.
- If you create a file called acknowledgments.md, it'll be placed after the epilogue. If you don't want an acknowledgments chapter, simply delete the provided acknowledgments.md.
- If you need one or more appendices, just create files that start with appendix- and end with .md. Note that the files are sorted lexically. There are sample appendix files in book; delete them if you don't want any appendices.
- If you plan to provide a glossary, create glossary.md. If you don't need a glossary, delete the provided sample file.
- If you want to include an author biography, just create author.md.
- If you need a references (bibliography) section, create references.yaml, as described below. If you don't need a bibliography section, just delete the provided sample references.yaml.
NOTE: There's currently no support for generating an index.
Summary of chapter ordering
- title page
- dedication (if present)
- foreword (if present)
- preface (if present)
- prologue (if present)
- all chapters
- epilogue (if present)
- acknowledgments (if present)
- appendices (if present)
- glossary (if present)
- author (if present)
- references (if present)
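The ordering above, together with the lexical sorting of chapters and appendices, can be sketched as follows. This is an illustration only, not the build's own code: the build_order function and the three list names are hypothetical, and the title page is left out because it has no source file in this sketch.

```python
# Fixed-name files, in the order listed above.
FRONT = ["dedication.md", "foreward.md", "preface.md", "prologue.md"]
BEFORE_APPENDICES = ["epilogue.md", "acknowledgments.md"]
AFTER_APPENDICES = ["glossary.md", "author.md", "references.yaml"]


def build_order(filenames):
    """Arrange the files found in the book directory into output order."""
    names = set(filenames)
    # Chapters and appendices are sorted lexically, as plain strings.
    chapters = sorted(
        n for n in names if n.startswith("chapter-") and n.endswith(".md")
    )
    appendices = sorted(
        n for n in names if n.startswith("appendix-") and n.endswith(".md")
    )
    ordered = [n for n in FRONT if n in names]
    ordered += chapters
    ordered += [n for n in BEFORE_APPENDICES if n in names]
    ordered += appendices
    ordered += [n for n in AFTER_APPENDICES if n in names]
    return ordered


# Without a leading zero, chapter-2.md sorts after chapter-10.md.
print(build_order(["chapter-2.md", "chapter-10.md", "chapter-01.md"]))
```

The last line shows why the leading zeros matter: compared as strings, "chapter-10.md" comes before "chapter-2.md".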
Generating PDF
You can generate PDF via either LaTeX or Weasy Print.
| PDF engine | Advantages | Disadvantages |
|---|---|---|
| LaTeX | rich typesetting, table of contents | LaTeX fonts aren't supported by all printers |
| Weasy Print | good printer font support | no table of contents |
Images
Image references to files are relative to the top directory, not to the
book directory. It's best to stick with PNG images.
Table of contents
- PDF: If you're using LaTeX, Pandoc automatically generates the table of contents in the PDF. If you're using Weasy Print, there's no table of contents.
- ePub: Pandoc generates the table of contents as part of the ePub package.
- HTML: The build tool includes JavaScript that generates a table of contents in the browser.
- Word: Pandoc doesn't generate a table of contents for Microsoft Word, because it's trivial to create your own. In newer versions of Microsoft Word (e.g., the version you get with Office 365):
  - Insert a page break to create a new, blank page.
  - Select "References" from the menu bar.
  - Select "Table of Contents", and select your desired style.
Bibliographic references
If you're writing a book that needs a bibliography and uses citations in the text, there's a bit of extra work.
First, install pandoc-citeproc.
- On Mac OS, use brew install pandoc-citeproc.
- On Ubuntu/Debian, it should have been installed when you installed pandoc.
- On Windows, it should have been installed when you installed pandoc.
Next, you'll need to create the bibliography YAML file,
book/references.yaml, suitably organized for pandoc to consume. The sample
book/references.yaml contains a single entry. You can hand-code this file,
or you can use pandoc-citeproc to generate it from an existing bibliographic
file (e.g., a BibTeX file).
See the citations section in the Pandoc User's Guide and the
pandoc-citeproc man page
for more details.
NOTE: The presence of a book/references.yaml file triggers the build
tooling to include a References chapter, to which pandoc will add any
cited works. Your bibliography (book/references.yaml) can contain as many
references as you want; only the ones you actually cite in your text will show
up in the References section. If your text contains no citations, the
References section will be empty. The build tooling does not check first to
see whether you actually have any citations in your text.
An example of a citation is:
[See @WatsonCrick1953]
Again, see the citations section of the Pandoc User's Guide for full details.
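Since the build does not check whether your text actually cites anything, a quick check of your own can tell you whether the References section will come out empty. The following is a minimal sketch under simplifying assumptions: the cited_keys function is hypothetical, and the regular expression only handles keys made of letters, digits and underscores, which is narrower than what Pandoc accepts.

```python
import re

# An "@" not preceded by a word character or a dot (which skips most e-mail
# addresses), followed by a simple key. Pandoc allows more characters in keys.
CITATION_RE = re.compile(r"(?<![\w.])@([A-Za-z0-9_]+)")


def cited_keys(markdown_text):
    """Return the set of citation keys that appear in the text."""
    return set(CITATION_RE.findall(markdown_text))


print(cited_keys("The structure of DNA [See @WatsonCrick1953] is a double helix."))
```

You could run this over each file in the book directory and compare the result with the entries in book/references.yaml.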
Styling your book
The ePub styling uses files/epub.css, and the HTML is styled with
files/html.css.
You can change the styling by providing your own version of those files
in the book directory. That is:
- If book/html.css exists, it will be used instead of files/html.css.
- If book/epub.css exists, it will be used instead of files/epub.css.
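The override rule amounts to "use the copy in book if it exists, otherwise fall back to the one in files". A minimal sketch of that lookup, assuming a hypothetical pick_stylesheet helper that is not part of the build script:

```python
from pathlib import Path


def pick_stylesheet(project_dir, name):
    """Return the style sheet to use: the book/ override if present."""
    override = Path(project_dir) / "book" / name
    default = Path(project_dir) / "files" / name
    # Prefer the author's own copy; otherwise use the stock style sheet.
    return override if override.is_file() else default


print(pick_stylesheet(".", "html.css"))
print(pick_stylesheet(".", "epub.css"))
```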
Building your book
Once you've prepared everything, as described above, you can rebuild the book by running the command:
./build
./build is a Python script using the Python doit
build tool. You should not need to edit it; editing metadata.yaml is
sufficient to specify the information about your book.
Other useful build targets
- ./build version: Show what version of this tooling you have.
- ./build docx: Build just the Microsoft Word version of the book.
- ./build pdf: Build just the PDF version of the book.
- ./build epub: Build just the ePub version of the book.
- ./build html: Build just the HTML version of the book.
You can combine targets:
./build docx pdf
Cleaning up generated files
To clean up the built targets:
./build clean
To clean everything out (except doit-db.json, which won't go away):
./build clobber
Auto-building
Because ./build is a doit script, it supports auto-building. If you
run it as follows:
./build auto
it will build your book (if it's not up-to-date), then wait; any time one or more of the source Markdown files changes, it will automatically rebuild your book. To stop it, just hit Ctrl-C.
NOTE: Auto-building will not detect the addition of new files. For
instance, if you're running in auto-build mode, and you add a new
chapter-03.md file, the build script will not detect it. You'll have to
kill the auto-build and restart it.
Gotchas
Combining targets doesn't always work as expected. For instance, from traditional
make(1) usage, you might expect ./build clean pdf to run the "clean" target,
then run the "pdf" target. Instead, it just runs the "clean" operation for
the PDF. (That's a doit quirk.)
Copyright and License
This software is copyright © 2017 Brian M. Clapper and is released under the GPL, version 3, similar to the license the underlying Pandoc software uses. See the LICENSE for further details.