Extension Refactor and Redesign - QuantEcon/sphinxcontrib-jupyter GitHub Wiki
An update to thinking on the refactor and redesign to document some decisions made.
We have two streams of Translators:
- Code Block (Sparse Translators) -> we are providing a `SphinxSparseTranslator`
- Notebook (SphinxTranslators)

`JupyterIPYNBTranslator` will be set up as the base Translator to provide a notebook that contains markdown and code-block elements. This translator acts as a parent class to the `PDF` and `HTML` translators, which override the methods required to produce notebooks suitable for website and PDF construction.
Code execution is taken care of by a new `execute` builder. We have implemented a `codetree` to provide caching for code execution prior to the translation stage. All translators are able to get output attached to the notebook objects from the `execute` builder.
This page documents some of the decisions and ideas for the upcoming refactor and redesign of the extension. The aim of this work is to use the lessons learnt from past decisions to refactor the extension into a more logical design that will be easier to maintain in the long term, and to release version `1.0`.
Current Docs: https://sphinxcontrib-jupyter.readthedocs.io/en/latest/
- `jupyter` -> Targets readable Jupyter notebooks (with options to support more advanced features such as tables with html)
- `jupyterhtml` -> Targets the construction of a website (html pages, download notebooks, coverage badges, html themes)
- `jupyterpdf` -> Targets the construction of PDF files via LaTeX
Each `builder` will have its own entry point and will target different folders in `_build`, and each builder will have its own `translator` to keep the pathways logically separate. This refactoring will greatly reduce the number of required options in `conf.py` as each builder is a specialised compilation pipeline. Emphasis will be on reducing required configuration (for example we can enforce a theme that requires html and pdf templates to be in a specific location).
Discussion/Questions:
- Should we have `jupytercoverage` or `jupytertest` for coverage execution testing to support `jupyterhtml` and/or as a standalone tool for reporting execution errors?
- Jupyter is used as an intermediate format only as it provides an execution layer. One idea would be to rework `code-block` execution at the `sphinx` transform layer to alleviate the need to use `Jupyter` in this way. However, one benefit of using it as an intermediate layer is that we could support `Jupyter` sources easily, which is a big plus.
The primary use case that the `jupyter` builder should support is constructing readable notebooks with an emphasis on markdown inclusion. These can then be used for generating notebook sets for tutorials, lectures and courses.
Option | Description |
---|---|
jupyter_conversion_mode | `all` or `code` |
jupyter_static_path | Specify static file path |
jupyter_language | Specify default programming language (python3) |
jupyter_language_synonyms | parsing blocks for python3 and ipython |
jupyter_solution_notebooks | Build solution notebooks that include tagged solutions for code-blocks |
or
jupyter = {
    "static_path": "<path for static folder>",
    "conversion_mode": "all",
    "language": "python3",
    "language_synonyms": ["ipython"],
}
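
One way the grouped `jupyter` dictionary could be consumed is to merge it over a set of defaults and reject unknown keys, so that typos in `conf.py` fail early. This is only a sketch: `JUPYTER_DEFAULTS` and `resolve_options` are hypothetical names, and the default values are assumptions rather than settled decisions.

```python
# Assumed defaults for illustration only.
JUPYTER_DEFAULTS = {
    "static_path": None,
    "conversion_mode": "all",
    "language": "python3",
    "language_synonyms": ["ipython"],
}


def resolve_options(user_options, defaults=JUPYTER_DEFAULTS):
    """Merge the user's grouped options over the defaults."""
    unknown = set(user_options) - set(defaults)
    if unknown:
        raise KeyError(f"unknown option(s): {sorted(unknown)}")
    resolved = dict(defaults)
    resolved.update(user_options)
    if resolved["conversion_mode"] not in ("all", "code"):
        raise ValueError("conversion_mode must be 'all' or 'code'")
    return resolved
```

With this shape a project only states what differs from the defaults, for example `resolve_options({"conversion_mode": "code"})`.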
Notes:
- `jupyter_static_path` won't be needed if we build a library of static assets (see discussion below)
- Remove `jupyter_header_block`?
- Remove `jupyter_language` and infer? Perhaps combine with `jupyter_kernel = python3`
- `jupyter_course_solutions` is different to the current implementation, which drops solutions, as that approach requires two runs of `sphinx` to build the desired collections.
Should general options be specified at this level, such as `author`?
Option | Description |
---|---|
jupyter_author | |
Consider A: Lecture/Course Support
Redesign course / lecture support to be more specialised and useful, and perhaps include it as a separate builder, `jupyterlectures`. Options such as `jupyter_drop_solutions` would not be required: if a `solutions` class is found then it would generate two sets of notebooks: 1. a base set for the lecture, and 2. a solutions set which includes solution cells.
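
A minimal sketch of producing both notebook sets in one pass is given below. It assumes that solution cells end up carrying a `solutions` tag in their cell metadata (how the `solutions` class maps onto cells is not decided here), and it treats a notebook as a plain dictionary in the nbformat v4 JSON layout. `split_solutions` is a hypothetical helper name.

```python
import copy


def split_solutions(notebook, tag="solutions"):
    """Return (lecture, solutions) notebooks from one source notebook.

    The solutions notebook keeps every cell; the lecture notebook drops
    any cell whose metadata tags include `tag`.
    """
    solutions = copy.deepcopy(notebook)
    lecture = copy.deepcopy(notebook)
    lecture["cells"] = [
        cell
        for cell in lecture["cells"]
        if tag not in cell.get("metadata", {}).get("tags", [])
    ]
    return lecture, solutions
```

Because both sets come from the same in-memory notebook, a single `sphinx` run is enough to write them out.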
The primary use case that the `jupyterhtml` builder should support is the generation of websites, using Jupyter as an intermediate format.
Option | Description |
---|---|
jupyterhtml_template | Specify conversion template |
jupyterhtml_downloadnb | Generate download notebooks |
jupyterhtml_downloadnb_urlpath | Specify online server for internet based assets |
or
jupyterhtml = {
    "theme": "minimal",
    "download_notebooks": True,  # or False
    "download_notebooks_urlpath": "<html path to server assets for images>",
}
Notes:
- items like `jupyter_generate_html`, `jupyter_make_site`, will not be required when it is a specialised build pathway.
- `template` paths not required if we enforce a theme structure for `jupinx` projects
Update writers to use the `sphinx.util.docutils` base class for Translators.
https://github.com/sphinx-doc/sphinx/blob/8c7faed6fcbc6b7d40f497698cb80fc10aee1ab3/sphinx/util/docutils.py#L429
We will have:
- `JupyterCodeTranslator` -> code-only
- `JupyterTranslator` -> support for full `ipynb` representation of `rst`
- `JupyterHTMLTranslator` -> `html`
- `JupyterPDFTranslator` -> `pdf`
The primary use case that the `jupyterpdf` builder should support is the generation of PDF files, using Jupyter as an intermediate format. This includes individual PDF files of each RST file as well as a book style PDF of the whole project.
Option | Description |
---|---|
jupyterpdf_template | Specify path to conversion template |
jupyterpdf_bibfile | Specify path to bibfile location |
jupyterpdf_author | |
jupyterpdf_urlpath | Specify urlpath for external links for externally hosted content |
or
jupyterpdf = {
    "template": "<path>",
    "bibfile": "<path>",
    "urlpath": "<path>",
}
Notes:
- Should `jupyterpdf_author` be handled as `jupyter_author`?
- It is a bit strange to require usage of `theme` here as none of the theme is useful except for the `pdf` conversion template. Perhaps the `pdf` template should be divorced from the `html` theme? Or at the very least we should specify a pdf template path rather than a theme.
Different Builders: (Accepted)
Refactoring into the different builders will alleviate the current confusion around setting options and their effect. Another option set we could consider is to use a pipeline option approach for collections of options. Related Issues https://github.com/QuantEcon/sphinxcontrib-jupyter/issues/199.
We will minimise code duplication across the different translators through inheritance from a base class (as all require code-block handling):
Translator Classes
`JupyterCode`
-> `JupyterNotebook`, `JupyterHTML`, `JupyterPDF`
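
The inheritance idea can be sketched as follows, with code-block handling written once in the base class and each subclass overriding only what differs. This is a simplified illustration: the method signatures take plain strings rather than docutils nodes, and the per-target paragraph handling is placeholder behaviour, not the planned output.

```python
class JupyterCode:
    """Base translator: code-block handling shared by all subclasses."""

    def __init__(self):
        self.cells = []

    def visit_literal_block(self, source):
        # Shared by every translator, so it is defined only once.
        self.cells.append(("code", source))

    def visit_paragraph(self, text):
        # Code-only output ignores prose.
        pass


class JupyterNotebook(JupyterCode):
    def visit_paragraph(self, text):
        self.cells.append(("markdown", text))


class JupyterHTML(JupyterCode):
    def visit_paragraph(self, text):
        # Placeholder: a website pipeline might emit html fragments.
        self.cells.append(("markdown", f"<p>{text}</p>"))


class JupyterPDF(JupyterCode):
    def visit_paragraph(self, text):
        # Placeholder: a LaTeX pipeline might keep raw cells.
        self.cells.append(("raw", text))
```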
Notebook Executor: (Develop)
We should write notebook execution as a utility that all classes can use to manage notebook execution in a consistent way. The utility will rely on `dask[distributed]`. We can then add `cached` notebook execution based on content changes. It would be nice if it can support:
- parsing output blocks for error handling and testing at the cellblock level.
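
The caching idea can be sketched without `dask`: hash the notebook's cell sources and only re-run when the hash changes. In the sketch below `CachedExecutor`, `content_key` and `find_errors` are hypothetical names, and `run_notebook` stands in for whatever actually executes the cells. The error check relies on the nbformat convention that failed cells produce an output with `output_type` equal to `"error"`.

```python
import hashlib


def content_key(sources):
    """Hash all cell sources so that any content change yields a new key."""
    digest = hashlib.sha256()
    for source in sources:
        digest.update(source.encode("utf-8"))
        digest.update(b"\x00")  # separator so ["ab"] differs from ["a", "b"]
    return digest.hexdigest()


class CachedExecutor:
    """Run a notebook only when its content has changed."""

    def __init__(self, run_notebook):
        self._run_notebook = run_notebook  # callable: list[str] -> list[dict]
        self._cache = {}
        self.runs = 0

    def execute(self, sources):
        key = content_key(sources)
        if key not in self._cache:
            self._cache[key] = self._run_notebook(sources)
            self.runs += 1
        return self._cache[key]


def find_errors(outputs):
    """Return the outputs that report an execution error."""
    return [out for out in outputs if out.get("output_type") == "error"]
```

Hashing the whole notebook rather than individual cells is deliberate here, since a cell's result can depend on state left behind by earlier cells.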
Static Asset Management: (Develop)
Management of static assets needs to be researched to see how we can integrate more closely with Sphinx internal management of `ref` and `uri` objects in the document tree. We should make use of sphinx as much as possible for managing these link types, catering to:
- flat structures
- nested folder structures
- ability to have local `static` folders for lecture series use case
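
To make the flat, nested and local-folder cases concrete, the sketch below computes the relative link from a document to an asset when both are addressed from the project root. `asset_uri` is a hypothetical helper and the example paths are assumptions; how Sphinx's own `ref` and `uri` handling would be used instead is exactly what still needs researching.

```python
import posixpath


def asset_uri(docname, asset_path):
    """Relative URI from the notebook built for `docname` to `asset_path`.

    Both arguments are relative to the project root and use '/' separators.
    """
    doc_dir = posixpath.dirname(docname)
    return posixpath.relpath(asset_path, start=doc_dir or ".")
```

The same function covers a flat project (`asset_uri("intro", "_static/fig.png")`), a nested one (`asset_uri("lectures/week1/intro", "_static/fig.png")`), and a lecture series with its own local `_static` folder.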
Pandoc: (Consider)
Investigate greater use of `pandoc` directly for conversions? Is it useful to consider converting RST to MARKDOWN via pandoc as it may make IPYNB -> HTML conversion easier?
Installable Themes: (Consider)
If themes and templates were installable this would save on user configuration requirements in `theme` and/or `templates`. [Low Priority]
Execution:
Investigate `transforms` as an opportunity to develop an execution engine for each code-block. It might be useful to build a `DAG` representation of the various target types (i.e. jupyter and coverage notebooks) to greatly reduce the amount of computation required across the various pipelines. If this was possible we could support `code-block` execution and then target `html`, `latex` and `jupyter` natively through their respective writers with additional code to handle interfaces to executed code-blocks.
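
A small sketch of the DAG idea, using the standard library `graphlib` module (Python 3.9+): every target depends on a single shared `execute` step, so code-blocks run once however many outputs are requested. The node names and the `PIPELINE` layout are assumptions chosen for illustration, and `build_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter

# Each key maps to the set of steps it depends on (assumed layout).
PIPELINE = {
    "execute": set(),
    "jupyter": {"execute"},
    "coverage": {"execute"},
    "html": {"execute"},
    "latex": {"execute"},
}


def build_order(targets, graph=PIPELINE):
    """Return the steps needed for `targets`, dependencies first."""
    needed = set()
    stack = list(targets)
    while stack:
        node = stack.pop()
        if node not in needed:
            needed.add(node)
            stack.extend(graph[node])
    subgraph = {node: graph[node] for node in needed}
    return list(TopologicalSorter(subgraph).static_order())
```

Requesting `build_order(["html", "jupyter"])` schedules `execute` a single time ahead of both writers, which is the saving the DAG is meant to provide.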
http://docutils.sourceforge.net/docs/ref/transforms.html
Review collaborative opportunities with: https://jupyter-sphinx.readthedocs.io/en/latest/, https://github.com/jupyter/jupyter-sphinx