Extension Refactor and Redesign - QuantEcon/sphinxcontrib-jupyter GitHub Wiki
An update to thinking on the refactor and redesign to document some decisions made.
We have two streams of Translators:
- Code Block (Sparse Translators) -> we are providing a `SphinxSparseTranslator`
- Notebook (SphinxTranslators)
`JupyterIPYNBTranslator` will be set up as the base translator, producing a notebook that contains markdown and code-block elements. It acts as the parent class of the PDF and HTML translators, which override the methods required to produce notebooks suitable for website and PDF construction.
Code Execution is taken care of by a new execute builder. We have implemented a codetree to provide caching for code execution prior to the translation stage. All translators are able to get output attached to the notebook objects from the execute builder.
This page documents some of the decisions and ideas for the upcoming refactor and redesign of the extension. The aim is to use the lessons learnt from past decisions to refactor the extension into a more logical design that will be easier to maintain in the long term, leading to the release of version 1.0.
Current Docs: https://sphinxcontrib-jupyter.readthedocs.io/en/latest/
- `jupyter` -> Targets readable Jupyter notebooks (with options to support more advanced features such as tables with html)
- `jupyterhtml` -> Targets the construction of a website (html pages, download notebooks, coverage badges, html themes)
- `jupyterpdf` -> Targets the construction of PDF files via LaTeX
Each builder will have its own entry point and target a different folder in `_build`, and each will have its own translator to keep the pathways logically separate. This refactoring will greatly reduce the number of options required in `conf.py`, as each builder is a specialised compilation pipeline. Emphasis will be on reducing required configuration (for example, we can enforce a theme that requires html and pdf templates to be in a specific location).
Discussion/Questions:
- Should we have `jupytercoverage` or `jupytertest` for coverage execution testing to support `jupyterhtml` and/or as a standalone tool for reporting execution errors?
- Jupyter is used as an intermediate format only as it provides an execution layer. One idea would be to rework `code-block` execution at the `sphinx` transform layer to alleviate the need to use `Jupyter` in this way. However, one benefit of using it as an intermediate layer is that we could support `Jupyter` sources easily, which is a big plus.
The primary use case that the `jupyter` builder should support is constructing readable notebooks with an emphasis on markdown inclusion. These can then be used for generating notebook sets for tutorials, lectures and courses.
| Option | Description |
|---|---|
| jupyter_conversion_mode | `all` or `code` |
| jupyter_static_path | Specify static file path |
| jupyter_language | Specify default programming language (python3) |
| jupyter_language_synonyms | parsing blocks for python3 and ipython |
| jupyter_solution_notebooks | Build solution notebooks that include tagged solutions for code-blocks |
or

    jupyter = {
        'static_path': '<path for static folder>',
        'conversion_mode': 'all',  # 'all' or 'code'
        'language': 'python3',
        'language_synonyms': ['ipython'],
    }

Notes:
- `jupyter_static_path` won't be needed if we build a library of static assets (see discussion below)
- Remove `jupyter_header_block`?
- Remove `jupyter_language` and infer? Perhaps combine with `jupyter_kernel = python3`
- `jupyter_course_solutions` differs from the current implementation, which drops solutions; that approach requires two runs of `sphinx` to build the desired collections.
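If both the flat `jupyter_*` options and the grouped `jupyter = {...}` form were supported, they would need to be merged somewhere. The following is a minimal sketch of one way that could work. The helper name `resolve_jupyter_options`, the `DEFAULTS` values and the precedence order (defaults, then flat options, then the grouped dict) are all assumptions made for illustration and are not part of the extension.

```python
# Hypothetical helper: none of these names exist in sphinxcontrib-jupyter.
DEFAULTS = {
    "conversion_mode": "all",
    "language": "python3",
    "language_synonyms": ["ipython"],
}


def resolve_jupyter_options(conf):
    """Merge defaults, flat ``jupyter_*`` values and a grouped ``jupyter`` dict.

    Assumed precedence (later wins): defaults < flat options < grouped dict.
    """
    options = dict(DEFAULTS)
    prefix = "jupyter_"
    for name, value in conf.items():
        # "jupyterhtml_..." and "jupyterpdf_..." do not match this prefix.
        if name.startswith(prefix):
            options[name[len(prefix):]] = value
    options.update(conf.get("jupyter", {}))
    return options


# Stand-in for the values a user might set in conf.py.
conf = {
    "jupyter_language": "julia",
    "jupyter": {"conversion_mode": "code"},
    "jupyterhtml_template": "ignored-by-this-builder",
}
options = resolve_jupyter_options(conf)
```

Keeping the merge in one function would let each builder read a single resolved dict rather than checking two spellings of every option.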
Should general options be specified at this level such as author:
| Option | Description |
|---|---|
| jupyter_author |
Consider A: Lecture/Course Support
Redesign course / lecture support to be more specialised and useful, perhaps as a separate builder `jupyterlectures`. Options such as `jupyter_drop_solutions` would not be required: if a solutions class is found, the builder would generate two sets of notebooks: (1) a base set for the lecture, and (2) a solutions set which includes solution cells.
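The two-set idea can be sketched on plain notebook dictionaries. This is only an illustration: it assumes solution cells are marked with a `solution` entry in the cell's `metadata.tags` list, and the function names `is_solution` and `build_notebook_sets` are hypothetical.

```python
import copy

SOLUTION_TAG = "solution"  # assumed tag name, not defined by the extension


def is_solution(cell):
    """True if the cell carries the (assumed) solution tag."""
    return SOLUTION_TAG in cell.get("metadata", {}).get("tags", [])


def build_notebook_sets(notebook):
    """Return {"lecture": nb}, plus a "solutions" entry when solutions exist."""
    sets = {"lecture": copy.deepcopy(notebook)}
    if any(is_solution(cell) for cell in notebook["cells"]):
        # The solutions set keeps every cell; the lecture set drops tagged ones.
        sets["solutions"] = copy.deepcopy(notebook)
        sets["lecture"]["cells"] = [
            cell for cell in sets["lecture"]["cells"] if not is_solution(cell)
        ]
    return sets


notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "Exercise 1"},
        {
            "cell_type": "code",
            "metadata": {"tags": ["solution"]},
            "source": "print(42)",
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}
sets = build_notebook_sets(notebook)
```

Both sets come out of a single pass over the same notebook, which is the property that would avoid running `sphinx` twice.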
The primary use case that the `jupyterhtml` builder should support is the generation of websites, using Jupyter as an intermediate format.
| Option | Description |
|---|---|
| jupyterhtml_template | Specify conversion template |
| jupyterhtml_downloadnb | Generate download notebooks |
| jupyterhtml_downloadnb_urlpath | Specify online server for internet based assets |
or

    jupyterhtml = {
        'theme': 'minimal',
        'download_notebooks': True,  # or False
        'download_notebooks_urlpath': '<html path to server assets for images>',
    }

Notes:
- items like `jupyter_generate_html` and `jupyter_make_site` will not be required when it is a specialised build pathway.
- `template` paths are not required if we enforce a theme structure for `jupinx` projects
Update writers to use sphinx.util.docutils base class for Translators.
https://github.com/sphinx-doc/sphinx/blob/8c7faed6fcbc6b7d40f497698cb80fc10aee1ab3/sphinx/util/docutils.py#L429
We will have:
- `JupyterCodeTranslator` -> code-only
  - `JupyterTranslator` -> support for full `ipynb` representation of `rst`
  - `JupyterHTMLTranslator` -> `html`
  - `JupyterPDFTranslator` -> `pdf`
The primary use case that the `jupyterpdf` builder should support is the generation of PDF files, using Jupyter as an intermediate format. This includes individual PDF files for each RST file as well as a book-style PDF of the whole project.
| Option | Description |
|---|---|
| jupyterpdf_template | Specify path to conversion template |
| jupyterpdf_bibfile | Specify path to bibfile location |
| jupyterpdf_author | |
| jupyterpdf_urlpath | Specify urlpath for external links for externally hosted content |
or

    jupyterpdf = {
        'template': '<path>',
        'bibfile': '<path>',
        'urlpath': '<path>',
    }

Notes:
- Should `jupyterpdf_author` be handled as `jupyter_author`?
- It is a bit strange to require usage of `theme` here as none of the theme is useful except for the `pdf` conversion template. Perhaps the `pdf` template should be divorced from the `html` theme? Or at the very least we should specify a pdf template path rather than a theme.
Different Builders: (Accepted)
Refactoring into the different builders will alleviate the current confusion around setting options and their effect. Another option set we could consider is to use a pipeline option approach for collections of options. Related Issues https://github.com/QuantEcon/sphinxcontrib-jupyter/issues/199.
We will minimise code duplication across the different translators through inheritance from a base class (as all require code-block handling):
Translator Classes
`JupyterCode`
-> `JupyterNotebook`, `JupyterHTML`, `JupyterPDF`
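The inheritance layout above can be sketched in plain Python. This is a simplified illustration only: the classes below do not derive from any Sphinx or docutils base class, the `add_code`, `add_text` and `translate` methods are hypothetical, and the HTML and PDF bodies are placeholders rather than real conversion logic.

```python
class JupyterCode:
    """Base translator: only code blocks become notebook cells."""

    def __init__(self):
        self.cells = []

    def add_code(self, source):
        # Shared by every subclass, so code-block handling lives in one place.
        self.cells.append({"cell_type": "code", "source": source})

    def add_text(self, text):
        # Code-only output ignores prose.
        pass

    def translate(self, blocks):
        """Walk (kind, content) pairs standing in for document nodes."""
        for kind, content in blocks:
            if kind == "code":
                self.add_code(content)
            else:
                self.add_text(content)
        return self.cells


class JupyterNotebook(JupyterCode):
    def add_text(self, text):
        self.cells.append({"cell_type": "markdown", "source": text})


class JupyterHTML(JupyterCode):
    def add_text(self, text):
        # Placeholder: wrap prose in a paragraph tag for the website target.
        self.cells.append({"cell_type": "markdown", "source": f"<p>{text}</p>"})


class JupyterPDF(JupyterCode):
    def add_text(self, text):
        # Placeholder: keep prose as a raw cell for a later LaTeX step.
        self.cells.append({"cell_type": "raw", "source": text})


blocks = [("text", "Intro"), ("code", "x = 1")]
code_only = JupyterCode().translate(blocks)
full = JupyterNotebook().translate(blocks)
```

Each subclass overrides only the prose handling, while `add_code` is inherited unchanged, which is the duplication the base class is meant to remove.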
Notebook Executor: (Develop)
We should write notebook execution as a utility that all classes can use to manage notebook execution in a consistent way. The utility will rely on dask[distributed]. We can then add cached notebook execution based on content changes. It would be nice if it can support:
- parsing output blocks for error handling and testing at the cellblock level.
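Caching execution based on content changes could be keyed on a hash of the code cells. The sketch below is an in-memory illustration under that assumption; it does not use dask, and `cells_digest`, `ExecutionCache` and `fake_execute` are hypothetical names standing in for the real executor.

```python
import hashlib
import json


def cells_digest(code_cells):
    """Stable digest of the code cell sources, used as the cache key."""
    payload = json.dumps(code_cells, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class ExecutionCache:
    """Re-run a notebook only when its code cells have changed."""

    def __init__(self):
        self._results = {}

    def run(self, code_cells, execute):
        key = cells_digest(code_cells)
        if key not in self._results:
            self._results[key] = execute(code_cells)
        return self._results[key]


calls = []


def fake_execute(code_cells):
    # Stand-in for real notebook execution; records how often it is invoked.
    calls.append(list(code_cells))
    return [f"ran: {source}" for source in code_cells]


cache = ExecutionCache()
first = cache.run(["x = 1", "print(x)"], fake_execute)
second = cache.run(["x = 1", "print(x)"], fake_execute)  # served from cache
third = cache.run(["x = 2", "print(x)"], fake_execute)   # content changed
```

Because only code cells feed the digest, edits to prose alone would not trigger re-execution under this scheme.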
Static Asset Management: (Develop)
Management of static assets needs to be researched to see how we can integrate more closely with Sphinx internal management of ref and uri objects in the document tree. We should make use of sphinx as much as possible for managing these link types catering to:
- flat structures
- nested folder structures
- ability to have local `static` folders for the lecture series use case
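The three layouts listed above mostly differ in how a link from a notebook to an asset is computed. A minimal sketch, assuming document names use forward slashes relative to the project root and that the static folder is called `_static` (both assumptions; `static_uri` is a hypothetical helper, not Sphinx API):

```python
import posixpath


def static_uri(docname, asset, static_dir="_static"):
    """Return the URI of ``asset`` relative to the folder containing ``docname``."""
    notebook_dir = posixpath.dirname(docname) or "."
    target = posixpath.join(static_dir, asset)
    return posixpath.relpath(target, start=notebook_dir)


flat = static_uri("intro", "fig.png")
nested = static_uri("part1/intro", "fig.png")
local = static_uri("part1/intro", "fig.png", static_dir="part1/_static")
```

Here `flat` and `local` both resolve to `_static/fig.png`, while `nested` resolves to `../_static/fig.png`, so a single helper could cover all three cases.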
Pandoc: (Consider)
Investigate greater use of pandoc directly for conversions? Is it useful to consider converting RST to MARKDOWN via pandoc as it may make IPYNB -> HTML conversion easier?
Installable Themes: (Consider)
If themes and templates were installable this would save on user configuration requirements in theme and/or templates. [Low Priority]
Execution:
Investigate transforms as an opportunity to develop an execution engine for each code-block. It might be useful to build a DAG representation of the various target types (i.e. jupyter and coverage notebooks) to greatly reduce the amount of computation required across the various pipelines. If this were possible, we could support code-block execution and then target html, latex and jupyter natively through their respective writers, with additional code to handle interfaces to executed code-blocks.
http://docutils.sourceforge.net/docs/ref/transforms.html
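The DAG idea can be illustrated with the standard library's `graphlib` module (Python 3.9+). The task names and dependencies below are assumptions chosen to mirror the targets mentioned above. The point is that a single shared `execute` step can feed every downstream target.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on (hypothetical task names).
pipeline = {
    "execute": set(),
    "jupyter": {"execute"},
    "coverage": {"execute"},
    "html": {"execute"},
    "latex": {"execute"},
}

# A valid run order: every task appears after all of its dependencies.
order = list(TopologicalSorter(pipeline).static_order())
```

With this layout the code-blocks would be executed once, and each writer would consume the cached results instead of re-running them.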
Review collaborative opportunities with: https://jupyter-sphinx.readthedocs.io/en/latest/, https://github.com/jupyter/jupyter-sphinx