Optimizing Jupyter Notebooks - BKJackson/BKJackson_Wiki GitHub Wiki

Create high res retina plots in your notebook

From https://gist.github.com/minrk/3301035

# 1. magic for inline plot
# 2. magic to enable retina (high resolution) plots
# https://gist.github.com/minrk/3301035
%matplotlib inline
%config InlineBackend.figure_format = 'retina'  

Or set it up permanently in your config: add the following line to your ipython_kernel_config.py, which for me is in ~/.ipython/profile_default/

c.InlineBackend.figure_format = 'retina'  

If the file does not already exist, you can generate it with all settings commented out by entering ipython profile create at the command line.

Force jupyter to reload modules

%load_ext autoreload  
%autoreload 2      

Only reload a particular module

%load_ext autoreload
%autoreload 1
%aimport mymodule  

Enable large plots in jupyter lab with sidecar

Install:

pip install sidecar
jupyter labextension install @jupyter-widgets/jupyterlab-manager
jupyter labextension install @jupyter-widgets/jupyterlab-sidecar

Usage:

from sidecar import Sidecar
from ipywidgets import IntSlider

sc = Sidecar(title='Sidecar Output')
sl = IntSlider(description='Some slider')
with sc:
    display(sl)  

Tutorials & Videos

Jupyter Notebooks and Production Data Science Workflows

JupyterHub

With JupyterHub you can create a multi-user Hub which spawns, manages, and proxies multiple instances of the single-user Jupyter notebook server.

Project Jupyter created JupyterHub to support many users. The Hub can offer notebook servers to a class of students, a corporate data science workgroup, a scientific research project, or a high performance computing group.

A starter Docker image for JupyterHub provides a baseline deployment of JupyterHub using Docker. ref

JupyterHub also provides a REST API for administration of the Hub and its users.
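As a minimal sketch of using that REST API, the snippet below builds an authenticated request to list the Hub's users. The base URL and the token value are assumptions here — in practice you would point at your own Hub and generate a token (e.g. with `jupyterhub token`):

```python
# Sketch: build an authenticated GET /hub/api/users request for the
# JupyterHub REST API. hub_url and the token are placeholders.
import urllib.request

def list_users_request(hub_url: str, token: str) -> urllib.request.Request:
    """Build an authenticated request to list all users on the Hub."""
    return urllib.request.Request(
        f"{hub_url}/hub/api/users",
        headers={"Authorization": f"token {token}"},
    )

req = list_users_request("http://localhost:8000", "API_TOKEN")
# urllib.request.urlopen(req) would return a JSON list of user records.
```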

Snippets

Papermill readme link

Execute Papermill via the Python API

import papermill as pm

pm.execute_notebook(
   'path/to/input.ipynb',
   'path/to/output.ipynb',
   parameters = dict(alpha=0.6, ratio=0.1)
)

Execute Papermill via CLI

papermill local/input.ipynb s3://bkt/output.ipynb -p alpha 0.6 -p l1_ratio 0.1

Read parameters from a YAML file with -f

papermill local/input.ipynb s3://bkt/output.ipynb -f parameters.yaml
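A parameters.yaml for the run above might look like this — a flat mapping whose keys must match the notebook's parameters cell (values here mirror the -p example):

```yaml
alpha: 0.6
l1_ratio: 0.1
```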

Tracking notebook cell timing with Papermill

https://papermill.readthedocs.io/en/latest/extending-entry-points.html#ensuring-your-engine-is-found-by-papermill  
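Papermill's default engine also records per-cell timing in the output notebook — each executed cell carries a metadata block (under `cell.metadata.papermill`, including a "duration" in seconds, as of papermill 2.x). A small helper, sketched here, can pull those durations out of the saved output notebook's JSON:

```python
# Sketch: extract (cell_index, duration_seconds) pairs from a notebook
# that papermill has executed and saved.
import json

def cell_durations(nb: dict) -> list:
    """Return (cell_index, duration) for each code cell papermill timed."""
    out = []
    for i, cell in enumerate(nb.get("cells", [])):
        meta = cell.get("metadata", {}).get("papermill", {})
        if cell.get("cell_type") == "code" and "duration" in meta:
            out.append((i, meta["duration"]))
    return out

# Usage:
#   with open("path/to/output.ipynb") as fh:
#       nb = json.load(fh)
#   print(cell_durations(nb))
```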

Reading a README.md file with a Jupyter notebook

From a new, blank notebook, paste this in the first cell:

from IPython.display import display, Markdown

with open('README.md', 'r') as fh:
    content = fh.read()

display(Markdown(content))

Post-Save Hooks source

Creating the .py and .html files can be done simply and painlessly by editing the config file:

~/.ipython/profile_nbserver/ipython_notebook_config.py

and adding the following code:

### If you want to auto-save .html and .py versions of your notebook:
# modified from: https://github.com/ipython/ipython/issues/8009
import os
from subprocess import check_call

def post_save(model, os_path, contents_manager):
    """post-save hook for converting notebooks to .py scripts"""
    if model['type'] != 'notebook':
        return # only do this for notebooks
    d, fname = os.path.split(os_path)
    # 'ipython nbconvert' was removed in later IPython releases;
    # use 'jupyter nbconvert' instead
    check_call(['jupyter', 'nbconvert', '--to', 'script', fname], cwd=d)
    check_call(['jupyter', 'nbconvert', '--to', 'html', fname], cwd=d)

c.FileContentsManager.post_save_hook = post_save

Now every save of a notebook updates identically-named .py and .html files. Include these in your commits and pull requests, and you will gain the benefits of each file format.

Jupyter Runner for multiple parameters and multiple sets of parameters (docs)

Notebook execution can happen in parallel with a fixed number of workers.
Note: Only compatible with Python 3.5.

pip install jupyter-runner

jupyter-run notebookA.ipynb notebookB.ipynb  

ENV_VAR=xxx jupyter-run notebook.ipynb
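jupyter-runner passes parameters to the notebook as environment variables, so inside the notebook you read them back with os.environ. A sketch, with the variable name and default value purely illustrative:

```python
# Inside a notebook run via `ENV_VAR=xxx jupyter-run notebook.ipynb`,
# the parameter arrives as an ordinary environment variable.
import os

def get_param(name: str, default: str = "") -> str:
    """Read a jupyter-runner parameter from the environment."""
    return os.environ.get(name, default)

alpha = get_param("ENV_VAR", "default-value")
```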

Jupyter Docker Stack

Jupyter Docker Stacks (official docs) - a set of ready-to-run Docker images containing Jupyter applications and interactive computing tools.
Jupyter Data Science Stack + Docker in under 15 minutes

Databricks and Jupyter Notebooks

Databricks notebook deployment template

Articles

Examples

Publishing Python Notebooks

Making Publication Ready Python Notebooks
Hacking my way to a Jupyter notebook powered blog

Two steps for using notebooks effectively

Since notebooks are challenging objects for source control (e.g., diffs of the JSON are often not human-readable and merging is near impossible), we recommend not collaborating directly with others on Jupyter notebooks. There are two steps we recommend for using notebooks effectively:

  1. Follow a naming convention that shows the owner and the order the analysis was done in. We use the format `<step>-<ghuser>-<description>.ipynb` (e.g., 0.3-bull-visualize-distributions.ipynb).

  2. Refactor the good parts. Don't write code to do the same task in multiple notebooks. If it's a data preprocessing task, put it in the pipeline at src/data/make_dataset.py and load data from data/interim. If it's useful utility code, refactor it to src.

Source: Cookiecutter Data Science
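Step 2 above might look like this in practice — a hypothetical src/data/make_dataset.py holding the shared preprocessing once, so each notebook imports it instead of copying the code (the module path and function are illustrative):

```python
# Hypothetical src/data/make_dataset.py: shared preprocessing lives here
# once, and every notebook imports it instead of re-implementing it.

def drop_incomplete_rows(rows: list) -> list:
    """Keep only rows where every field is non-empty."""
    return [row for row in rows if all(field != "" for field in row)]

# In a notebook:
#   from src.data.make_dataset import drop_incomplete_rows
#   rows = drop_incomplete_rows(raw_rows)
```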
