How to: Jupyter Notebook - QMSKI/TransparentQMSKI GitHub Wiki

** PAGE UNDER CONSTRUCTION **

Basic

Advanced

What is Jupyter notebook?

Jupyter notebook is a web application used to create documents containing narrative, live code, and visualizations. It is one of the most widely used tools for creating reproducible workflows.

Install and launch JupyterLab

The easiest way to work with Jupyter notebooks is through Anaconda, a distribution of Python and R:

  • Download the latest version of Anaconda (unless you have different requirements for your project)
  • Install Anaconda following the specifics for your operating system
  • Launch Anaconda:
    [Image: anaconda]
    Anaconda includes various software applications. For the Python environment, in the first row you can see (from right to left):
    • Spyder: A Python integrated development environment (IDE), very similar to MATLAB
    • IPython: First generation of computational notebooks
    • Jupyter notebook: Second generation of computational notebooks
    • JupyterLab: Third generation of computational notebooks
  • Launch JupyterLab

What does JupyterLab look like?

When you launch JupyterLab, you see something similar to this in your browser:
[Image: jupyter_lab1]

Explore:

  • Top bar with file commands (green)
  • Lateral panel showing your directory. Navigate to your working directory in the blue square

Open your notebook by clicking on "Python 3" under "Notebook" (red)

Right click on the notebook name to change its name (red). Explore the other possible actions:
[Image: jupyter_lab2]

Writing code and running cells

Creating the narrative

[description]
Examples of how to create great narratives are in A gallery of interesting Jupyter notebooks

Tips and tricks

  • Show line numbers in cells: In the top bar, click View -> Show Line Numbers
  • Comment or uncomment lines: select lines + control + /
  • Indent lines:
    • In: select lines + tab
    • Out: select lines + shift + tab
  • Merge or split cells:
    • Merge: Select the cells you want to merge, then in the top bar click Edit -> Merge Selected Cells
    • Split: Click where you want to split your cell, then in the top bar click Edit -> Split cell

Making your workflow reproducible: Automatic data download, data tidying, and dependences

When you release your Jupyter notebook, you want it to be fully reproducible: another researcher must be able to run the cells without manually downloading or modifying data, and must be able to reproduce the computational environment you used. Further guidance on making notebooks reproducible is given in [1].

Automatic data download

Once you are ready to release your project, upload your data to Zenodo. In your notebook, add and adapt these lines:

import wget

zenodo_URL      = "https://zenodo.org/record/2583184/files/" # 2583184 is the last part of the Zenodo DOI
file_name       = "inHouse_segmented.zip" 
local_directory = "./"

wget.download(zenodo_URL + file_name, local_directory) 

Another researcher will then be able to download your data automatically into their working directory.
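
As a variation on the snippet above, the following is a minimal sketch that uses only the Python standard library and skips the download when the file is already present, so that re-running the notebook does not fetch the data again. The helper name `download_if_missing` is hypothetical and is not part of wget or Zenodo.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve


def download_if_missing(url, local_directory="./"):
    """Download url into local_directory unless the file is already there.

    Hypothetical helper for illustration; returns the local file path.
    """
    # Use the last component of the URL path as the local file name
    file_name = Path(urlparse(url).path).name
    target = Path(local_directory) / file_name
    if not target.exists():
        # Create the destination directory if needed, then fetch the file
        target.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, str(target))
    return target
```

With the variables defined above, a call could look like `download_if_missing(zenodo_URL + file_name, local_directory)`.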

Data tidying

Consider your notebook as a workflow that reads raw data as input and outputs derived data. In many cases, the raw input data need some processing, e.g. column deletion, conversion of measurement units, etc. Data tidying has to be reproducible and thus should be performed in the notebook. When your input data is in tabular form (e.g. .xlsx, .csv), you can use the Python library pandas to import and tidy the data. For example:

import pandas as pd

# read the tabular file (the file name must be passed as a string)
dataframe = pd.read_csv("file_name.csv")

# get only the column "subject_ID"
data = dataframe.loc[:, ["subject_ID"]]
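
The other processing steps mentioned above (column deletion and unit conversion) might be sketched as follows. This is only an illustration on a small in-memory table: the column names `height_cm` and `notes` and all values are made up, not taken from a real dataset.

```python
import io

import pandas as pd

# Made-up raw data standing in for a real .csv file
raw_csv = io.StringIO(
    "subject_ID,height_cm,notes\n"
    "S01,172.0,ok\n"
    "S02,165.5,repeat scan\n"
)

dataframe = pd.read_csv(raw_csv)

# Column deletion: drop a column that is not needed downstream
tidy = dataframe.drop(columns=["notes"])

# Unit conversion: centimeters to meters, stored in a new column
tidy["height_m"] = tidy["height_cm"] / 100.0
tidy = tidy.drop(columns=["height_cm"])

print(tidy)
```

Keeping each tidying step as its own commented statement makes it easier for another researcher to follow how the derived table was obtained from the raw file.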

If your tabular data is in a Google Sheets file, here is how to import it into the notebook
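
One possible approach, which may differ from the method originally linked here, is to read the sheet through its CSV export URL. The sketch below assumes the sheet is shared so that anyone with the link can view it; the helper `google_sheet_csv_url` and the placeholder `YOUR_SHEET_ID` are hypothetical.

```python
def google_sheet_csv_url(sheet_id, gid=0):
    """Build the CSV export URL for one tab of a Google Sheets file.

    Hypothetical helper: sheet_id is the long identifier in the sheet's
    address, gid identifies the tab (0 is usually the first tab).
    """
    return (
        "https://docs.google.com/spreadsheets/d/"
        f"{sheet_id}/export?format=csv&gid={gid}"
    )


# The resulting URL can be passed to pandas like any other CSV source
# (requires network access and a publicly viewable sheet):
# import pandas as pd
# dataframe = pd.read_csv(google_sheet_csv_url("YOUR_SHEET_ID"))
print(google_sheet_csv_url("YOUR_SHEET_ID"))
```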

Dependences

To make code reproducible, it is important to specify dependences. Versions of packages can change over time and functions can work differently after major updates.

Dependences are commonly declared at the end of a notebook. The watermark magic extension prints the versions of Python, IPython, and selected packages, along with hardware information:

[Image: jupyter_dependences]

Note that packages are listed separated by commas, with no spaces in between.
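
In a notebook cell, watermark is typically loaded with `%load_ext watermark` and called with something like `%watermark -v -m -p numpy,pandas`; the exact command in the original image is not recoverable, so treat this as an assumption. If watermark is not installed, similar information can be collected with the standard library alone. The following is a stand-in sketch, not watermark itself, and `report_versions` is a hypothetical helper.

```python
import platform
from importlib import metadata


def report_versions(packages):
    """Return a dict of Python, platform, and package versions.

    Hypothetical helper for illustration; packages is a list of
    distribution names as used by pip (e.g. "pandas").
    """
    report = {
        "python": platform.python_version(),
        "system": platform.system(),
        "machine": platform.machine(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages instead of raising an error
            report[name] = "not installed"
    return report


for name, version in report_versions(["pandas", "ipywidgets"]).items():
    print(f"{name}: {version}")
```

Placing such a cell at the end of the notebook records the environment next to the results it produced.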

Increasing interactivity with widgets

Widgets are elements commonly found in graphical user interfaces (GUIs), such as radio buttons, dropdown menus, sliders, etc. A great introduction with code examples is here.

To install widgets, go to terminal and type:

pip install ipywidgets
jupyter nbextension enable --py widgetsnbextension
jupyter labextension install @jupyter-widgets/jupyterlab-manager  

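Restart JupyterLab if the widgets do not display. On recent versions (JupyterLab 3 or later with ipywidgets 7.6 or later), the pip install step alone is usually sufficient and the two jupyter commands may not be needed.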

For medical imaging, an interesting widget library is itkwidgets, which allows 3D visualization with various colormaps. To install itkwidgets, go to terminal and type:

pip install itkwidgets
jupyter labextension install @jupyter-widgets/jupyterlab-manager itk-jupyter-widgets

Restart JupyterLab if the visualization does not appear

Mixing python and R

Extensions

Interesting links:

Slides for presentations

Upgrading to the latest version

In terminal, type:

conda install -c conda-forge jupyterlab

If using ipywidgets:

  • In terminal type:
    jupyter labextension install @jupyter-widgets/jupyterlab-manager
    
  • Upgrade for version compatibility with the appropriate command

References

[1] Rule et al. Ten Simple Rules for Reproducible Research in Jupyter Notebooks. 2018
