How to: Jupyter Notebook - QMSKI/TransparentQMSKI GitHub Wiki
** PAGE UNDER CONSTRUCTION **
Basic
- What is Jupyter notebook?
- Install and launch JupyterLab
- What does JupyterLab look like?
- Writing code and running cells
- Creating the narrative
- Tips and tricks
Advanced
- Making your workflow reproducible: Automatic data download, data tidying, and dependencies
- Increasing interactivity with widgets
- Mixing Python and R
- Extensions
- Slides for presentations
- Upgrading to the latest version
Jupyter notebook is a web application used to create documents containing narrative, live code, and visualizations. It is one of the most widely used tools for creating reproducible workflows.
The easiest way to work with Jupyter notebooks is through Anaconda, a distribution of Python and R:
- Download the latest version of Anaconda (unless you have different requirements for your project)
- Install Anaconda following the specifics for your operating system
- Launch Anaconda:
Anaconda includes various software and applications. For the Python environment, in the first row you can see (from right to left):
- Spyder: A Python integrated development environment (IDE), very similar to Matlab
- IPython: First generation of computational notebooks
- Jupyter notebook: Second generation of computational notebooks
- JupyterLab: Third generation of computational notebooks
- Launch JupyterLab
When you launch JupyterLab, you see something similar to this in your browser:
Explore:
- Top bar with file commands (green)
- Side panel showing your directory. Navigate to your working directory in the blue square
- Open your notebook by clicking on "Python 3" under "Notebook" (red)
- Right-click on the notebook name to change its name (red). Explore the other possible actions:
[description]
Examples of how to create great narratives are in A gallery of interesting Jupyter notebooks
- Show line numbers in cells: in the top bar, click View -> Show Line Numbers
- Comment or uncomment lines: select lines + control + /
- Indent lines:
  - In: select lines + tab
  - Out: select lines + shift + tab
- Merge or split cells:
  - Merge: select the cells you want to merge, then in the top bar click Edit -> Merge Selected Cells
  - Split: click where you want to split your cell, then in the top bar click Edit -> Split Cell
When you release your Jupyter notebook, you want it to be fully reproducible: another researcher should be able to run the cells without manually downloading or modifying data, and should be able to reproduce the computational environment you used. More guidance on making your notebook reproducible is in [1].
Once you are ready to release your project, upload your data to Zenodo. In your notebook, add and adapt these lines:
import wget
zenodo_URL = "https://zenodo.org/record/2583184/files/" # 2583184 is the last part of the Zenodo DOI
file_name = "inHouse_segmented.zip"
local_directory = "./"
wget.download(zenodo_URL + file_name, local_directory)
Another researcher will then be able to download your data automatically into their working directory.
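If you prefer not to depend on the wget package, the same download can be sketched with the Python standard library. The helper names `build_zenodo_url` and `download_if_missing` are hypothetical, introduced here only for illustration. The URL pattern is copied from the snippet above and is assumed to match your Zenodo record. Skipping files that already exist avoids repeating the download each time the notebook is re-run.

```python
import os
import urllib.request


def build_zenodo_url(record_id, file_name):
    """Build the file URL using the same pattern as the wget snippet."""
    return f"https://zenodo.org/record/{record_id}/files/{file_name}"


def download_if_missing(url, local_path, fetch=urllib.request.urlretrieve):
    """Download url to local_path unless the file already exists.

    Returns True if a download happened, False if the file was already there.
    The fetch argument can be swapped out, e.g. for testing without a network.
    """
    if os.path.exists(local_path):
        return False
    fetch(url, local_path)
    return True


# Example call (commented out so that nothing is downloaded unintentionally):
# url = build_zenodo_url("2583184", "inHouse_segmented.zip")
# download_if_missing(url, "./inHouse_segmented.zip")
```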
Consider your notebook as a workflow that reads raw data as input and outputs derived data. In many cases, input raw data need some processing, e.g. column deletion, conversion of measurement units, etc. Data tidying has to be reproducible and thus should be performed in the notebook. When your input data is in a tabular form (e.g. .xlsx, .csv), you can use the python library pandas to import and tidy the data. For example:
import pandas as pd
# read the tabular file (replace "file_name.csv" with the path to your file)
dataframe = pd.read_csv("file_name.csv")
# keep only the column "subject_ID" (the result is a one-column DataFrame)
data = dataframe.loc[:, ["subject_ID"]]
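The two tidying steps mentioned earlier, column deletion and unit conversion, might look like the sketch below. The data and the column names `height_cm` and `notes` are invented for illustration, and the CSV is read from an in-memory string so that the example runs without an external file.

```python
import io

import pandas as pd

# Hypothetical raw data; the column names are made up for this example.
raw_csv = io.StringIO(
    "subject_ID,height_cm,notes\n"
    "s01,170,ok\n"
    "s02,182,repeat scan\n"
)


def tidy(dataframe):
    """Drop the free-text column and convert height from centimeters to meters."""
    tidy_df = dataframe.drop(columns=["notes"])  # returns a new DataFrame
    tidy_df["height_m"] = tidy_df["height_cm"] / 100.0
    return tidy_df.drop(columns=["height_cm"])


dataframe = pd.read_csv(raw_csv)
data = tidy(dataframe)
print(data)
```

Keeping the tidying in a function leaves the raw `dataframe` untouched, so a reader can compare input and output in the same notebook.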
If your tabular data is in a Google Sheet file, here is how to import it in the notebook
To make code reproducible, it is important to specify dependencies. Versions of packages can change over time and functions can work differently after major updates.
Dependencies are commonly declared at the end of a notebook. The magic extension watermark prints the versions of Python, IPython, and selected packages, along with hardware information.
Note that package names are passed to watermark separated by commas, with no spaces in between (e.g. numpy,pandas).
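In a notebook cell, watermark is typically loaded with `%load_ext watermark` and then called as, for example, `%watermark -v -m -p numpy,pandas`. Because magics only run inside IPython, the sketch below shows a plain-Python alternative that collects similar information with the standard library. It is not watermark itself, and the function name `report_versions` is hypothetical.

```python
import platform
from importlib import metadata


def report_versions(packages):
    """Return a dict with the Python version, basic machine info, and package versions."""
    report = {
        "python": platform.python_version(),
        "system": platform.system(),
        "machine": platform.machine(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment.
            report[name] = "not installed"
    return report


# Adapt the list to the packages your notebook imports.
for key, value in report_versions(["pandas", "numpy"]).items():
    print(f"{key}: {value}")
```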
Widgets are elements commonly found in graphical user interfaces (GUIs), such as radio buttons, dropdown menus, sliders, etc. A great introduction with code examples is here.
To install widgets, open a terminal and type:
pip install ipywidgets
jupyter nbextension enable --py widgetsnbextension
jupyter labextension install @jupyter-widgets/jupyterlab-manager
Restart JupyterLab if the widgets are not displayed.
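Once ipywidgets is installed, a slider can be connected to a function with `interact`. The following is a minimal sketch: the function names `square` and `show_square_slider` are hypothetical, and the slider only appears when the code runs inside a notebook.

```python
def square(n):
    """Plain function whose result is shown each time the slider moves."""
    return n * n


def show_square_slider():
    """Display an integer slider (0 to 10) that calls square() on every change.

    ipywidgets is imported here so that this code still loads
    in environments where the package is not installed.
    """
    from ipywidgets import IntSlider, interact

    return interact(square, n=IntSlider(min=0, max=10, step=1, value=3))


# In a notebook cell, call:
# show_square_slider()
```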
For medical imaging, an interesting widget is itkwidgets, which allows 3D visualization with various colormaps. To install itkwidgets, open a terminal and type:
pip install itkwidgets
jupyter labextension install @jupyter-widgets/jupyterlab-manager itk-jupyter-widgets
Restart JupyterLab if the visualization is not displayed.
Interesting links:
- Collection of Jupyter extensions
- Summary of Jupyter extensions
In a terminal, type:
conda install -c conda-forge jupyterlab
If using ipywidgets:
- In a terminal, type:
jupyter labextension install @jupyter-widgets/jupyterlab-manager
- Upgrade for version compatibility with the appropriate command
[1] Rule et al. Ten Simple Rules for Reproducible Research in Jupyter Notebooks. 2018