Day 1 - QCB-Collaboratory/W17.MachineLearning GitHub Wiki

Introduction, Python review and Jupyter notebooks

Slides are available here

The video is available here

Class materials

Here is a (static) Jupyter Notebook with all commands from the first day.
For a LIVE and editable version of Jupyter Notebook:

The live version above requires no account and virtually nothing installed on your own computer. The notebook usually takes a few minutes to load (about 3 minutes on average, and certainly less than 10), but once it is on your screen, it runs smoothly.

Exercise with Decision Trees and Random Forests

In-class practice for Decision Trees and Random Forests.
Click here to download the data used in the Decision Tree practice.

Breast cancer dataset

Breast Cancer Wisconsin (Diagnostic) Data Set, available from the UCI Machine Learning Repository.
Original paper that published this dataset.
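
As a starting point for experimenting outside of class, here is a minimal sketch that fits a decision tree and a random forest to this dataset. It assumes scikit-learn is installed and uses the copy of the Breast Cancer Wisconsin (Diagnostic) data that ships with it (`load_breast_cancer`), not the file downloaded for the in-class practice. The split size, tree depth, and number of trees are arbitrary choices for illustration, not the settings used in class.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load the features (30 measurements per tumor) and the labels
# (0 = malignant, 1 = benign) bundled with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)

# Hold out 25% of the samples for testing; stratify keeps the class
# proportions similar in both splits. These values are illustrative.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A single shallow decision tree (max_depth=3 is an arbitrary choice).
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)

# A random forest: many trees trained on bootstrap samples of the data.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# Accuracy on the held-out samples.
tree_acc = tree.score(X_test, y_test)
forest_acc = forest.score(X_test, y_test)
print(f"Decision tree accuracy: {tree_acc:.3f}")
print(f"Random forest accuracy: {forest_acc:.3f}")
```

Changing `max_depth` or `n_estimators` and re-running the cell is a quick way to see how model complexity affects the held-out accuracy.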

Extra comments

The video from slide 82, published by Mack et al. in Nature Communications, can be found here.

How to install the R kernel for Jupyter?

To make Jupyter work with the R language, you need to install the R kernel. A kernel is the interface between the Notebook and the language.

The kernel for R is called IRkernel. If you are using the Anaconda distribution, you can install it by following this link. If you do not use Anaconda, you need to install it from IRkernel's website.
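
After installing, it can be useful to confirm that Jupyter actually sees the new kernel. The sketch below is one way to check from Python, assuming the kernel was registered under IRkernel's default name `ir`. The helper `r_kernel_installed` is a hypothetical name introduced here for illustration; when called without arguments it relies on the `jupyter_client` package, which is installed alongside Jupyter.

```python
def r_kernel_installed(specs=None, name="ir"):
    """Return True if a kernel called `name` is registered with Jupyter.

    `specs` maps kernel names to their resource directories. When it is
    omitted, the mapping is read from the local Jupyter installation.
    """
    if specs is None:
        # Imported lazily so the function can also be used with a
        # hand-written mapping on machines without Jupyter.
        from jupyter_client.kernelspec import KernelSpecManager
        specs = KernelSpecManager().find_kernel_specs()
    return name in specs


# Usage with a hand-written mapping (the paths are hypothetical).
example_specs = {
    "python3": "/hypothetical/kernels/python3",
    "ir": "/hypothetical/kernels/ir",
}
print(r_kernel_installed(example_specs))
```

On a machine with Jupyter installed, calling `r_kernel_installed()` with no arguments checks the real installation. If it returns `False` after installation, the kernel may have been registered under a different name.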

Great examples of Jupyter Notebooks

Below is a list of great example notebooks to use as inspiration for your own work. Because all of these notebooks are publicly available, you can download them and open them locally to examine them. If you want even more notebooks, check out this gallery of notebooks provided by the Jupyter project.

Genomics and NGS