Jupyter Notes - fcrimins/fcrimins.github.io GitHub Wiki

Jupyter Kernels

StackOverflow: Choosing a Spark/Scala kernel for Jupyter/IPython (2/14/17)
I can't speak for all of them, but I use Spark Kernel and it works very well for using both Scala and Spark.
- Spark Kernel is now Apache Toree, which originated at IBM (a red flag to me)
- jove-scala is no more. The GitHub page says to use jupyter-scala instead
- Zeppelin looks pretty well developed. "Zeppelin Notebook - big data analysis in Scala or Python in a notebook, and connection to a Spark cluster on EC2" is a nice explanation of installing it on AWS. Zeppelin is a JVM-based alternative to Jupyter.
- This IBM link says to use jupyter-scala
- Scala Notebook (last commit 2015; from Bridgewater) - An alternative to Jupyter.
- IScala (no commits since 2014)
- jupyter-scala - This looks like the one to use (last commit 1/17)
  - Installation:
    1. cd ~/bin
    2. curl -L -o coursier https://git.io/vgvpD && chmod +x coursier && ./coursier --help per here
    3. cd ~/code
    4. git clone https://github.com/alexarchambault/jupyter-scala.git
    5. cd jupyter-scala
    6. add addSbtPlugin("io.get-coursier" % "sbt-coursier" % "1.0.0-M15") to build.sbt per here
    7. ./jupyter-scala
    - Output: "Use this kernel from Jupyter notebook, running jupyter notebook and selecting the 'Scala' kernel."
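
The installation steps above can be collected into a single script. This is only a sketch under stated assumptions: `DRY_RUN`, `run`, and `install_jupyter_scala` are hypothetical names made up for this example, the `mkdir -p` calls are added on the assumption that `~/bin` and `~/code` may not exist yet, and step 6 (the build.sbt edit) is left as a manual step. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of installation steps 1-7 above. DRY_RUN, run, and install_jupyter_scala
# are hypothetical names for this example, not part of jupyter-scala or coursier.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print each command, 0 = actually execute it

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

install_jupyter_scala() {
  # Steps 1-2: download the coursier launcher into ~/bin and sanity-check it.
  run mkdir -p "$HOME/bin" || return 1
  run curl -L -o "$HOME/bin/coursier" https://git.io/vgvpD || return 1
  run chmod +x "$HOME/bin/coursier" || return 1
  run "$HOME/bin/coursier" --help || return 1

  # Steps 3-5: clone the kernel sources into ~/code.
  run mkdir -p "$HOME/code" || return 1
  run git clone https://github.com/alexarchambault/jupyter-scala.git \
    "$HOME/code/jupyter-scala" || return 1

  # Step 6 (adding the sbt-coursier plugin line to build.sbt) is a manual edit
  # and is intentionally not automated here.

  # Step 7: run the launcher script from inside the checkout.
  run sh -c 'cd "$HOME/code/jupyter-scala" && ./jupyter-scala' || return 1
}

install_jupyter_scala
```

Running it as-is lists the commands, and `DRY_RUN=0 sh install.sh` (with `install.sh` as a placeholder file name) would execute them. Afterwards, `jupyter kernelspec list` should show whether the new kernel was registered.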

6 points to compare Python and Scala for Data Science using Apache Spark

Python is more analytically oriented, while Scala is more engineering-oriented

Example Notebooks (12/6/16)

Python Data Science Handbook from O'Reilly by Jake VanderPlas

Run jupyter notebook inside ~/code/PythonDataScienceHandbook/notebooks/ to see the book
- Local version here: http://localhost:8888/notebooks/01.00-IPython-Beyond-Normal-Python.ipynb
The first chapter gives a nice background on Jupyter
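
The launch step for the book could be wrapped in a pair of small helpers. This is a sketch, not a required setup: `BOOK_DIR`, `JUPYTER_PORT`, `book_url`, and `start_book` are hypothetical names, and the port assumes Jupyter's default of 8888 is free.

```shell
#!/bin/sh
# Hypothetical helpers; the directory and port mirror the notes above.
BOOK_DIR="${BOOK_DIR:-$HOME/code/PythonDataScienceHandbook/notebooks}"
JUPYTER_PORT="${JUPYTER_PORT:-8888}"   # Jupyter's default port

# Print the local URL for a notebook file served from BOOK_DIR.
book_url() {
  echo "http://localhost:${JUPYTER_PORT}/notebooks/$1"
}

# Start the notebook server from the book's notebooks directory.
start_book() {
  if [ ! -d "$BOOK_DIR" ]; then
    echo "not found: $BOOK_DIR" >&2
    return 1
  fi
  (cd "$BOOK_DIR" && jupyter notebook --port="$JUPYTER_PORT")
}

# Show where the first chapter would be served; call start_book to launch the server.
book_url 01.00-IPython-Beyond-Normal-Python.ipynb
```

Only `book_url` runs by default, so nothing is started until `start_book` is called explicitly.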

Jupyter Notebook Tutorial: The Definitive Guide