HDP, Zeppelin and Python3 - stanislawbartkowski/wikis GitHub Wiki
Problem
HDP Zeppelin does not ship with a Python 3 interpreter out of the box. A few steps are needed to enable it.
https://zeppelin.apache.org/docs/0.6.2/interpreter/python.html
Install Python 3
Python 3
The first step is to install Python 3, which is not included in the base CentOS or RHEL installation. Install it on every node in the cluster.
yum install python36
Verify.
python3
Python 3.6.8 (default, Aug 7 2019, 17:28:10)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
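Because the installation has to be repeated on every node, it can help to generate the commands rather than type them by hand. The following is only a sketch: the host names, the `root` login and the `-y` flag (added so that yum does not prompt) are assumptions, not part of the original procedure. The script only prints the ssh command lines and does not execute anything.

```python
import shlex

# Hypothetical host names; replace with the actual cluster nodes.
NODES = ["node1.example.com", "node2.example.com", "node3.example.com"]


def build_install_commands(nodes, package="python36", user="root"):
    """Return one ssh command line per node that installs the package with yum."""
    # -y is an assumption: it answers yum's confirmation prompt automatically.
    remote = f"yum install -y {shlex.quote(package)}"
    return [
        f"ssh {shlex.quote(user + '@' + node)} {shlex.quote(remote)}"
        for node in nodes
    ]


# Dry run: print the commands so they can be reviewed before being executed.
for command in build_install_commands(NODES):
    print(command)
```

The printed lines can be reviewed and then pasted into a shell, or fed to whatever remote execution tool is already used on the cluster.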
Install ML Python packages
Verify that pip is enabled.
python3 -m ensurepip
WARNING: Running pip install with root privileges is generally not a good idea. Try `__main__.py install --user` instead.
Requirement already satisfied: setuptools in /usr/lib/python3.6/site-packages
Requirement already satisfied: pip in /usr/lib/python3.6/site-packages
Upgrade pip
python3 -m pip install --upgrade pip
WARNING: Running pip install with root privileges is generally not a good idea. Try `__main__.py install --user` instead.
Collecting pip
Downloading https://files.pythonhosted.org/packages/30/db/9e38760b32e3e7f40cce46dd5fb107b8c73840df38f0046d8e6514e675a1/pip-19.2.3-py2.py3-none-any.whl (1.4MB)
100% |████████████████████████████████| 1.4MB 1.0MB/s
Installing collected packages: pip
Successfully installed pip-19.2.3
Install packages
python3 -m pip install numpy
python3 -m pip install pandas
python3 -m pip install scikit-learn
python3 -m pip install matplotlib
python3 -m pip install py4j
Verify
python3
Python 3.6.8 (default, Aug 7 2019, 17:28:10)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sklearn
>>> import matplotlib
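The interactive check above covers two of the five packages. A small script can check all of them in one go, which is convenient when the same verification has to be repeated on every node. This is a minimal sketch; the helper name `check_packages` is made up for illustration, and a package without a `__version__` attribute is reported as "unknown".

```python
import importlib

# Import names of the packages installed above
# (scikit-learn is imported as sklearn).
PACKAGES = ["numpy", "pandas", "sklearn", "matplotlib", "py4j"]


def check_packages(names):
    """Map each module name to its version string, or None if the import fails."""
    result = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            result[name] = None
        else:
            result[name] = getattr(module, "__version__", "unknown")
    return result


for name, version in check_packages(PACKAGES).items():
    print(f"{name}: {version if version is not None else 'NOT INSTALLED'}")
```

Saved to a file, the script can be run with `python3` on each node; any line ending in NOT INSTALLED points to a package that still needs `pip install`.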
Zeppelin
List existing interpreters
/usr/hdp/current/zeppelin-server/bin/install-interpreter.sh -l
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
........
alluxio Alluxio interpreter
angular HTML and AngularJS view rendering
beam Beam interpreter
bigquery BigQuery interpreter
cassandra Cassandra interpreter built with Scala 2.11
elasticsearch Elasticsearch interpreter
file HDFS file interpreter
flink Flink interpreter built with Scala 2.11
hbase Hbase interpreter
ignite Ignite interpreter built with Scala 2.11
jdbc Jdbc interpreter
kylin Kylin interpreter
lens Lens interpreter
livy Livy interpreter
md Markdown support
pig Pig interpreter
python Python interpreter
scio Scio interpreter
shell Shell command
Enable python interpreter
/usr/hdp/current/zeppelin-server/bin/install-interpreter.sh -n python
(be patient, can take several minutes)
...
Interpreter python installed under /usr/hdp/current/zeppelin-server/interpreter/python.
1. Restart Zeppelin
2. Create interpreter setting in 'Interpreter' menu on Zeppelin GUI
3. Then you can bind the interpreter on your note
Ambari, Zeppelin configuration
Ambari->Zeppelin->Configs
Parameter | Value | Example |
---|---|---|
zeppelin.interpreter.group.order | Add python group at the end | spark,angular,jdbc,livy,md,sh,python |
Restart Zeppelin from Ambari console.
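The change to zeppelin.interpreter.group.order amounts to appending `python` to a comma-separated list if it is not already there. As a sketch, with a hypothetical helper name and the example value from the table standing in for whatever Ambari currently shows:

```python
def add_interpreter_group(order, group="python"):
    """Append group to a comma-separated interpreter order unless already present."""
    groups = [g.strip() for g in order.split(",") if g.strip()]
    if group not in groups:
        groups.append(group)
    return ",".join(groups)


# Example value only; copy the real one from Ambari->Zeppelin->Configs.
current = "spark,angular,jdbc,livy,md,sh"
print(add_interpreter_group(current))
```

The helper leaves the value unchanged when `python` is already listed, so applying it twice is harmless.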
Add Python 3 interpreter
Log on to Zeppelin as a user with admin authority and open the Interpreter settings. Create a new interpreter.
Parameter | Value | Example |
---|---|---|
Interpreter name | Name of the interpreter | python3 |
Interpreter group | python | |
zeppelin.python | python3 | |
Test
Create a simple python3 notebook.
from platform import python_version
print(python_version())
3.6.8
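If the version printed is not the expected one, it helps to see which executable the paragraph actually runs under. The sketch below uses only the standard library. In a Zeppelin note the paragraph would presumably start with the `%python3` directive, assuming the interpreter was named python3 as in the table above; that directive is left out here so the code also runs as a plain script.

```python
import sys


def interpreter_info():
    """Return the path and version of the Python executable running this code."""
    return {
        "executable": sys.executable,
        "version": tuple(sys.version_info[:3]),
    }


info = interpreter_info()
print(info["executable"], ".".join(str(part) for part in info["version"]))

# Fail loudly if the paragraph is still bound to a Python 2 interpreter.
assert info["version"][0] == 3, "paragraph is not running under Python 3"
```

A path pointing to a Python 2 binary suggests that the zeppelin.python property was not set to python3 or that the note is bound to a different interpreter.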
A second test draws a simple plot, taken from the matplotlib usage tutorial: https://matplotlib.org/tutorials/introductory/usage.html#sphx-glr-tutorials-introductory-usage-py
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 2, 100)
plt.plot(x, x, label='linear')
plt.plot(x, x**2, label='quadratic')
plt.plot(x, x**3, label='cubic')
plt.xlabel('x label')
plt.ylabel('y label')
plt.title("Simple Plot")
plt.legend()
plt.show()