1 Prepare GCP and Install Hadoop
Introduction
This guide supports students in the lab sessions of the Real Time Analysis course at Universiti Teknologi PETRONAS, Malaysia.
Lab Goal
The goal of this lab is to learn and practice installing Hadoop. We use Hadoop 3.2.2 on Ubuntu 18.04 LTS, running on Google Cloud Platform (GCP).
Prerequisite
Get GCP Ready
- Enter your billing details. If you are new to GCP, you get $300 of free credit to use
- Create your first VM. Choose the region closest to you (Singapore) so the connection is fast
- Ensure that HTTPS traffic is allowed in the firewall settings of the VM
- You can use any OS you like. However, to keep the lab easy to debug, we will all use the same OS, i.e. Ubuntu 18.04 LTS
Connect to the VM via SSH
- You can use any SSH client (e.g. PuTTY, a terminal, etc.)
- However, GCP provides a very convenient environment for accessing the VM. We can use just our browser! :D
Install Java
Check if java is installed by running this command
java -version
If you see this output:
Command 'java' not found, but can be installed with:
apt install default-jre
apt install openjdk-11-jre-headless
apt install openjdk-8-jre-headless
Ask your administrator to install one of them.
then Java is not installed yet. Refresh the package list and install Java using the following commands
sudo apt-get update
sudo apt-get install default-jre
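The check-then-install step above can also be written as a small script. This is only a sketch: the helper name have_java is made up for illustration, and the script only prints the install command instead of running it, so it is safe to try anywhere.

```shell
#!/usr/bin/env bash
# have_java is a hypothetical helper name, not part of any tool.
# It succeeds when a "java" executable can be found on PATH.
have_java() {
    command -v java >/dev/null 2>&1
}

if have_java; then
    # Java is present: show which version we got
    java -version
else
    # Java is missing: remind ourselves how to install it
    echo "java not found; install it with:"
    echo "  sudo apt-get update && sudo apt-get install default-jre"
fi
```

After installing, run the script again (or simply java -version) to confirm that Java is now found.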
Download and Install Hadoop
Run this command in the terminal
curl -o hadoop-3.2.2.tar.gz https://downloads.apache.org/hadoop/common/hadoop-3.2.2/hadoop-3.2.2.tar.gz
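If the download finishes suspiciously fast, curl may have saved an error page instead of the archive, since plain curl -o does not fail on HTTP errors. The sketch below is one possible way to guard against that. The function names hadoop_url and fetch_hadoop are made up for illustration, and the second base URL assumes that older releases are kept on the Apache archive site under the same directory layout.

```shell
#!/usr/bin/env bash
# hadoop_url and fetch_hadoop are hypothetical helper names.

# Build the download URL for a Hadoop version under a given base URL.
hadoop_url() {
    local base="$1" version="$2"
    echo "${base}/hadoop-${version}/hadoop-${version}.tar.gz"
}

# Try the main download site first, then the archive site (assumed layout).
fetch_hadoop() {
    local version="$1" base
    for base in https://downloads.apache.org/hadoop/common \
                https://archive.apache.org/dist/hadoop/common; do
        # -f: fail on HTTP errors instead of saving the error page
        # -L: follow redirects
        if curl -fL -o "hadoop-${version}.tar.gz" "$(hadoop_url "$base" "$version")"; then
            return 0
        fi
    done
    echo "could not download hadoop-${version}" >&2
    return 1
}

# Usage (not run automatically): fetch_hadoop 3.2.2
```

Calling fetch_hadoop 3.2.2 leaves the same hadoop-3.2.2.tar.gz file in the current directory, so the next step stays unchanged.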
Now extract the archive
tar xvfz hadoop-3.2.2.tar.gz
Set JAVA_HOME
Find where java is installed
which java
It will likely print the following path
/usr/bin/java
Now set JAVA_HOME in the Hadoop settings to point to where Java is installed
nano hadoop-3.2.2/etc/hadoop/hadoop-env.sh
Press CTRL+W and search for the line containing export JAVA_HOME. Remove the comment sign (#) at the start of that line, then add the Java path, so it becomes
export JAVA_HOME=/usr
Notice that we leave out the /bin/java part of the path printed by which java, because Hadoop appends bin/java to JAVA_HOME itself. Save the file with CTRL+O, then exit nano with CTRL+X.
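To see how JAVA_HOME relates to the output of which java, here is a small sketch. The helper name java_home_from is made up for illustration. The optional second part resolves symbolic links with readlink -f (available on Ubuntu) to show the real Java directory, which is an alternative value for JAVA_HOME and not something this lab requires.

```shell
#!/usr/bin/env bash
# java_home_from is a hypothetical helper name.
# It strips the trailing /bin/java from the path of a java binary.
java_home_from() {
    local java_bin="$1"
    echo "${java_bin%/bin/java}"
}

# The value used in this lab
java_home_from /usr/bin/java        # prints /usr

# Optional: follow symlinks to find the real installation directory
if command -v java >/dev/null 2>&1; then
    java_home_from "$(readlink -f "$(command -v java)")"
fi
```

Either value works as long as a bin/java executable exists inside the directory you set.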
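Check if Hadoop is installed
Go into the extracted directory, then run the hadoop script
cd hadoop-3.2.2
./bin/hadoop
If the command runs and prints its usage help, then you have Hadoop working in local (standalone) mode.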
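As a slightly stronger check, you can ask Hadoop for its version with the version subcommand. The sketch below wraps that in a helper. The name check_hadoop is made up for illustration, and the path in the last line assumes you extracted the archive in your home directory.

```shell
#!/usr/bin/env bash
# check_hadoop is a hypothetical helper name.
# It runs "hadoop version" from the given install directory,
# or reports an error if the hadoop script is not there.
check_hadoop() {
    local dir="$1"
    if [ ! -x "$dir/bin/hadoop" ]; then
        echo "no executable found at $dir/bin/hadoop" >&2
        return 1
    fi
    "$dir/bin/hadoop" version
}

# Assumes the archive was extracted in the home directory
check_hadoop "$HOME/hadoop-3.2.2" || echo "Hadoop is not ready yet"
```

If JAVA_HOME is set wrongly in hadoop-env.sh, this is also where the error will show up.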
Congratulations!
This concludes our lab tonight
See you tomorrow :D