Hadoop Installation
sudo apt update
sudo apt install openjdk-8-jdk
sudo adduser hadoop
sudo usermod -aG sudo hadoop
sudo passwd hadoop
Log in as the hadoop user:
ssh hadoop@<your-vm-ip>
Generate and configure key-based SSH:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
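Re-running the `cat >>` line above appends the same key again each time. A minimal sketch of an idempotent variant follows; `authorize_key` is a hypothetical helper name, not an OpenSSH command, and it assumes the public key file holds a single line.

```shell
# Append a public key to an authorized_keys file only if it is not already there.
# authorize_key is a hypothetical helper, not part of OpenSSH.
authorize_key() {
    pub_file=$1
    auth_file=$2
    key=$(cat "$pub_file") || return 1
    touch "$auth_file"
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
    chmod 600 "$auth_file"
}

# Usage on the VM:
# authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```

Afterwards, `ssh localhost` should log in without asking for a password.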
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz
tar xzf hadoop-3.2.1.tar.gz
mv hadoop-3.2.1 hadoop
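Optionally, the tarball can be checked before it is extracted. Apache release directories usually carry a `.sha512` file next to the archive; assuming that is the case here, the digest can be copied from it by hand. `verify_sha512` below is a hypothetical helper, and the sketch assumes GNU `sha512sum` is installed.

```shell
# verify_sha512 FILE EXPECTED_HEX
# Succeeds only if the SHA-512 digest of FILE equals EXPECTED_HEX.
# Hypothetical helper; the expected digest must be supplied by the caller.
verify_sha512() {
    file=$1
    # Normalise the expected digest: lower-case, no spaces or newlines.
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f' | tr -d ' \n')
    actual=$(sha512sum "$file" | awk '{print $1}')
    [ "$actual" = "$expected" ]
}

# Usage (the digest is a placeholder to fill in from the published file):
# verify_sha512 hadoop-3.2.1.tar.gz "<digest>" && echo "checksum OK"
```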
Edit .bashrc:
nano ~/.bashrc
Add to the end:
export HADOOP_HOME=/home/hadoop/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
Apply the changes:
source ~/.bashrc
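To confirm that the new variables are visible in the current shell, a small check like the following can help. It is only a sketch: `check_hadoop_env` is a hypothetical function name, and the variable list simply mirrors the exports above.

```shell
# Report any of the Hadoop variables that are unset or empty.
# check_hadoop_env is a hypothetical helper, not a Hadoop command.
check_hadoop_env() {
    missing=0
    for name in HADOOP_HOME HADOOP_MAPRED_HOME HADOOP_COMMON_HOME \
                HADOOP_HDFS_HOME YARN_HOME HADOOP_COMMON_LIB_NATIVE_DIR; do
        # Indirect lookup that also works in plain POSIX sh.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "not set: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
# check_hadoop_env && echo "environment looks complete"
```

If everything is set, `hadoop version` should also resolve, since `$HADOOP_HOME/bin` is on the PATH.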
Edit hadoop-env.sh:
nano $HADOOP_HOME/etc/hadoop/hadoop-env.sh
Add:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
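The path above is what the Ubuntu openjdk-8-jdk package typically uses on amd64; on other architectures or distributions it may differ. As a sketch, the JDK root can be derived from the resolved location of the `java` binary. `java_home_from_binary` is a hypothetical helper, and the `/jre` handling assumes the OpenJDK 8 layout.

```shell
# java_home_from_binary PATH_TO_JAVA
# Prints the JDK root for a fully resolved java binary path.
# Hypothetical helper; assumes the binary lives in bin/ or jre/bin/.
java_home_from_binary() {
    path=$1
    path=${path%/bin/java}   # strip the trailing /bin/java
    path=${path%/jre}        # OpenJDK 8 keeps java under jre/bin
    printf '%s\n' "$path"
}

# Usage:
# java_home_from_binary "$(readlink -f "$(command -v java)")"
```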
nano $HADOOP_HOME/etc/hadoop/core-site.xml
Paste inside the <configuration> element:
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
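For reference, this is roughly what the complete file looks like once the property sits inside its `<configuration>` wrapper. The sketch writes it with a heredoc instead of nano; `write_core_site` is a hypothetical helper, it overwrites any existing core-site.xml in the target directory, and it uses `fs.defaultFS`, the current name of the key that older releases called `fs.default.name`.

```shell
# write_core_site CONF_DIR
# Writes a minimal single-node core-site.xml into CONF_DIR (overwrites!).
# Hypothetical helper, shown only to illustrate the full file layout.
write_core_site() {
    conf_dir=$1
    cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
}

# Usage:
# write_core_site "$HADOOP_HOME/etc/hadoop"
```

The other three site files follow the same pattern: one `<configuration>` element containing the `<property>` blocks.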
nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml
Paste inside the <configuration> element:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/hadoop/hadoopdata/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/hadoop/hadoopdata/hdfs/datanode</value>
</property>
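The two storage paths above do not exist yet. Hadoop can normally create them itself when the parent directory is writable, but creating them up front as the hadoop user makes the ownership explicit. A minimal sketch, with `make_hdfs_dirs` as a hypothetical helper name:

```shell
# make_hdfs_dirs BASE
# Creates BASE/hdfs/namenode and BASE/hdfs/datanode if they are missing.
# Hypothetical helper; BASE matches the hadoopdata directory used above.
make_hdfs_dirs() {
    base=$1
    mkdir -p "$base/hdfs/namenode" "$base/hdfs/datanode"
}

# Usage:
# make_hdfs_dirs /home/hadoop/hadoopdata
```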
nano $HADOOP_HOME/etc/hadoop/mapred-site.xml
Paste inside the <configuration> element:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
nano $HADOOP_HOME/etc/hadoop/yarn-site.xml
Paste inside the <configuration> element:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
hdfs namenode -format
cd $HADOOP_HOME/sbin/
./start-all.sh
jps
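On a single-node setup, `jps` should list NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager after `start-all.sh`. The sketch below reads `jps` output on standard input and prints whichever of those are absent; `missing_daemons` is a hypothetical helper, and it assumes the usual `<pid> <name>` output format.

```shell
# missing_daemons: read jps output on stdin, print expected daemons not found.
# Hypothetical helper; prints nothing when all five daemons are running.
missing_daemons() {
    # The second column of jps output is the process name.
    running=$(awk '{print $2}')
    for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # -x matches the whole line, so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$running" | grep -qx "$d"; then
            echo "$d"
        fi
    done
}

# Usage:
# jps | missing_daemons
```

If a daemon is reported missing, its log file under `$HADOOP_HOME/logs` is the first place to look.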
- NameNode: http://<yourip>:9870
- ResourceManager: http://<yourip>:8088