Prepare the environment to install and use hadoop 3.2.3 - alkemyTech/OT172-python GitHub Wiki
Hadoop supports Java 8 and Java 11, but the official page indicates that even though Hadoop 3.3.3 supports Java 11, it still asks to be compiled with Java 8. Install OpenJDK 8:
sudo apt install openjdk-8-jdk
sudo update-alternatives --config java
and select the option that corresponds to OpenJDK 8.
wget https://downloads.apache.org/hadoop/common/hadoop-3.2.3/hadoop-3.2.3.tar.gz
or download it directly from the official Hadoop downloads page.
tar -xzvf hadoop-3.2.3.tar.gz
vim ~/.bashrc
and add the following lines at the end of the file:
#hadoop related options
export HADOOP_HOME=/home/hdoop/hadoop-3.2.3
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
Save and execute
source ~/.bashrc
sudo vim $HADOOP_HOME/etc/hadoop/hadoop-env.sh
If the file does not exist, check that the Hadoop version in ~/.bashrc matches the directory you extracted (the line export HADOOP_HOME=/home/hdoop/hadoop-x.x.x), then run source ~/.bashrc again.
If your Java installation is in another path, find it by running this in a terminal:
readlink -f $(which javac)
JAVA_HOME is that path without the trailing /bin/javac.
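In hadoop-env.sh, set JAVA_HOME explicitly. A minimal sketch, assuming the default OpenJDK 8 location on Ubuntu amd64 (adjust it to the path reported by readlink on your machine):

```shell
# Add to $HADOOP_HOME/etc/hadoop/hadoop-env.sh
# /usr/lib/jvm/java-8-openjdk-amd64 is the usual OpenJDK 8 path on Ubuntu;
# it is the readlink output minus the trailing /bin/javac
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
```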
sudo vim $HADOOP_HOME/etc/hadoop/core-site.xml
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hdoop/tmpdata</value>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://127.0.0.1:9000</value>
</property>
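It is safest to create the hadoop.tmp.dir directory up front. A sketch assuming the hdoop user from this guide, whose $HOME is /home/hdoop:

```shell
# Create the temporary data directory referenced in core-site.xml
# ($HOME resolves to /home/hdoop when run as the hdoop user)
mkdir -p "$HOME/tmpdata"
```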
sudo vim $HADOOP_HOME/etc/hadoop/hdfs-site.xml
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/home/hdoop/dfsdata/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/home/hdoop/dfsdata/datanode</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
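The NameNode and DataNode storage directories can likewise be created before formatting. Again assuming the hdoop user, whose $HOME is /home/hdoop:

```shell
# Create the storage directories referenced in hdfs-site.xml
mkdir -p "$HOME/dfsdata/namenode" "$HOME/dfsdata/datanode"
```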
sudo vim $HADOOP_HOME/etc/hadoop/mapred-site.xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
sudo vim $HADOOP_HOME/etc/hadoop/yarn-site.xml
Delete the comment line between the <configuration> tags and add the following properties:
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>127.0.0.1</value>
</property>
<property>
  <name>yarn.acl.enable</name>
  <value>0</value>
</property>
<property>
  <name>yarn.nodemanager.env-whitelist</name>
  <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
Format the HDFS NameNode before the first start:
hdfs namenode -format
cd hadoop-3.2.3/sbin/
./start-dfs.sh
./start-yarn.sh
jps
The output should list the NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager processes (plus Jps itself).
NameNode web UI: http://localhost:9870
DataNode web UI: http://localhost:9864
YARN ResourceManager web UI: http://localhost:8088
reference link: https://www.youtube.com/watch?v=BHF3rtylfPQ