Configuring Hadoop on Ubuntu Machine

Open the bashrc file

sudo nano ~/.bashrc
# Append the following lines at the end of the file
export HADOOP_PREFIX="/home/hduser/hadoop-2.7.1/"
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
# Reload the file so the new variables take effect in the current shell
source ~/.bashrc
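
To confirm the variables were picked up, a quick sanity check (assuming the paths above) is:

echo $HADOOP_PREFIX
which hadoop    # should point inside /home/hduser/hadoop-2.7.1/bin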

Add the Hadoop HDFS URI (Namenode and its port)

sudo nano /home/hduser/hadoop-2.7.1/etc/hadoop/core-site.xml

<!-- add this property within the <configuration> tag -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://192.168.56.123:8020</value>
        <final>true</final>
    </property>
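
192.168.56.123 is the VM's address used in this guide; substitute your own NameNode IP or hostname if it differs. One way to verify that Hadoop actually reads this setting (once JAVA_HOME is configured in the later step) is the getconf utility, which should echo the URI configured above:

hdfs getconf -confKey fs.defaultFS
# expected: hdfs://192.168.56.123:8020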

Add the HDFS properties

sudo nano /home/hduser/hadoop-2.7.1/etc/hadoop/hdfs-site.xml
<!-- add all of these properties within the <configuration> tag -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>

    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///home/hduser/hadoop-2.7.1/hadoop_data/dfs/name</value>
    </property>
  
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///home/hduser/hadoop-2.7.1/hadoop_data/dfs/data</value>
    </property>
</configuration>
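
The name and data directories referenced above do not exist yet. Hadoop will normally create them on format/startup, but creating them up front (as hduser, so ownership is correct) avoids permission surprises:

mkdir -p /home/hduser/hadoop-2.7.1/hadoop_data/dfs/name
mkdir -p /home/hduser/hadoop-2.7.1/hadoop_data/dfs/data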

Specify the MapReduce framework as YARN

sudo nano /home/hduser/hadoop-2.7.1/etc/hadoop/mapred-site.xml
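
If the editor opens an empty file, that is expected: Hadoop 2.7.1 ships only a template for this file, so copy the template first and then re-open it:

cp /home/hduser/hadoop-2.7.1/etc/hadoop/mapred-site.xml.template /home/hduser/hadoop-2.7.1/etc/hadoop/mapred-site.xml
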
<!-- add this property within the <configuration> tag -->

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Specify the YARN properties

sudo nano /home/hduser/hadoop-2.7.1/etc/hadoop/yarn-site.xml
<!-- add all of these properties within the <configuration> tag -->
<configuration>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>192.168.56.123:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>192.168.56.123:8030</value>
    </property>

    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>192.168.56.123:8031</value>
    </property>

    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>192.168.56.123:8033</value>
    </property>

    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>192.168.56.123:8088</value>
    </property>

    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>

    <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
    </property>
</configuration>
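
All of the addresses above reuse the same host (192.168.56.123); adjust them if your ResourceManager runs on a different machine. Once the daemons are up (see the start commands below), the ResourceManager web UI should answer on the yarn.resourcemanager.webapp.address configured here, e.g.:

curl -s http://192.168.56.123:8088/cluster | head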

Add the JAVA_HOME for Hadoop

# First check whether Java is installed and where JAVA_HOME currently points
echo $JAVA_HOME
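
If that prints nothing, one common way to locate the JDK on Ubuntu (assuming an OpenJDK package install) is:

readlink -f $(which java)
# prints something like /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
# JAVA_HOME is the directory above bin/ (or jre/bin/), e.g. /usr/lib/jvm/java-7-openjdk-amd64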
sudo nano /home/hduser/hadoop-2.7.1/etc/hadoop/hadoop-env.sh

# Add (or update) this line, replacing the placeholder with your JDK path
export JAVA_HOME=<path to your JDK installation>

Format the NameNode

hdfs namenode -format
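
This only needs to be done once, before the first start; re-formatting later wipes the HDFS metadata. If the format succeeded, the directory configured in dfs.namenode.name.dir should now contain a current/ subdirectory:

ls /home/hduser/hadoop-2.7.1/hadoop_data/dfs/name/current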

Now start the services

NameNode              hadoop-daemon.sh start namenode
DataNode              hadoop-daemon.sh start datanode
ResourceManager       yarn-daemon.sh start resourcemanager
NodeManager           yarn-daemon.sh start nodemanager
Job History Server    mr-jobhistory-daemon.sh start historyserver
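
A quick way to confirm that all five daemons are running is the jps tool that ships with the JDK; it should list a Java process for each of them:

jps
# expected (PIDs will differ): NameNode, DataNode, ResourceManager, NodeManager, JobHistoryServer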

Check the version of Hadoop

hadoop version
