Hadoop - RealWorld-Yuchen-Yang/Notes GitHub Wiki

Installation (Mac)

  • brew install hadoop, installs Hadoop
  • hadoop version, verifies the installation

Configuration

  • config file location for a brew-installed Hadoop: /usr/local/Cellar/hadoop/X.X.X/libexec/etc/hadoop

    • core-site.xml, common properties
    • hdfs-site.xml, HDFS properties
    • mapred-site.xml, MapReduce properties
    • yarn-site.xml, YARN properties
  • customizing config files

    • set the environment variable HADOOP_CONF_DIR to the config file location and export it, either in the current shell or in .bash_profile
    • provide the four config files listed above in that directory
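
As a minimal sketch of the customization step, the lines below export HADOOP_CONF_DIR and seed the directory with the default files. The directory name $HOME/hadoop-conf is a hypothetical choice, and the default path assumes version 2.7.2 (the version used elsewhere in these notes). The export line belongs in .bash_profile; the copy step only needs to run once.

```shell
# Hypothetical directory that holds the customized config files
export HADOOP_CONF_DIR="$HOME/hadoop-conf"

# Default config directory of a brew install (assumes version 2.7.2)
HADOOP_DEFAULT_CONF="/usr/local/Cellar/hadoop/2.7.2/libexec/etc/hadoop"

# Seed the custom directory with the default files, never overwriting existing copies
if [ -d "$HADOOP_DEFAULT_CONF" ]; then
  mkdir -p "$HADOOP_CONF_DIR"
  for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
    if [ -f "$HADOOP_DEFAULT_CONF/$f" ] && [ ! -e "$HADOOP_CONF_DIR/$f" ]; then
      cp "$HADOOP_DEFAULT_CONF/$f" "$HADOOP_CONF_DIR/$f"
    fi
  done
fi
```

Files missing from the default directory are skipped, so they have to be created by hand in $HADOOP_CONF_DIR.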

Format the HDFS before running Hadoop

hdfs namenode -format

sbin directory: /usr/local/Cellar/hadoop/2.7.2/sbin. The bin directory is already on the PATH after a brew install.

(Note: run the following scripts before logging in to HDFS; adding sbin to the PATH environment variable is also convenient.)
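
Adding sbin to the PATH could look like the line below in .bash_profile. This is a sketch that assumes the 2.7.2 sbin path given above; adjust the version to the installed one.

```shell
# Append the Hadoop sbin directory so start-dfs.sh, start-yarn.sh, etc. resolve by name
export PATH="$PATH:/usr/local/Cellar/hadoop/2.7.2/sbin"
```

After reloading the profile (source ~/.bash_profile), the scripts can be called without their full path.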

  • start/stop services individually

    • start-dfs.sh
    • start-yarn.sh
    • mr-jobhistory-daemon.sh, mr-jobhistory-daemon.sh [start|stop] historyserver
    • stop-dfs.sh
    • stop-yarn.sh
  • start/stop services in groups by declaring alias in .bash_profile

    • start Hadoop services, alias hstart="start-dfs.sh;start-yarn.sh;mr-jobhistory-daemon.sh start historyserver"
    • stop Hadoop services, alias hstop="stop-yarn.sh;stop-dfs.sh;mr-jobhistory-daemon.sh stop historyserver"

Check the running Java processes with the command: jps
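
The jps check can be scripted. The helper below is a hypothetical sketch (missing_daemons is not part of Hadoop) that reads jps output on stdin and prints the daemons that are absent. The daemon list assumes a default single-node setup with HDFS, YARN and the job history server all started.

```shell
# Print every expected daemon that does not appear in the jps output read from stdin
missing_daemons() {
  jps_output=$(cat)
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager JobHistoryServer; do
    # -w matches whole words, so "NameNode" does not match "SecondaryNameNode"
    if ! printf '%s\n' "$jps_output" | grep -qw "$daemon"; then
      printf '%s\n' "$daemon"
    fi
  done
}

# Usage: jps | missing_daemons
```

Empty output means all six daemons were found.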

Pseudo Distributed Mode

Note: either the Web UI or the log directory (/usr/local/Cellar/hadoop/2.7.2/libexec/logs in the Hadoop installation directory) can be used to verify the running status
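
A rough sketch of the Web UI check from the command line. It assumes the Hadoop 2.x default ports (50070 for the NameNode, 8088 for the ResourceManager, 19888 for the job history server) have not been changed in the config files; check_ui is a hypothetical helper name.

```shell
# $1 = label, $2 = URL; reports whether the page answers within 5 seconds
check_ui() {
  if curl -s -o /dev/null --max-time 5 "$2"; then
    echo "$1: up ($2)"
  else
    echo "$1: not reachable ($2)"
  fi
}

check_ui "NameNode" "http://localhost:50070"
check_ui "ResourceManager" "http://localhost:8088"
check_ui "JobHistoryServer" "http://localhost:19888"

# Log check (path from the note above):
# tail -n 20 /usr/local/Cellar/hadoop/2.7.2/libexec/logs/*.log
```

A "not reachable" line usually means the daemon is not running or listens on a different port; the matching log file is the next place to look.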

Useful External Links