Hadoop Installation and Development Guide
Download the latest Hadoop release from http://hadoop.apache.org/releases.html. The steps below use 3.0.0 as an example.
Upload the downloaded hadoop-3.0.0.tar.gz to any path on the Linux host (creating a dedicated user is omitted here; handle it as needed) and extract it.
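For example, assuming the tarball was copied to the home directory of user zxh (the path is illustrative):
cd ~
tar -zxvf hadoop-3.0.0.tar.gz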
Edit ~/hadoop-3.0.0/etc/hadoop/core-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <!-- change to match your own host/IP -->
    <value>hdfs://192.168.1.129:9000</value>
  </property>
</configuration>
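Optional: by default HDFS stores its metadata and block data under hadoop.tmp.dir, which resolves to /tmp/hadoop-<username> and may be wiped on reboot. To keep the data somewhere persistent, you can add another property inside the <configuration> element above; the path shown here (/home/zxh/hadoop-data) is just an example.
  <property>
    <name>hadoop.tmp.dir</name>
    <!-- example path; choose a directory that survives reboots -->
    <value>/home/zxh/hadoop-data</value>
  </property>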
Edit ~/hadoop-3.0.0/etc/hadoop/hdfs-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <!-- replication factor; adjust as needed -->
    <value>1</value>
  </property>
</configuration>
Add the following to ~/hadoop-3.0.0/etc/hadoop/hadoop-env.sh:
export JAVA_HOME="/home/zxh/jdk1.8.0_162"
Add the following environment variables to ~/.bashrc (adjust the values to your environment):
export JAVA_HOME=/home/zxh/jdk1.8.0_162
export PATH=${JAVA_HOME}/bin:${PATH}
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH
export HADOOP_HOME=/home/zxh/hadoop-3.0.0
export HADOOP_CLASSPATH=$JAVA_HOME/lib/tools.jar
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
Run source ~/.bashrc to make the changes take effect.
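As an optional sanity check, running the hadoop command by its full path should print the version information (e.g. Hadoop 3.0.0):
~/hadoop-3.0.0/bin/hadoop version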
Format HDFS:
~/hadoop-3.0.0/bin/hdfs namenode -format
Start the NameNode and DataNode:
~/hadoop-3.0.0/sbin/start-dfs.sh
After startup completes, check the Java processes:
zxh@ubuntu:~$ ps -ef | grep java
zxh 5368 1 0 17:41 ? 00:00:32 /home/zxh/jdk1.8.0_162/bin/java -Dproc_namenode -Djava.net.preferIPv4Stack=true -Dhdfs.audit.logger=INFO,NullAppender -Dhadoop.security.logger=INFO,RFAS -Dyarn.log.dir=/home/zxh/hadoop-3.0.0/logs -Dyarn.log.file=hadoop-zxh-namenode-ubuntu.log -Dyarn.home.dir=/home/zxh/hadoop-3.0.0 -Dyarn.root.logger=INFO,console -Djava.library.path=/home/zxh/hadoop-3.0.0/lib/native -Dhadoop.log.dir=/home/zxh/hadoop-3.0.0/logs -Dhadoop.log.file=hadoop-zxh-namenode-ubuntu.log -Dhadoop.home.dir=/home/zxh/hadoop-3.0.0 -Dhadoop.id.str=zxh -Dhadoop.root.logger=INFO,RFA -Dhadoop.policy.file=hadoop-policy.xml org.apache.hadoop.hdfs.server.namenode.NameNode
zxh 5504 1 0 17:41 ? 00:00:35 /home/zxh/jdk1.8.0_162/bin/java -Dproc_datanode -Djava.net.preferIPv4Stack=true -Dhadoop.security.logger=ERROR,RFAS -Dyarn.log.dir=/home/zxh/hadoop-3.0.0/logs -Dyarn.log.file=hadoop-zxh-datanode-ubuntu.log -Dyarn.home.dir=/home/zxh/hadoop-3.0.0 -Dyarn.root.logger=INFO,console -Djava.library.path=/home/zxh/hadoop-3.0.0/lib/native -Dhadoop.log.dir=/home/zxh/hadoop-3.0.0/logs -Dhadoop.log.file=hadoop-zxh-datanode-ubuntu.log -Dhadoop.home.dir=/home/zxh/hadoop-3.0.0 -Dhadoop.id.str=zxh -Dhadoop.root.logger=INFO,RFA -Dhadoop.policy.file=hadoop-policy.xml org.apache.hadoop.hdfs.server.datanode.DataNode
zxh 5727 1 0 17:41 ? 00:00:17 /home/zxh/jdk1.8.0_162/bin/java -Dproc_secondarynamenode -Djava.net.preferIPv4Stack=true -Dhdfs.audit.logger=INFO,NullAppender -Dhadoop.security.logger=INFO,RFAS -Dyarn.log.dir=/home/zxh/hadoop-3.0.0/logs -Dyarn.log.file=hadoop-zxh-secondarynamenode-ubuntu.log -Dyarn.home.dir=/home/zxh/hadoop-3.0.0 -Dyarn.root.logger=INFO,console -Djava.library.path=/home/zxh/hadoop-3.0.0/lib/native -Dhadoop.log.dir=/home/zxh/hadoop-3.0.0/logs -Dhadoop.log.file=hadoop-zxh-secondarynamenode-ubuntu.log -Dhadoop.home.dir=/home/zxh/hadoop-3.0.0 -Dhadoop.id.str=zxh -Dhadoop.root.logger=INFO,RFA -Dhadoop.policy.file=hadoop-policy.xml org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
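Alternatively, the JDK's jps tool gives a shorter listing of the same daemons:
# should list NameNode, DataNode and SecondaryNameNode (plus Jps itself)
jps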
Check the listening ports:
zxh@ubuntu:~$ netstat -anop | grep java | grep LISTEN
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:9864 0.0.0.0:* LISTEN 5504/java off (0.00/0/0)
tcp 0 0 192.168.1.129:9000 0.0.0.0:* LISTEN 5368/java off (0.00/0/0)
tcp 0 0 0.0.0.0:9866 0.0.0.0:* LISTEN 5504/java off (0.00/0/0)
tcp 0 0 0.0.0.0:9867 0.0.0.0:* LISTEN 5504/java off (0.00/0/0)
tcp 0 0 0.0.0.0:9868 0.0.0.0:* LISTEN 5727/java off (0.00/0/0)
tcp 0 0 0.0.0.0:9870 0.0.0.0:* LISTEN 5368/java off (0.00/0/0)
tcp 0 0 127.0.0.1:19061 0.0.0.0:* LISTEN 5504/java off (0.00/0/0)
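To confirm that the DataNode has registered with the NameNode, you can also ask HDFS for a cluster report (optional):
~/hadoop-3.0.0/bin/hdfs dfsadmin -report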
Open a browser and go to http://192.168.1.129:9870 to view the status of the NameNode and DataNodes.
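A minimal HDFS smoke test, assuming the example paths shown here (the /user/zxh directory is illustrative):
# create a home directory in HDFS, upload a local file, then list it
~/hadoop-3.0.0/bin/hdfs dfs -mkdir -p /user/zxh
~/hadoop-3.0.0/bin/hdfs dfs -put ~/.bashrc /user/zxh/
~/hadoop-3.0.0/bin/hdfs dfs -ls /user/zxh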
More content to be added later.