Setting up a fully distributed Hadoop 2.7.2 cluster on CentOS 7

2016-08-02 14:09

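I previously installed Hadoop 2.7.2 on CentOS 6.8, but got the following warning:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable.

This is attributed to a missing hadoop-native-64-2.7.0.tar, but the warning persisted anyway, so I switched to CentOS 7.2 for the installation. That brought pitfalls of its own; see my post "First impressions of CentOS 7".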

Create the directory /usr/apache to hold the Hadoop-related software, which keeps everything in one place.
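A minimal sketch of this step, assuming a root shell. The helper name unpack_into and the archive file name in the commented example are assumptions for illustration; the helper creates the target directory if needed and unpacks a downloaded .tar.gz archive into it.

```shell
# Unpack a .tar.gz archive into a target directory, creating the
# directory first if it does not exist yet.
# unpack_into is a hypothetical helper name, not part of any tool.
unpack_into() {
    archive=$1
    target=$2
    mkdir -p "$target" || return 1
    tar -xzf "$archive" -C "$target"
}

# Example call (commented out; the archive name is an assumption):
# unpack_into jdk-8u101-linux-x64.tar.gz /usr/apache
```

The same helper can be reused for the Hadoop archive, so that both end up side by side under /usr/apache.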

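JDK installation:

Download JDK 1.8 from the official site (Hadoop 2.7 requires JDK 1.7 or later; to avoid problems I used the latest JDK version). Extract it and move it to the /usr/apache directory. Then configure the environment variables: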

vi /etc/profile

Add the following:

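#java
export JAVA_HOME=/usr/apache/jdk1.8.0_101
# JRE_HOME is used in the two lines below, so it has to be defined first
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin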


Then run source /etc/profile, and use java -version to check that Java is installed correctly.
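As an alternative to editing /etc/profile by hand in vi, the lines can be appended from a script. This is a minimal sketch, assuming a hypothetical helper named append_once; it adds a line only when that exact line is not already in the file, so running it twice does not duplicate entries.

```shell
# Append a line to a file only if that exact line is not present yet.
# append_once is a hypothetical helper name used for illustration.
append_once() {
    line=$1
    file=$2
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (commented out; writing /etc/profile needs root):
# append_once 'export JAVA_HOME=/usr/apache/jdk1.8.0_101' /etc/profile
```

Single quotes around the line keep references such as $PATH unexpanded, so they are written to the file literally and evaluated when the profile is sourced.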

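Passwordless SSH configuration

For passwordless SSH configuration, see http://my.oschina.net/u/189445/blog/503525

You may get the error: -bash: ssh: command not found

Cause: this happens on a minimal CentOS installation, where the SSH client is not installed.

Solution: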

yum -y install openssh-clients
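For readers who cannot reach the linked post, a common way to set up key-based login is sketched below. This is not taken from that post; it is a generic approach using ssh-keygen and ssh-copy-id, and the function name setup_passwordless_ssh is hypothetical. The host names in the usage note are assumed from the startup logs later in this article.

```shell
# Generate an RSA key pair without a passphrase (only if none exists)
# and install the public key on every host passed as an argument.
# setup_passwordless_ssh is a hypothetical helper name.
setup_passwordless_ssh() {
    key="$HOME/.ssh/id_rsa"
    if [ ! -f "$key" ]; then
        mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
        ssh-keygen -t rsa -N "" -f "$key" || return 1
    fi
    for host in "$@"; do
        # Asks for the remote password once, then appends the public key
        # to ~/.ssh/authorized_keys on that host.
        ssh-copy-id -i "$key.pub" "$host" || return 1
    done
}
```

On the master this would be called as setup_passwordless_ssh master slave1 slave2; including master itself matters because the start scripts also log in to the local host over SSH.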

Hadoop installation

Set the environment variables:

vi /etc/profile

#hadoop
export HADOOP_HOME=/usr/apache/hadoop-2.7.2
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
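After sourcing the profile, it is worth checking that both variables point at real installations before going further. A minimal sketch, assuming a hypothetical helper named has_tool that succeeds only when the given directory contains an executable bin/<tool>:

```shell
# Succeed if <home_dir>/bin/<tool> exists and is executable.
# has_tool is a hypothetical helper name used for illustration.
has_tool() {
    home_dir=$1
    tool=$2
    [ -n "$home_dir" ] && [ -x "$home_dir/bin/$tool" ]
}

# Example (commented out):
# has_tool "$JAVA_HOME" java     || echo "JAVA_HOME is not set correctly"
# has_tool "$HADOOP_HOME" hadoop || echo "HADOOP_HOME is not set correctly"
```

An empty variable is rejected explicitly, so a forgotten source /etc/profile shows up as a failed check.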


Editing the Hadoop configuration files

The Hadoop 2.x configuration files are located under hadoop-2.7.2/etc/hadoop/:

Configure hadoop-env.sh and yarn-env.sh

# The java implementation to use.
export JAVA_HOME=/usr/apache/jdk1.8.0_101
export HADOOP_CONF_DIR=/usr/apache/hadoop-2.7.2/etc/hadoop/


The trailing / in HADOOP_CONF_DIR is required; without it the following error is reported:

master: Error: Cannot find configuration directory: /etc/hadoop

In yarn-env.sh, only the JAVA_HOME setting needs to be added.
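The JAVA_HOME edit in both scripts can also be applied from the command line. This is a minimal sketch, assuming a hypothetical helper named set_java_home: it rewrites an existing uncommented export JAVA_HOME= line, or appends one if the file has none. It assumes the JDK path contains no | or & characters, since those would be interpreted by sed.

```shell
# Point an env script (hadoop-env.sh, yarn-env.sh) at a fixed JAVA_HOME.
# set_java_home is a hypothetical helper name used for illustration.
set_java_home() {
    env_file=$1
    java_home=$2
    if grep -q '^export JAVA_HOME=' "$env_file"; then
        # Rewrite the existing line; write back with cat so the file
        # keeps its original permissions.
        sed "s|^export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$env_file" > "$env_file.tmp" &&
            cat "$env_file.tmp" > "$env_file" &&
            rm -f "$env_file.tmp"
    else
        # No uncommented setting yet: append one at the end.
        printf 'export JAVA_HOME=%s\n' "$java_home" >> "$env_file"
    fi
}

# Example (commented out):
# cd /usr/apache/hadoop-2.7.2/etc/hadoop
# set_java_home hadoop-env.sh /usr/apache/jdk1.8.0_101
# set_java_home yarn-env.sh   /usr/apache/jdk1.8.0_101
```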

core-site.xml configuration

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/apache/hadoop-2.7.2/tmp</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
</configuration>


hdfs-site.xml configuration

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/apache/hadoop-2.7.2/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/apache/hadoop-2.7.2/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:9001</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
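The two files above name three local directories: tmp, dfs/name and dfs/data. Hadoop usually creates them itself, so the following is optional; creating them up front mainly makes permission problems visible early. A minimal sketch, assuming a hypothetical helper named make_hadoop_dirs:

```shell
# Create the local directories referenced in core-site.xml and
# hdfs-site.xml below a given Hadoop install root.
# make_hadoop_dirs is a hypothetical helper name.
make_hadoop_dirs() {
    hadoop_root=$1
    mkdir -p "$hadoop_root/tmp" "$hadoop_root/dfs/name" "$hadoop_root/dfs/data"
}

# Example (commented out):
# make_hadoop_dirs /usr/apache/hadoop-2.7.2
```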


mapred-site.xml configuration (the file has to be copied from mapred-site.xml.template first)

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
</configuration>
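The copy from the template mentioned above can be sketched as follows. The helper name ensure_mapred_site is hypothetical; it copies the template only when mapred-site.xml does not exist yet, so an already edited file is never overwritten.

```shell
# Create mapred-site.xml from the shipped template unless it exists.
# ensure_mapred_site is a hypothetical helper name.
ensure_mapred_site() {
    conf_dir=$1
    if [ ! -f "$conf_dir/mapred-site.xml" ]; then
        cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
    fi
}

# Example (commented out):
# ensure_mapred_site /usr/apache/hadoop-2.7.2/etc/hadoop
```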


yarn-site.xml configuration

<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>768</value>
  </property>
</configuration>
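This article does not show how the worker nodes are declared or how the configuration reaches them. The startup output further down lists DataNodes on slave1 and slave2, so the sketch below assumes those two hosts go into the slaves file in the configuration directory (the file Hadoop 2.x start scripts read) and that the same install path exists on each worker. The helper names write_slaves and push_conf are hypothetical.

```shell
# Write one worker hostname per line into <conf_dir>/slaves.
# write_slaves is a hypothetical helper name.
write_slaves() {
    conf_dir=$1
    shift
    printf '%s\n' "$@" > "$conf_dir/slaves"
}

# Copy the configuration directory to the same location on each host.
# push_conf is a hypothetical helper name; it assumes the parent
# directory already exists on the remote side.
push_conf() {
    conf_dir=$1
    shift
    for host in "$@"; do
        scp -r "$conf_dir" "$host:$(dirname "$conf_dir")/" || return 1
    done
}

# Example (commented out; host names assumed from the startup logs):
# write_slaves /usr/apache/hadoop-2.7.2/etc/hadoop slave1 slave2
# push_conf    /usr/apache/hadoop-2.7.2/etc/hadoop slave1 slave2
```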


Format the NameNode

The command is hdfs namenode -format; the hdfs executable is in hadoop-2.7.2/bin:

INFO common.Storage: Storage directory /usr/apache/hadoop-2.7.2/dfs/name has been successfully formatted.
INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
INFO util.ExitUtil: Exiting with status 0


The output above indicates that formatting succeeded.
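Formatting should happen only once, because re-formatting a NameNode discards the existing filesystem metadata. A formatted NameNode directory contains a current/VERSION file, which allows a simple guard. A minimal sketch, assuming a hypothetical helper named needs_format:

```shell
# Succeed if the NameNode directory has not been formatted yet,
# i.e. it contains no current/VERSION file.
# needs_format is a hypothetical helper name.
needs_format() {
    name_dir=$1
    [ ! -f "$name_dir/current/VERSION" ]
}

# Example (commented out):
# needs_format /usr/apache/hadoop-2.7.2/dfs/name && hdfs namenode -format
```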

Start HDFS

The startup scripts are in hadoop-2.7.2/sbin:

First start HDFS: start-dfs.sh

master: starting namenode, logging to /usr/apache/hadoop-2.7.2/logs/hadoop-root-namenode-master.out
slave1: starting datanode, logging to /usr/apache/hadoop-2.7.2/logs/hadoop-root-datanode-slave1.out
slave2: starting datanode, logging to /usr/apache/hadoop-2.7.2/logs/hadoop-root-datanode-slave2.out
Starting secondary namenodes [master]
master: starting secondarynamenode, logging to /usr/apache/hadoop-2.7.2/logs/hadoop-root-secondarynamenode-master.out


Then start YARN: start-yarn.sh

starting yarn daemons
starting resourcemanager, logging to /usr/apache/hadoop-2.7.2/logs/yarn-root-resourcemanager-master.out
slave1: starting nodemanager, logging to /usr/apache/hadoop-2.7.2/logs/yarn-root-nodemanager-slave1.out
slave2: starting nodemanager, logging to /usr/apache/hadoop-2.7.2/logs/yarn-root-nodemanager-slave2.out


Use the jps command to check the processes on each node:

On master:

3458 ResourceManager

3299 SecondaryNameNode

3527 Jps

3115 NameNode

On slave1:

2852 Jps

2646 DataNode

On slave2:

9620 Jps

9414 DataNode
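Comparing jps listings by eye is error-prone, especially since NameNode is a substring of SecondaryNameNode. A minimal sketch of an automated check, assuming a hypothetical helper named missing_daemons: it reads jps output on standard input and prints every expected daemon name that does not appear as an exact match in the second column.

```shell
# Print each expected daemon name that is absent from the jps output
# read on stdin. Names are compared exactly against the second column.
# missing_daemons is a hypothetical helper name.
missing_daemons() {
    jps_output=$(cat)
    for name in "$@"; do
        if ! printf '%s\n' "$jps_output" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            printf '%s\n' "$name"
        fi
    done
}

# Example (commented out):
# jps | missing_daemons NameNode SecondaryNameNode ResourceManager
# ssh slave1 jps | missing_daemons DataNode
```

No output means every listed daemon is running; which daemons to expect on which node is up to the caller.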

At this point, the Hadoop cluster setup is complete.