
Compiling Hadoop 2.x on CentOS 6.5, Installing It, and Running WordCount

2018-03-06 17:10
Download the Maven package:

http://mirror.bit.edu.cn/apache/maven/maven-3/3.0.5/binaries/

Extract it:

tar -zxvf apache-maven-3.0.5-bin.tar.gz

Set the Maven environment variables (e.g. in /etc/profile):

export MAVEN_HOME=/app/lib/apache-maven-3.0.5
export PATH=$PATH:$MAVEN_HOME/bin
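After editing /etc/profile, the setup can be sanity-checked in the current shell. A minimal sketch, assuming Maven was extracted to /app/lib/apache-maven-3.0.5 as above:

```shell
# Assumes Maven was extracted to /app/lib/apache-maven-3.0.5 (adjust to your path).
export MAVEN_HOME=/app/lib/apache-maven-3.0.5
export PATH=$PATH:$MAVEN_HOME/bin

# Sanity check: confirm the bin directory is now on the PATH.
case ":$PATH:" in
  *":$MAVEN_HOME/bin:"*) echo "maven bin is on PATH" ;;
  *)                     echo "maven bin is NOT on PATH" ;;
esac
```

Once it is on the PATH, `mvn -version` should report Maven 3.0.5.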


Install Subversion

yum install svn

Install autoconf, automake, libtool, and cmake

yum install autoconf automake libtool cmake

Install ncurses-devel

yum install ncurses-devel

Install openssl-devel

yum install openssl-devel

Install GCC

yum install gcc*

Install and set up protobuf (Hadoop 2.2.0 requires protobuf 2.5.0; the build fails with other versions)

Download: http://pan.baidu.com/s/1pJlZubT

./configure
make
make check
make install

# verify the installation
protoc
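A defensive version of the check, which also confirms the version that the Hadoop 2.2.0 build requires:

```shell
# Hadoop 2.2.0 requires exactly protobuf 2.5.0.
if command -v protoc >/dev/null 2>&1; then
  protoc --version          # should print: libprotoc 2.5.0
else
  echo "protoc not found: re-run 'make install' and check the library path"
fi
```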


Check out the Hadoop 2.2.0 source

svn checkout http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.2.0

Build the distribution

mvn package -Pdist,native -DskipTests -Dtar

Verify that the native library is 64-bit:

file ./libhadoop.so.1.0.0
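The check can be scripted against the full path inside the build tree. A sketch, assuming it is run from the source root (the path below is where the 2.2.0 build normally places the native library):

```shell
# Assumed path within the build tree; adjust if your build output differs.
LIB=hadoop-dist/target/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0
if [ -f "$LIB" ]; then
  file "$LIB"   # a successful 64-bit build reports "ELF 64-bit ... x86-64"
else
  echo "native library not found; re-check the mvn build output"
fi
```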

Install Hadoop 2.x

Extract the built tarball and create the working directories:
mkdir tmp
mkdir hdfs
mkdir hdfs/name
mkdir hdfs/data
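The four mkdir calls above can be collapsed with `mkdir -p`. A sketch using a stand-in directory (this guide's real location is /app/hadoop-2.2.0):

```shell
# Stand-in for the real install dir (/app/hadoop-2.2.0 in this guide).
HADOOP_DIR=${HADOOP_DIR:-/tmp/hadoop-2.2.0-demo}
mkdir -p "$HADOOP_DIR/tmp" "$HADOOP_DIR/hdfs/name" "$HADOOP_DIR/hdfs/data"
ls "$HADOOP_DIR/hdfs"
```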


Configure hadoop-env.sh:

export HADOOP_CONF_DIR=/app/hadoop-2.2.0/etc/hadoop
export HADOOP_HOME=/app/hadoop-2.2.0
export PATH=$PATH:$HADOOP_HOME/bin
export JAVA_HOME=/app/lib/jdk-8


Configure /etc/profile:

export HADOOP_HOME=/app/hadoop-2.2.0
export PATH=$PATH:$HADOOP_HOME/bin


Configure yarn-env.sh:

export JAVA_HOME=/app/lib/jdk-8


Configure core-site.xml:

<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/app/hadoop-2.2.0/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>hadoop.proxyuser.hduser.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hduser.groups</name>
<value>*</value>
</property>
</configuration>


Configure hdfs-site.xml:

<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/app/hadoop-2.2.0/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/app/hadoop-2.2.0/hdfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>


Configure mapred-site.xml (copy it from mapred-site.xml.template if it does not exist):

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop:19888</value>
</property>
</configuration>


Configure yarn-site.xml:

<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoop:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoop:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoop:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>hadoop:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoop:8088</value>
</property>
</configuration>


Configure slaves:

vim slaves
hadoop


Format the NameNode and start the daemons

./bin/hdfs namenode -format   # run from $HADOOP_HOME; `hadoop namenode -format` is deprecated in 2.x
./sbin/start-dfs.sh
./sbin/start-yarn.sh

jps should show:
NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager
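A small loop can confirm all five daemons came up. A sketch, assuming the JDK's `jps` tool is on the PATH:

```shell
# Check each expected daemon in the jps listing; prints "missing" when jps is
# absent or the daemon is not running.
for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
  if command -v jps >/dev/null 2>&1 && jps 2>/dev/null | grep -q "$d"; then
    echo "$d: running"
  else
    echo "$d: missing"
  fi
done
```

If a daemon is missing, check its log file under $HADOOP_HOME/logs.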


Test with WordCount

Create an input directory in HDFS:

hadoop fs -mkdir -p /class3/input

Stage some test data (here, the Hadoop config files themselves):

hadoop fs -copyFromLocal ../etc/hadoop/* /class3/input

Run WordCount:

hadoop jar ../share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /class3/input /class3/output

Inspect the results:

hadoop fs -cat /class3/output/part-r-00000 | less
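To keep a copy of the result outside HDFS, the output file can also be pulled to the local filesystem. A sketch, assuming the job above completed and `hadoop` is on the PATH:

```shell
# Guarded fetch: copies the WordCount result locally when hadoop is available.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -get /class3/output/part-r-00000 ./wordcount-result.txt
  head ./wordcount-result.txt
else
  echo "hadoop not on PATH; source /etc/profile first"
fi
```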