DayDayUP Big Data Learning Course [1]: Building a Hadoop 2.6.0 Fully Distributed Cluster and a Pseudo-Distributed Cluster
2015-11-08 10:17
1. Environment
OS: CentOS 6.5. Software: Hadoop 2.6 (the commands below use the 2.6.2 release), JDK 1.8 (1.8.0_65).
Cluster layout:
master: www 192.168.78.110
slave1: node1 192.168.78.111
slave2: node2 192.168.78.112
/etc/hosts file (identical on all three machines):
192.168.78.110 www
192.168.78.111 node1
192.168.78.112 node2
Make sure each of the three machines can ping the other two by hostname.
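A quick way to verify connectivity from each node, using the hostnames in the /etc/hosts above (a minimal sketch):
[root@www ~]# for h in www node1 node2; do ping -c 1 "$h" > /dev/null 2>&1 && echo "$h ok" || echo "$h UNREACHABLE"; done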
2. Download Hadoop 2.6.2 and JDK 1.8
[root@www ~]# wget http://download.oracle.com/otn-pub/java/jdk/8u65-b17/jdk-8u65-linux-x64.rpm?AuthParam=1446899640_8da8d9b13f8bbe63b3bc0bc80b730f55
# after the download, rename the file to strip the ?AuthParam=... junk so the name ends in .rpm
[root@www ~]# wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.6.2/hadoop-2.6.2.tar.gz
3. Configure the Java environment
3.1 Install the JDK
[root@www ~]# rpm -ivh jdk-8u65-linux-x64.rpm
3.2 Configure the Java environment variables
[root@www ~]# vimx /etc/profile
# set java environment
export JAVA_HOME=/usr/java/jdk1.8.0_65    # adjust the path if you installed a different JDK version
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
export JAVA_HOME CLASSPATH PATH
[root@www ~]# source !$
3.3 Test the Java environment
[root@www ~]# java -version
java version "1.8.0_65"
Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)
[root@www ~]# javac -version
javac 1.8.0_65
4. Install Hadoop
4.1 Unpack and install
[root@www opt]# tar -xzvf hadoop-2.6.2.tar.gz
[root@www opt]# mkdir /opt/hadoop
[root@www opt]# mv hadoop-2.6.2 /opt/hadoop
[root@www opt]# cd /opt/hadoop/hadoop-2.6.2
[root@www hadoop-2.6.2]# ls
bin etc include lib libexec LICENSE.txt NOTICE.txt README.txt sbin share
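As an optional sanity check, the unpacked tree can already report its own version (it only needs the JAVA_HOME set in /etc/profile above); the first output line should name Hadoop 2.6.2:
[root@www hadoop-2.6.2]# bin/hadoop version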
4.2 Add a hadoop user
[root@www hadoop-2.6.2]# useradd hadoop
[root@www hadoop-2.6.2]# passwd hadoop
[root@www hadoop-2.6.2]# chown -R hadoop:hadoop /opt/hadoop
4.3 Edit the Hadoop configuration files
[root@www hadoop-2.6.2]# su - hadoop    # switch to the hadoop user
[hadoop@www ~]$ mkdir -p ~/hadoop/tmp ~/dfs/data ~/dfs/name    # these directories are referenced by the configs below
[hadoop@www ~]$ ls
dfs hadoop
[hadoop@www ~]$ cd /opt/hadoop/hadoop-2.6.2/
4.3.1 Configure hadoop-env.sh --> set JAVA_HOME
[hadoop@www hadoop-2.6.2]$ vimx etc/hadoop/hadoop-env.sh
# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_65
4.3.2 Configure yarn-env.sh --> set JAVA_HOME
[hadoop@www hadoop-2.6.2]$ vimx etc/hadoop/yarn-env.sh
# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_65
4.3.3 Configure the slaves file --> add the slave nodes
[hadoop@www hadoop-2.6.2]$ vimx etc/hadoop/slaves
node1
node2
4.3.4 Configure core-site.xml --> add the core Hadoop settings (HDFS service on port 9000, hadoop.tmp.dir, etc.)
[hadoop@www hadoop-2.6.2]$ vimx etc/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://www:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>hadoop.proxyuser.spark.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.spark.groups</name>
    <value>*</value>
  </property>
</configuration>
4.3.5 Configure hdfs-site.xml --> add the HDFS settings (namenode/datanode addresses and directories; replication is 2 to match the two datanodes)
[hadoop@www hadoop-2.6.2]$ vimx etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>www:9001</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file:///home/hadoop/hadoop/hdfs/namesecondary</value>
  </property>
</configuration>
4.3.6 Configure mapred-site.xml --> add the MapReduce settings (run on the YARN framework; jobhistory address and its web UI address)
[hadoop@www hadoop-2.6.2]$ cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
[hadoop@www hadoop-2.6.2]$ vimx etc/hadoop/mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>www:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>www:19888</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.staging.root.dir</name>
    <value>/home/hadoop/hadoop</value>
  </property>
</configuration>
4.3.7 Configure yarn-site.xml --> enable YARN
[hadoop@www hadoop-2.6.2]$ vimx etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>www:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>www:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>www:8035</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>www:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>www:8088</value>
  </property>
</configuration>
4.3.8 Copy everything (the hadoop-2.6.2 directory and /etc/hosts) to node1 and node2, as sketched below.
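One way to do the copy is with scp (a sketch; it assumes the hadoop user was created on node1 and node2 as in 4.2, and that you can log in as root on the slaves; you will be prompted for passwords since key-based SSH is only set up in 4.4.1):
[root@www ~]# scp -r /opt/hadoop root@node1:/opt/
[root@www ~]# scp -r /opt/hadoop root@node2:/opt/
[root@www ~]# scp /etc/hosts root@node1:/etc/hosts
[root@www ~]# scp /etc/hosts root@node2:/etc/hosts
# then, on each slave, restore ownership and create the working directories as in 4.2/4.3:
[root@node1 ~]# chown -R hadoop:hadoop /opt/hadoop
[root@node1 ~]# su - hadoop -c "mkdir -p ~/hadoop/tmp ~/dfs/data ~/dfs/name"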
4.4.1 Set up passwordless SSH login
Run on each of the three servers (shown here from node2):
[hadoop@node2 ~]$ ssh-keygen -t rsa    # press Enter at each prompt to leave the passphrase empty
[hadoop@node2 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@192.168.78.110
[hadoop@node2 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@192.168.78.111
[hadoop@node2 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@192.168.78.112
4.4.2 Test passwordless SSH login
Run on each of the three servers:
[hadoop@node2 ~]$ ssh www
[hadoop@node2 ~]$ ssh node1
[hadoop@node2 ~]$ ssh node2
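A compact check that all three logins really are password-free (run as the hadoop user on each node; BatchMode makes ssh fail instead of prompting):
[hadoop@www ~]$ for h in www node1 node2; do ssh -o BatchMode=yes "$h" hostname; done
# should print the three hostnames with no password prompt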
5. Verify Hadoop
5.1 Format the namenode (on the master only; the slaves run datanodes and must not be formatted)
[hadoop@www hadoop-2.6.2]$ ./bin/hadoop namenode -format
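If you want a recorded confirmation, the format output can be captured with tee and the success message grepped afterwards (a sketch; /tmp/format.log is an arbitrary scratch path, and the exact wording may differ slightly between Hadoop versions):
[hadoop@www hadoop-2.6.2]$ ./bin/hadoop namenode -format 2>&1 | tee /tmp/format.log
[hadoop@www hadoop-2.6.2]$ grep -i "successfully formatted" /tmp/format.log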
5.2 Start Hadoop
Start all daemons:
[hadoop@www hadoop-2.6.2]$ ./sbin/start-all.sh    # run this once, on the master (www)
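start-all.sh is deprecated in Hadoop 2.x (it simply delegates to the two scripts below), so the equivalent explicit form is:
[hadoop@www hadoop-2.6.2]$ ./sbin/start-dfs.sh
[hadoop@www hadoop-2.6.2]$ ./sbin/start-yarn.sh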
The expected processes:
master:
[hadoop@www hadoop-2.6.2]$ jps
7136 ResourceManager
6993 SecondaryNameNode
6819 NameNode
7399 Jps
slave:
[hadoop@node1 hadoop-2.6.2]$ jps
3186 Jps
3064 NodeManager
2974 DataNode
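Beyond jps, the namenode can confirm that both datanodes registered with HDFS; for this cluster the report should show two live datanodes:
[hadoop@www hadoop-2.6.2]$ ./bin/hdfs dfsadmin -report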
6. Run the wordcount example
6.1 Create a local test directory and file
[hadoop@node1 hadoop-2.6.2]$ mkdir input
[hadoop@node1 hadoop-2.6.2]$ touch input/test.log
[hadoop@node1 hadoop-2.6.2]$ echo "hello world hello hadoop" > input/test.log
[hadoop@node1 hadoop-2.6.2]$ cat input/test.log
hello world hello hadoop
6.2 Create the /input directory in HDFS
[hadoop@node1 hadoop-2.6.2]$ ./bin/hadoop fs -mkdir /input
6.3 Copy test.log into the HDFS /input directory
[hadoop@www hadoop-2.6.2]$ ./bin/hadoop fs -put input/test.log /input
6.4 Check that test.log is now in HDFS
[hadoop@www hadoop-2.6.2]$ ./bin/hadoop fs -ls /input
15/11/08 17:59:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   2 hadoop supergroup         25 2015-11-08 17:59 /input/test.log
6.5 Run the wordcount program
[hadoop@www hadoop-2.6.2]$ ./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar wordcount /input /output
6.6 Check the result
[hadoop@www hadoop-2.6.2]$ ./bin/hadoop fs -cat /output/part-r-00000
15/11/08 18:07:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
hadoop 1
hello 2
world 1
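Listing the output directory should also show the _SUCCESS marker next to the part file. Note that wordcount refuses to start if /output already exists, so remove it before any re-run:
[hadoop@www hadoop-2.6.2]$ ./bin/hadoop fs -ls /output
[hadoop@www hadoop-2.6.2]$ ./bin/hadoop fs -rm -r /output    # only needed before re-running the job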
7. Building a pseudo-distributed cluster
Only two files on the namenode machine need to change:
7.1 etc/hadoop/hdfs-site.xml
[hadoop@www hadoop-2.6.2]$ vimx etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>www:9001</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file:///home/hadoop/hadoop/hdfs/namesecondary</value>
  </property>
</configuration>
7.2 etc/hadoop/slaves
[hadoop@www hadoop-2.6.2]$ vimx etc/hadoop/slaves
www    # for pseudo-distributed mode the slaves file lists only the local machine (matching the DataNode on www in 7.5)
7.3 Format the namenode
[hadoop@www hadoop-2.6.2]$ ./bin/hadoop namenode -format
7.4 Start
[hadoop@www hadoop-2.6.2]$ ./sbin/start-all.sh
7.5 Check the processes
[hadoop@www hadoop-2.6.2]$ jps
4048 NameNode
4545 NodeManager
4130 DataNode
4459 ResourceManager
5469 Jps
4286 SecondaryNameNode
7.6 Upload the file
[hadoop@www hadoop-2.6.2]$ ./bin/hadoop fs -put input/ /
7.7 Run wordcount
[hadoop@www hadoop-2.6.2]$ ./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar wordcount /input /output
7.8 Check the result
[hadoop@www hadoop-2.6.2]$ ./bin/hadoop fs -cat /output/part-r-00000
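While the cluster is up, you can also browse the web UIs: http://www:8088 for YARN (per yarn.resourcemanager.webapp.address above) and http://www:50070 for HDFS (the Hadoop 2.x default port). When you are done, stop all daemons with the matching script:
[hadoop@www hadoop-2.6.2]$ ./sbin/stop-all.sh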