
Installing a 3-Node Hadoop 2.2.0 Cluster

2013-10-24 09:41

Machine Plan

Hostname      IP              Role
hadoop01      192.168.20.38   Master, NameNode
hadoop02      192.168.20.39   Slave, DataNode
hadoop03      192.168.20.40   Slave, DataNode

Basic Information

Hadoop version    2.2.0
OS                Oracle Linux 6.3, 64-bit

Installation Steps (same on every node)

1. Hosts file
[root@hadoop02 ~]# vi /etc/hosts

192.168.20.38   hadoop01

192.168.20.39   hadoop02

192.168.20.40   hadoop03
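
As a quick sanity check, each hostname should now resolve and answer a ping from every node, for example:

[root@hadoop01 ~]# for h in hadoop01 hadoop02 hadoop03; do ping -c 1 $h; done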

2. Create the hadoop user
[root@hadoop01 ~]# useradd hadoop

[root@hadoop01 ~]# passwd hadoop

3. Install Java
# rpm -ivh jdk-7u45-linux-x64.rpm
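
To confirm the JDK is in place before going further, the java binary under the install path should report 1.7.0_45:

[root@hadoop01 ~]# /usr/java/jdk1.7.0_45/bin/java -version
java version "1.7.0_45"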

4. Install the Hadoop software
Simply unpack the distribution; here it is placed under /hadoop/hadoop-2.2.0.
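
A sketch of the unpacking step, assuming the standard Apache tarball hadoop-2.2.0.tar.gz is in the current directory and that /hadoop should belong to the hadoop user:

[root@hadoop01 ~]# mkdir -p /hadoop && chown hadoop:hadoop /hadoop
[root@hadoop01 ~]# su - hadoop
[hadoop@hadoop01 ~]$ tar -xzf hadoop-2.2.0.tar.gz -C /hadoop    # creates /hadoop/hadoop-2.2.0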

5. Environment variables
$ vi .bash_profile

export JAVA_HOME=/usr/java/jdk1.7.0_45

export HADOOP_HOME=/hadoop/hadoop-2.2.0

export JRE_HOME=$JAVA_HOME/jre

export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

export CLASSPATH=./:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
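
Reload the profile and confirm the hadoop command is on the PATH before continuing:

[hadoop@hadoop01 ~]$ source ~/.bash_profile
[hadoop@hadoop01 ~]$ hadoop version    # should report Hadoop 2.2.0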

 

6. Directories:
[hadoop@hadoop01 hadoop]$ mkdir -p /hadoop/hdfs/name

[hadoop@hadoop01 hadoop]$ mkdir -p /hadoop/hdfs/data

[hadoop@hadoop01 hadoop]$ mkdir -p /hadoop/hdfs/tmp
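
A quick check that the three HDFS directories exist and are owned by the hadoop user:

[hadoop@hadoop01 hadoop]$ ls -ld /hadoop/hdfs/name /hadoop/hdfs/data /hadoop/hdfs/tmp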

 

Configure Hadoop (same on every node)

The following files must be configured on every node; they are located in:

[hadoop@hadoop02 hadoop]$ pwd

/hadoop/hadoop-2.2.0/etc/hadoop

core-site.xml
[hadoop@hadoop01 hadoop]$ vi core-site.xml

<configuration> 

    <property> 

                <name>fs.defaultFS</name> 

               <value>hdfs://hadoop01:9000/</value> 

                <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The URI's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The URI's authority is used to determine the host, port, etc. for a filesystem.</description>

       </property> 

       <property> 

               <name>dfs.replication</name>  

               <value>3</value> 

       </property> 

       <property> 

               <name>hadoop.tmp.dir</name> 

               <value>/tmp/hadoop-${user.name}</value> 

               <description></description> 

       </property> 

</configuration>

 

hdfs-site.xml
[hadoop@hadoop01 hadoop]$ vi hdfs-site.xml

 

<configuration> 

    <property> 

               <name>dfs.namenode.name.dir</name> 

                <value>/hadoop/hdfs/name</value> 

                <description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>

       </property> 

       <property> 

               <name>dfs.datanode.data.dir</name> 

                <value>/hadoop/hdfs/data</value> 

                <description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>

       </property> 

    <property> 

               <name>hadoop.tmp.dir</name> 

               <value>/hadoop/hdfs/tmp/hadoop-${user.name}</value> 

                <description>A base for other temporary directories.</description>

       </property> 

</configuration>

 

yarn-site.xml
[hadoop@hadoop01 hadoop]$ vi yarn-site.xml

 

<configuration> 

<property> 

   <name>yarn.resourcemanager.resource-tracker.address</name> 

   <value>hadoop01:8031</value> 

   <description>host is the hostname of the resource manager and 

   port is the port on which the NodeManagers contact the ResourceManager. 

   </description> 

 </property> 

 

 <property> 

   <name>yarn.resourcemanager.scheduler.address</name> 

   <value>hadoop01:8030</value> 

   <description>host is the hostname of the resourcemanager and port is the port on which the Applications in the cluster talk to the Resource Manager.

   </description> 

 </property> 

    <property> 

   <name>yarn.resourcemanager.scheduler.class</name> 

   <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value> 

   <description>In case you do not want to use the default scheduler</description>

 </property> 

 

 <property> 

   <name>yarn.resourcemanager.address</name> 

   <value>hadoop01:8032</value> 

   <description>the host is the hostname of the ResourceManager and the port is the port on which the clients can talk to the Resource Manager.</description>

 </property> 

 

 <property> 

   <name>yarn.nodemanager.local-dirs</name> 

   <value>${hadoop.tmp.dir}/nodemanager/local</value> 

   <description>the local directories used by the nodemanager</description>

 </property> 

 

 <property> 

   <name>yarn.nodemanager.address</name> 

   <value>0.0.0.0:8034</value> 

   <description>the nodemanagers bind to this port</description>

 </property>  

 

 <property> 

   <name>yarn.nodemanager.resource.memory-mb</name> 

   <value>10240</value> 

   <description>the amount of memory on the NodeManager in MB</description>

 </property> 

 

 <property> 

   <name>yarn.nodemanager.remote-app-log-dir</name> 

   <value>${hadoop.tmp.dir}/nodemanager/remote</value> 

   <description>directory on hdfs where the application logs are moved to</description>

 </property> 

 

  <property> 

   <name>yarn.nodemanager.log-dirs</name> 

   <value>${hadoop.tmp.dir}/nodemanager/logs</value> 

   <description>the directories used by Nodemanagers as log directories</description>

 </property> 

 

 <property>

   <name>yarn.nodemanager.aux-services</name>

   <value>mapreduce_shuffle</value>

   <description>shuffle service that needs to be set for MapReduce to run</description>

 </property>

</configuration> 

 

 

hadoop-env.sh
[hadoop@hadoop01 hadoop]$ vi hadoop-env.sh

Add:

export JAVA_HOME=/usr/java/jdk1.7.0_45
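
Instead of editing every node by hand, the finished configuration directory can also be copied from the master to the slaves once passwordless SSH is set up (next section). A sketch, assuming the same install path on every node:

[hadoop@hadoop01 ~]$ scp /hadoop/hadoop-2.2.0/etc/hadoop/* hadoop02:/hadoop/hadoop-2.2.0/etc/hadoop/
[hadoop@hadoop01 ~]$ scp /hadoop/hadoop-2.2.0/etc/hadoop/* hadoop03:/hadoop/hadoop-2.2.0/etc/hadoop/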

Configure SSH

On every node:

[root@hadoop01 ~]# su - hadoop

[hadoop@hadoop01 ~]$ ssh-keygen -t rsa

 

On node 1:

[hadoop@hadoop01 ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

[hadoop@hadoop01 ~]$ ssh hadoop02 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

[hadoop@hadoop01 ~]$ ssh hadoop03 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

[hadoop@hadoop01 ~]$ chmod 600 ~/.ssh/authorized_keys

 

[hadoop@hadoop01 ~]$ scp ~/.ssh/authorized_keys hadoop02:~/.ssh/authorized_keys

[hadoop@hadoop01 ~]$ scp ~/.ssh/authorized_keys hadoop03:~/.ssh/authorized_keys

 

On all nodes (this step is required):

[hadoop@hadoop01 ~]$ chmod 600 ~/.ssh/authorized_keys
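
With all three public keys in authorized_keys on every node, SSH from the master to each node (including itself) should no longer prompt for a password; a quick test:

[hadoop@hadoop01 ~]$ ssh hadoop01 hostname
[hadoop@hadoop01 ~]$ ssh hadoop02 hostname
[hadoop@hadoop01 ~]$ ssh hadoop03 hostname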

 

Configure the Hadoop Cluster

slaves (only needs to be edited on the master node)
 

[hadoop@hadoop01 hadoop]$ pwd

/hadoop/hadoop-2.2.0/etc/hadoop

[hadoop@hadoop01 hadoop]$ vi slaves

 

192.168.20.39

192.168.20.40

Format the NameNode
 

[hadoop@hadoop01 hadoop]$ hdfs namenode -format
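
If the format succeeds, the NameNode directory configured in hdfs-site.xml is populated; expect files such as VERSION, seen_txid and an fsimage under:

[hadoop@hadoop01 hadoop]$ ls /hadoop/hdfs/name/current/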

Start DFS (run on the master node only)

[hadoop@hadoop01 hadoop]$ start-dfs.sh

Check on the master node:

[hadoop@hadoop01 hadoop]$ jps

8736 SecondaryNameNode

8477 NameNode

8834 Jps

Check on the slave nodes:

[hadoop@hadoop02 logs]$ jps

3387 DataNode

3608 Jps

[hadoop@hadoop03 ~]$ jps

2287 DataNode

2374 Jps

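Besides jps, the live DataNodes can be confirmed from the master with dfsadmin:

[hadoop@hadoop01 hadoop]$ hdfs dfsadmin -report    # should list 2 live datanodes (hadoop02, hadoop03)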

Start the YARN cluster (run on the master node):

[hadoop@hadoop01 hadoop]$ start-yarn.sh

[hadoop@hadoop01 hadoop]$ jps

8904 Jps

8736 SecondaryNameNode

8870 ResourceManager

8477 NameNode

 

Slave nodes:

[hadoop@hadoop02 logs]$ jps

3658 NodeManager

3387 DataNode

3688 Jps

[hadoop@hadoop03 ~]$ jps

2287 DataNode

2416 NodeManager

2517 Jps

 

Check the result (ResourceManager web UI):

http://192.168.20.38:8088
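
The registered NodeManagers can also be listed from the command line, and HDFS has its own web UI on the NameNode's default HTTP port (50070):

[hadoop@hadoop01 hadoop]$ yarn node -list    # should show 2 running NodeManagers

http://192.168.20.38:50070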
