
[hadoop] Building Your Own Hadoop Cluster

2013-02-22 12:57
1> Preparation

a> Five CentOS 6.2 virtual machines; configure the hostname, IP address, and yum repository on each (a hostname sketch follows the list below):

192.168.68.201 master01

192.168.68.202 master02

192.168.68.203 slave01

192.168.68.204 slave02

192.168.68.205 slave03
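For reference (this sketch is my addition, not one of the original steps), on CentOS 6 the hostname is set persistently in /etc/sysconfig/network; shown here for master01, to be repeated on each machine with its own name:

#vim /etc/sysconfig/network

NETWORKING=yes
HOSTNAME=master01

#hostname master01    # apply immediately, without a reboot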

b> Prepare the required packages

jdk-6u26-linux-x64-rpm.bin

hadoop-0.20.2.tar.gz

2> Configuring the Hadoop cluster

a> Edit the hosts file on all five machines

#vim /etc/hosts

192.168.68.201  master01
192.168.68.202  master02
192.168.68.203  slave01
192.168.68.204  slave02
192.168.68.205  slave03
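A quick sanity check (my addition): each name should now resolve from every machine, for example:

#ping -c 1 slave01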


b> Set up passwordless SSH logins from the masters to the slaves. Run the following commands on both master01 and master02:

#ssh-keygen

#ssh-copy-id -i .ssh/id_rsa.pub root@master01

#ssh-copy-id -i .ssh/id_rsa.pub root@master02

#ssh-copy-id -i .ssh/id_rsa.pub root@slave01

#ssh-copy-id -i .ssh/id_rsa.pub root@slave02

#ssh-copy-id -i .ssh/id_rsa.pub root@slave03
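If the keys were copied correctly, logging in from master01 or master02 no longer prompts for a password; a quick check (my addition):

#ssh root@slave01 hostname
slave01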

c> Install the JDK on all five machines and set the Java environment variables

#chmod +x jdk-6u26-linux-x64-rpm.bin
#./jdk-6u26-linux-x64-rpm.bin

#cat >>/etc/profile <<EOF
export JAVA_HOME=/usr/java/jdk1.6.0_26
export PATH=\$JAVA_HOME/bin:\$PATH
EOF
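The new variables only take effect in new shells; to load them into the current session and verify the JDK (my addition), the version reported should be 1.6.0_26:

#source /etc/profile
#java -version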

d> On master01, extract hadoop-0.20.2.tar.gz and configure the cluster

#tar -zxvf hadoop-0.20.2.tar.gz

#vim hadoop-0.20.2/conf/hadoop-env.sh

# Point Hadoop at the JDK installed above; the jmxremote flags enable
# the JVM's local JMX agent on each daemon for monitoring (e.g. with jconsole).
export JAVA_HOME=/usr/java/jdk1.6.0_26
export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"

#vim hadoop-0.20.2/conf/hdfs-site.xml

<configuration>
  <property>
    <name>dfs.http.address</name>
    <value>192.168.68.201:50070</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>${hadoop.tmp.dir}/dfs/name,/data/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/data/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>

#vim hadoop-0.20.2/conf/mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master02:8021</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/data/mapred/local</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/data/mapred/system</value>
  </property>
  <property>
    <name>mapred.job.tracker.http.address</name>
    <value>192.168.68.202:50030</value>
  </property>
</configuration>
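The settings above point dfs.name.dir, dfs.data.dir, and mapred.local.dir at local paths under /data, so /data must exist and be writable on the relevant nodes (mapred.system.dir, by contrast, is a path inside HDFS). A cautious sketch (my addition; the daemons can usually create the subdirectories themselves):

#mkdir -p /data                            # on every node
#mkdir -p /data/name                       # on master01, for NameNode metadata
#mkdir -p /data/data /data/mapred/local    # on each slave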

#vim hadoop-0.20.2/conf/core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master01:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop/tmp</value>
  </property>
</configuration>

#vim hadoop-0.20.2/conf/masters

master02

(In Hadoop 0.20, conf/masters lists the host(s) that run the SecondaryNameNode, not the NameNode itself, so the SecondaryNameNode will come up on master02.)

#vim hadoop-0.20.2/conf/slaves

slave01
slave02
slave03


e> Copy hadoop-0.20.2 from master01 to the other machines; it must land at the same path (/root/hadoop-0.20.2) on every node, because the start scripts invoke Hadoop at that same path over SSH.
#scp -r hadoop-0.20.2 root@master02:/root/
#scp -r hadoop-0.20.2 root@slave01:/root/
#scp -r hadoop-0.20.2 root@slave02:/root/
#scp -r hadoop-0.20.2 root@slave03:/root/
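As a quick consistency check (my addition), the configuration files should now be byte-identical everywhere, e.g.:

#md5sum hadoop-0.20.2/conf/core-site.xml
#ssh root@slave01 md5sum /root/hadoop-0.20.2/conf/core-site.xml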


3> Starting the Hadoop cluster

a> Format HDFS on master01 (first-time setup only; formatting discards any existing HDFS data)

#./hadoop-0.20.2/bin/hadoop namenode -format

b> Start HDFS on master01, then run jps to check the started processes

#./hadoop-0.20.2/bin/start-dfs.sh

#jps

1872 Jps

1654 NameNode
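The DataNodes run on the slaves, so only the NameNode shows up here. Two ways to confirm HDFS came up cluster-wide (my addition): the dfsadmin report, which should list three live DataNodes, and the NameNode web UI configured above at http://192.168.68.201:50070:

#./hadoop-0.20.2/bin/hadoop dfsadmin -report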

c> Start MapReduce on master02, then run jps to check the started processes

#./hadoop-0.20.2/bin/start-mapred.sh

#jps

1956 Jps

1737 SecondaryNameNode

1895 JobTracker
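The JobTracker web UI configured earlier should now be reachable at http://192.168.68.202:50030 and should show three TaskTrackers once the slaves report in. From the command line (my addition):

#./hadoop-0.20.2/bin/hadoop job -list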

d> Check the processes started on slave01

#jps

2418 Jps

1758 TaskTracker

1827 DataNode
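Finally, an end-to-end smoke test (my addition, using the examples jar shipped with the release) exercises both HDFS and MapReduce from master01:

#./hadoop-0.20.2/bin/hadoop fs -mkdir input
#./hadoop-0.20.2/bin/hadoop fs -put hadoop-0.20.2/conf/*.xml input
#./hadoop-0.20.2/bin/hadoop jar hadoop-0.20.2/hadoop-0.20.2-examples.jar wordcount input output
#./hadoop-0.20.2/bin/hadoop fs -cat 'output/part-*'

If the word counts print, the cluster is working end to end.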