Hadoop Installation and Configuration (Distributed Mode)
2010-06-30 11:40
I. Installation Environment: Ubuntu Server 9.04
Node        User         IP address    Notes
NameNode    nkwangtong   10.68.8.109   NameNode and JobTracker run on the same host
JobTracker  nkwangtong   10.68.8.109
DataNode    nkwangtong   10.68.8.110
DataNode    nkwangtong   10.68.8.111
DataNode    nkwangtong   10.68.8.112
II. Installation Steps
1. Install and configure Java (a minimal install sketch follows)
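A minimal sketch of this step on Ubuntu Server 9.04 (the package name sun-java6-jdk and the install path are assumptions based on typical Ubuntu setups of that era; adjust for your repositories):
sudo apt-get update
sudo apt-get install sun-java6-jdk     # assumed package name; accept the license prompt
ls /usr/lib/jvm/java-6-sun             # confirm the path referenced by hadoop-env.sh in step 4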
2. Download the Hadoop 0.20.2 package
http://apache.etoak.com/hadoop/core/hadoop-0.20.2/hadoop-0.20.2.tar.gz
3. Extract it under /home/nkwangtong/, rename the directory to hadoop, and set ownership
sudo tar xzf hadoop-0.20.2.tar.gz              # extract into the current directory
mv hadoop-0.20.2 hadoop                        # rename the extracted directory to hadoop
sudo chown -R nkwangtong:nkwangtong hadoop     # give ownership to the nkwangtong user
4. Update the Hadoop environment variables
vi hadoop/conf/hadoop-env.sh
Change #export JAVA_HOME=/usr/lib/jvm/java-6-sun
to export JAVA_HOME=/usr/lib/jvm/java-6-sun
5. Edit /etc/hosts to map the IP addresses to host names
sudo gedit /etc/hosts
Add the following four lines:
10.68.8.109 StorageServer1.Race StorageServer1
10.68.8.110 StorageServer2.Race StorageServer2
10.68.8.111 StorageServer3.Race StorageServer3
10.68.8.112 StorageServer4.Race StorageServer4
6. Configure SSH so that the master can log in to every slave without a password
On both the master and the slaves, run:
ssh-keygen -t rsa
This generates the public key file id_rsa.pub under ~/.ssh/ (here /home/nkwangtong/.ssh/).
On each slave, copy the master's public key over with scp and append it to authorized_keys (a quick verification sketch follows these commands):
scp nkwangtong@StorageServer1:/home/nkwangtong/.ssh/id_rsa.pub /home/nkwangtong/.ssh/109_rsa_pub
cat /home/nkwangtong/.ssh/109_rsa_pub >> /home/nkwangtong/.ssh/authorized_keys
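As a quick check (illustrative commands, assuming the key was appended on every slave), SSH from the master should no longer ask for a password:
ssh StorageServer2 hostname     # should print StorageServer2 with no password prompt
ssh StorageServer3 hostname
ssh StorageServer4 hostname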
7. Configure conf/masters and conf/slaves
vi conf/masters
Add: StorageServer1
vi conf/slaves
Add:
StorageServer2
StorageServer3
StorageServer4
8. Configure core-site.xml, hdfs-site.xml, and mapred-site.xml
conf/core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://StorageServer1:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/nkwangtong/hadoop/tmp/</value>
  </property>
</configuration>
conf/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/nkwangtong/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/nkwangtong/hdfs/data</value>
  </property>
</configuration>
conf/mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>10.68.8.109:9001</value>
  </property>
</configuration>
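The local directories referenced above (hadoop.tmp.dir, dfs.name.dir, dfs.data.dir) should exist and be writable by nkwangtong on every node; a minimal sketch, assuming the same paths on each machine:
mkdir -p /home/nkwangtong/hadoop/tmp /home/nkwangtong/hdfs/name /home/nkwangtong/hdfs/data
chmod -R u+w /home/nkwangtong/hdfs     # keep the HDFS directories writable (see step 11)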
9. Copy the hadoop directory to every slave with scp, as in the sketch below.
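One way to do this from the master (hostnames follow the /etc/hosts entries in step 5; repeat for each slave):
scp -r /home/nkwangtong/hadoop nkwangtong@StorageServer2:/home/nkwangtong/
scp -r /home/nkwangtong/hadoop nkwangtong@StorageServer3:/home/nkwangtong/
scp -r /home/nkwangtong/hadoop nkwangtong@StorageServer4:/home/nkwangtong/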
10. Run
1) Format the distributed file system (on the NameNode):
sudo bin/hadoop namenode -format
2) Start Hadoop (a verification sketch follows at the end of this step):
bin/start-all.sh
3) Upload the local file a to HDFS (the output directory test2 must not exist in advance; the wordcount job in step 4 creates it):
bin/hadoop dfs -mkdir test1
bin/hadoop dfs -put ~/a test1/a
4) Run the example:
bin/hadoop jar hadoop-*-examples.jar wordcount test1/ test2/
5) View the result:
bin/hadoop dfs -cat test2/part-r-00000
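To confirm that the daemons started in step 2) are actually up, an illustrative check (assuming jps from the Sun JDK is on the PATH of every node):
jps                              # on the master: expect NameNode, SecondaryNameNode, JobTracker
ssh StorageServer2 jps           # on each slave: expect DataNode and TaskTracker
bin/hadoop dfsadmin -report      # lists the live DataNodes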
11. Possible errors
1) /home/nkwangtong/hdfs/ must be writable, otherwise errors occur. After every format it becomes unwritable again, so the permissions have to be restored by hand (see the sketch after this list).
2) If the error "Could only be replicated to 0 nodes, instead of 1" appears, take HDFS out of safe mode:
bin/hadoop dfsadmin -safemode leave
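For error 1), one possible fix is to re-take ownership of the HDFS directories after each format (a sketch, assuming the directories from step 8):
sudo chown -R nkwangtong:nkwangtong /home/nkwangtong/hdfs
chmod -R u+w /home/nkwangtong/hdfs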