
Hadoop 2.5.1 Cluster Installation and Configuration Notes

2016-03-17 11:15

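1. Environment

1.1. Virtual machines

Prepare three virtual machines and install 64-bit CentOS on each, using the minimal installation.

(I would have liked to run more virtual machines, but my laptop has limited memory and can run at most three at the same time.)

Give every virtual machine a static IP address, configure hostname resolution, and keep the clocks of all virtual machines synchronized.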

192.168.17.100 nameNode

192.168.17.101 dataNode1

192.168.17.102 dataNode2
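
The three mappings above would typically go into /etc/hosts on every virtual machine. A minimal sketch of adding them without creating duplicates follows; add_host_entry is a hypothetical helper name, and using /etc/hosts for resolution (rather than DNS) is an assumption.

```shell
# add_host_entry is a hypothetical helper, not part of CentOS or Hadoop.
# It appends "ip hostname" to a hosts file unless the hostname is already present.
add_host_entry() {
  local hosts_file="$1" ip="$2" name="$3"
  # -F: fixed string, -w: whole word (so dataNode1 does not match dataNode10), -q: silent
  if ! grep -qwF -- "$name" "$hosts_file" 2>/dev/null; then
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
  fi
}

# Usage (as root, on every virtual machine):
#   add_host_entry /etc/hosts 192.168.17.100 nameNode
#   add_host_entry /etc/hosts 192.168.17.101 dataNode1
#   add_host_entry /etc/hosts 192.168.17.102 dataNode2
```

Running the helper a second time leaves the file unchanged, so it is safe to repeat on a machine that is already configured.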

2. Installation

2.1. Before installing

2.1.1. Install wget and ssh

wget is used for downloading and ssh for remote login.

yum -y install wget

yum -y install openssh*

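2.1.2. Install the JDK and configure environment variables

Omitted…

2.1.3. Configure passwordless ssh login with a key pair

In a Hadoop cluster, the nameNode node needs to be able to log in to the dataNode nodes over ssh without a password.

Change to the ssh directory: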

[root@nameNode ~]# cd .ssh

[root@nameNode .ssh]#

Generate the key pair:

[root@nameNode /]# ssh-keygen -t rsa

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa):

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

98:3c:31:5c:23:21:73:a0:a0:1f:c6:d3:c3:dc:58:32 root@gifer

The key's randomart image is:

+--[ RSA 2048]----+

|. E.=.o |

|.o = @ o . |

|. * * = |

| o o o = |

| . = S |

| . |

| |

| |

| |

+-----------------+

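If the randomart image is printed, the key pair was generated successfully, and two new files appear in the directory:

Private key file: id_rsa

Public key file: id_rsa.pub

Append the contents of the public key file id_rsa.pub to the authorized_keys file: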

cat id_rsa.pub >> authorized_keys

Distribute the authorized_keys file to each dataNode node:

scp authorized_keys root@dataNode1:/root/.ssh/

scp authorized_keys root@dataNode2:/root/.ssh/

Verify passwordless ssh login:

[root@nameNode .ssh]# ssh root@dataNode1

Last login: Sun Sep 21 11:38:05 2014 from 192.168.17.1

If you see the output above, the configuration succeeded. If you are still prompted for a password, the configuration failed.
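
The manual check above can be repeated for every dataNode in one pass. The following is only a sketch: check_passwordless_ssh is a hypothetical helper name, and it relies on the standard ssh options BatchMode (fail instead of prompting for a password) and ConnectTimeout.

```shell
# check_passwordless_ssh is a hypothetical helper name.
# It tries a key-based login to each host and reports OK or FAILED.
check_passwordless_ssh() {
  local failed=0 host
  for host in "$@"; do
    # BatchMode=yes makes ssh fail instead of asking for a password
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@${host}" true; then
      echo "OK: ${host}"
    else
      echo "FAILED: ${host}"
      failed=1
    fi
  done
  # Return non-zero if any host still needs a password
  return "$failed"
}

# Usage on nameNode:
#   check_passwordless_ssh dataNode1 dataNode2
```

A FAILED line for a host usually means authorized_keys did not reach that host or has permissions that sshd rejects.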

2.2. Installing

Download hadoop-2.5.1, the latest version at the time of writing:

wget http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-2.5.1/hadoop-2.5.1.tar.gz

Extract the archive:

tar -zxf hadoop-2.5.1.tar.gz

2.3. Configuration files

Change to the configuration directory: cd hadoop-2.5.1/etc/hadoop

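2.3.1. core-site.xml

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://nameNode:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
  </property>
</configuration>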


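2.3.2. hdfs-site.xml

<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>hadoop-cluster1</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>nameNode:50090</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>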


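2.3.3. mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.http.address</name>
    <value>nameNode:50030</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>nameNode:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>nameNode:19888</value>
  </property>
</configuration>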


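2.3.4. yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>nameNode:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>nameNode:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>nameNode:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>nameNode:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>nameNode:8088</value>
  </property>
</configuration>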


2.3.5. slaves

dataNode1

dataNode2

2.3.6. Set JAVA_HOME

Add the JAVA_HOME setting to both hadoop-env.sh and yarn-env.sh:

vi hadoop-env.sh

export JAVA_HOME=/usr/java/jdk1.7.0_65

vi yarn-env.sh

export JAVA_HOME=/usr/java/jdk1.7.0_65
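
These notes edit the configuration on nameNode only and do not show how the dataNode machines receive the same Hadoop directory. One possibility, sketched below, is to copy the configured directory to every host listed in the slaves file. distribute_hadoop is a hypothetical helper name, and both the install path /root/hadoop-2.5.1 and the use of the same path on every node are assumptions.

```shell
# distribute_hadoop is a hypothetical helper name.
# It copies a configured Hadoop directory to every host named in a slaves file,
# placing it under the same parent directory on the remote side.
distribute_hadoop() {
  local hadoop_dir="$1" slaves_file="$2" host
  # "|| [ -n "$host" ]" also handles a last line without a trailing newline
  while read -r host || [ -n "$host" ]; do
    # Skip blank lines and comment lines
    case "$host" in ''|'#'*) continue ;; esac
    # stdin is redirected so scp cannot consume the rest of the slaves file
    scp -r -q "$hadoop_dir" "root@${host}:$(dirname "$hadoop_dir")/" < /dev/null || return 1
  done < "$slaves_file"
}

# Usage on nameNode (the install path is an assumption, not stated in these notes):
#   distribute_hadoop /root/hadoop-2.5.1 /root/hadoop-2.5.1/etc/hadoop/slaves
```

Because passwordless ssh is already configured, the copy runs without prompts. The JDK path set in JAVA_HOME must also exist on each dataNode.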

2.4. Format the file system

Format the file system:

bin/hdfs namenode -format

Output:

14/09/21 11:57:22 INFO namenode.NameNode: STARTUP_MSG:

14/09/21 11:57:22 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]

14/09/21 11:57:22 INFO namenode.NameNode: createNameNode [-format]

14/09/21 11:57:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Formatting using clusterid: CID-85e34a63-6cd7-4f1e-bbb9-add72ccaa660

14/09/21 11:57:29 INFO namenode.FSNamesystem: fsLock is fair:true

14/09/21 11:57:30 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000

14/09/21 11:57:30 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true

14/09/21 11:57:30 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000

14/09/21 11:57:30 INFO blockmanagement.BlockManager: The block deletion will start around 2014 九月 21 11:57:30

14/09/21 11:57:30 INFO util.GSet: Computing capacity for map BlocksMap

14/09/21 11:57:30 INFO util.GSet: VM type = 64-bit

14/09/21 11:57:30 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB

14/09/21 11:57:30 INFO util.GSet: capacity = 2^21 = 2097152 entries

14/09/21 11:57:31 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false

14/09/21 11:57:31 INFO blockmanagement.BlockManager: defaultReplication = 3

14/09/21 11:57:31 INFO blockmanagement.BlockManager: maxReplication = 512

14/09/21 11:57:31 INFO blockmanagement.BlockManager: minReplication = 1

14/09/21 11:57:31 INFO blockmanagement.BlockManager: maxReplicationStreams = 2

14/09/21 11:57:31 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks = false

14/09/21 11:57:31 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000

14/09/21 11:57:31 INFO blockmanagement.BlockManager: encryptDataTransfer = false

14/09/21 11:57:31 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000

14/09/21 11:57:31 INFO namenode.FSNamesystem: fsOwner = root (auth:SIMPLE)

14/09/21 11:57:31 INFO namenode.FSNamesystem: supergroup = supergroup

14/09/21 11:57:31 INFO namenode.FSNamesystem: isPermissionEnabled = true

14/09/21 11:57:31 INFO namenode.FSNamesystem: Determined nameservice ID: hadoop-cluster1

14/09/21 11:57:31 INFO namenode.FSNamesystem: HA Enabled: false

14/09/21 11:57:31 INFO namenode.FSNamesystem: Append Enabled: true

14/09/21 11:57:33 INFO util.GSet: Computing capacity for map INodeMap

14/09/21 11:57:33 INFO util.GSet: VM type = 64-bit

14/09/21 11:57:33 INFO util.GSet: 1.0% max memory 966.7 MB = 9.7 MB

14/09/21 11:57:33 INFO util.GSet: capacity = 2^20 = 1048576 entries

14/09/21 11:57:33 INFO namenode.NameNode: Caching file names occuring more than 10 times

14/09/21 11:57:33 INFO util.GSet: Computing capacity for map cachedBlocks

14/09/21 11:57:33 INFO util.GSet: VM type = 64-bit

14/09/21 11:57:33 INFO util.GSet: 0.25% max memory 966.7 MB = 2.4 MB

14/09/21 11:57:33 INFO util.GSet: capacity = 2^18 = 262144 entries

14/09/21 11:57:33 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033

14/09/21 11:57:33 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0

14/09/21 11:57:33 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000

14/09/21 11:57:33 INFO namenode.FSNamesystem: Retry cache on namenode is enabled

14/09/21 11:57:33 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis

14/09/21 11:57:33 INFO util.GSet: Computing capacity for map NameNodeRetryCache

14/09/21 11:57:33 INFO util.GSet: VM type = 64-bit

14/09/21 11:57:33 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB

14/09/21 11:57:33 INFO util.GSet: capacity = 2^15 = 32768 entries

14/09/21 11:57:33 INFO namenode.NNConf: ACLs enabled? false

14/09/21 11:57:33 INFO namenode.NNConf: XAttrs enabled? true

14/09/21 11:57:33 INFO namenode.NNConf: Maximum size of an xattr: 16384

14/09/21 11:57:34 INFO namenode.FSImage: Allocated new BlockPoolId: BP-955896090-127.0.0.1-1411271853454

14/09/21 11:57:34 INFO common.Storage: Storage directory /home/hadoop/dfs/name has been successfully formatted.

14/09/21 11:57:36 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0

14/09/21 11:57:36 INFO util.ExitUtil: Exiting with status 0

14/09/21 11:57:36 INFO namenode.NameNode: SHUTDOWN_MSG:
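
The lines to look for in the log above are "has been successfully formatted" and "Exiting with status 0". Formatting again later replaces the NameNode metadata, so a guard before running the command can be useful. The sketch below assumes that a formatted storage directory contains current/VERSION; namenode_is_formatted is a hypothetical helper name.

```shell
# namenode_is_formatted is a hypothetical helper name.
# It succeeds if the given NameNode storage directory already holds metadata.
namenode_is_formatted() {
  local name_dir="$1"
  # A formatted storage directory contains the file current/VERSION
  [ -f "${name_dir}/current/VERSION" ]
}

# Usage, with the dfs.namenode.name.dir path from hdfs-site.xml:
#   if namenode_is_formatted /home/hadoop/dfs/name; then
#     echo "already formatted, skipping"
#   else
#     bin/hdfs namenode -format
#   fi
```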

2.5. Start and stop the services

The services can now be started.

2.5.1. Start

[root@nameNode sbin]# ./start-dfs.sh

[root@nameNode sbin]# ./start-yarn.sh

2.5.2. Stop

[root@nameNode sbin]# ./stop-dfs.sh

[root@nameNode sbin]# ./stop-yarn.sh

3. Verification

3.1. Check the running processes

[root@nameNode sbin]# jps

7854 Jps

7594 ResourceManager

7357 NameNode
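
Reading the jps listing by eye works for three machines, but the check can also be scripted. The following sketch reads jps output on stdin and prints every expected daemon that is missing; missing_daemons is a hypothetical helper name. Expecting DataNode and NodeManager on the dataNode machines is an assumption based on the slaves file, since these notes only show the listing for nameNode.

```shell
# missing_daemons is a hypothetical helper name.
# It reads `jps` output on stdin and prints each expected daemon name
# (passed as arguments) that does not appear in the listing.
missing_daemons() {
  local listing name
  listing=$(cat)
  for name in "$@"; do
    # jps prints "<pid> <name>"; compare the second field exactly,
    # so NameNode does not match SecondaryNameNode
    if ! printf '%s\n' "$listing" | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      echo "$name"
    fi
  done
}

# Usage on nameNode:
#   jps | missing_daemons NameNode ResourceManager
# Usage on a dataNode (expected names are an assumption):
#   jps | missing_daemons DataNode NodeManager
```

Empty output means every expected daemon is running.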

3.2. Access through a browser

http://192.168.17.100:50070/

http://192.168.17.100:8088/
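
On a minimal installation without a browser, the same two pages can be probed from the command line. This is a sketch that assumes curl is installed; http_status is a hypothetical helper name.

```shell
# http_status is a hypothetical helper name.
# It prints only the HTTP status code returned for a URL.
http_status() {
  # -s: silent, -o /dev/null: discard the body, -w: print the status code
  curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1"
}

# Usage:
#   http_status http://192.168.17.100:50070/   # NameNode web UI
#   http_status http://192.168.17.100:8088/    # ResourceManager web UI
```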