
Testing Whether hadoop-1.2.1 Was Installed and Configured Successfully

2015-07-19 20:38
This article explains how to verify that Hadoop was installed and configured correctly. For installing hadoop-1.2.1 in the first place, see the companion tutorial: hadoop-1.2.1安装方法详解 (detailed hadoop-1.2.1 installation guide).

After installing Hadoop, it is worth running a simple end-to-end check. The installation tutorial above already did a basic check with the jps command; here we go a step further and run a word-count MapReduce job to verify that the MapReduce configuration is correct. The verification steps are as follows:

1. To make the hadoop commands convenient to run, first configure Hadoop's environment variables.

Open the .bashrc file in the hadoop user's home directory and add the following lines:

export HADOOP_HOME=/home/hadoop/hadoop-1.2.1
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin


Note that after editing you must log in again, or run source .bashrc, for the configuration to take effect.
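As a quick sanity check that the new PATH works, you can run hadoop version from any directory. On this setup the first line of output should be the one shown below (the build details that follow will vary by machine):

[hadoop@mdw ~]$ hadoop version
Hadoop 1.2.1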

With HADOOP_HOME set, every hadoop command prints the warning "Warning: $HADOOP_HOME is deprecated." For a full explanation, see the companion article on fixing the hadoop-1.2.1 "Warning: $HADOOP_HOME is deprecated." message.
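In short, in case that article is unavailable: the bin/hadoop script in hadoop-1.x skips the warning when the HADOOP_HOME_WARN_SUPPRESS variable is set, so one more line in .bashrc silences it:

# suppress "Warning: $HADOOP_HOME is deprecated." (hadoop-1.x only)
export HADOOP_HOME_WARN_SUPPRESS=1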

2. Create the test files locally. To make the upload convenient, first create an input directory, then create test1.txt and test2.txt inside it:

[hadoop@mdw temp]$ mkdir input
[hadoop@mdw temp]$ cd input/
[hadoop@mdw input]$ echo "hello world hello hadoop" > test1.txt
[hadoop@mdw input]$ echo "hello hadoop" > test2.txt
[hadoop@mdw input]$ ll
total 8
-rw-rw-r-- 1 hadoop hadoop 25 May 29 01:19 test1.txt
-rw-rw-r-- 1 hadoop hadoop 13 May 29 01:20 test2.txt
[hadoop@mdw input]$ cat test1.txt
hello world hello hadoop
[hadoop@mdw input]$ cat test2.txt
hello hadoop

The cat output above shows the contents of the two text files; these are the words we will count with a MapReduce job.
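As a cross-check (not part of the original procedure), you can compute the expected counts locally with standard shell tools, giving a known-good answer to compare the job output against:

[hadoop@mdw input]$ cat test1.txt test2.txt | tr ' ' '\n' | sort | uniq -c
      2 hadoop
      3 hello
      1 world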

3. Upload the two text files to HDFS (here I upload them into an in directory in HDFS):

[hadoop@mdw input]$ hadoop dfs -put ../input/ in

After the upload, inspect the HDFS file system:

[hadoop@mdw input]$ hadoop dfs -ls

Found 1 items

drwxr-xr-x - hadoop supergroup 0 2015-05-29 01:31 /user/hadoop/in

We can see that the /user/hadoop directory now contains an in directory. List its contents:

[hadoop@mdw input]$ hadoop dfs -ls ./in

Found 2 items

-rw-r--r-- 2 hadoop supergroup 25 2015-05-29 01:31 /user/hadoop/in/test1.txt

-rw-r--r-- 2 hadoop supergroup 13 2015-05-29 01:31 /user/hadoop/in/test2.txt

If the upload fails with the following error:

put: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /user/hadoop/in. Name node is in safe mode.

simply run this command to take the NameNode out of safe mode:

[hadoop@mdw input]$ hadoop dfsadmin -safemode leave
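The NameNode normally stays in safe mode for a short while after startup, while DataNodes report their blocks. If you are not sure of the current state, you can query it first with the get subcommand:

[hadoop@mdw input]$ hadoop dfsadmin -safemode get
Safe mode is OFF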

4. Run the word-count example program that ships with Hadoop to count the words:

[hadoop@mdw input]$ hadoop jar ~/hadoop-1.2.1/hadoop-examples-1.2.1.jar wordcount in out

15/05/29 01:41:13 INFO input.FileInputFormat: Total input paths to process : 2

15/05/29 01:41:13 INFO util.NativeCodeLoader: Loaded the native-hadoop library

15/05/29 01:41:13 WARN snappy.LoadSnappy: Snappy native library not loaded

15/05/29 01:41:14 INFO mapred.JobClient: Running job: job_201505290130_0001

15/05/29 01:41:15 INFO mapred.JobClient: map 0% reduce 0%

15/05/29 01:41:19 INFO mapred.JobClient: map 50% reduce 0%

15/05/29 01:41:20 INFO mapred.JobClient: map 100% reduce 0%

15/05/29 01:41:26 INFO mapred.JobClient: map 100% reduce 33%

15/05/29 01:41:27 INFO mapred.JobClient: map 100% reduce 100%

15/05/29 01:41:27 INFO mapred.JobClient: Job complete: job_201505290130_0001

15/05/29 01:41:27 INFO mapred.JobClient: Counters: 29

15/05/29 01:41:27 INFO mapred.JobClient: Job Counters

15/05/29 01:41:27 INFO mapred.JobClient: Launched reduce tasks=1

15/05/29 01:41:27 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=5374

15/05/29 01:41:27 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0

15/05/29 01:41:27 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0

15/05/29 01:41:27 INFO mapred.JobClient: Launched map tasks=2

15/05/29 01:41:27 INFO mapred.JobClient: Data-local map tasks=2

15/05/29 01:41:27 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=8142

15/05/29 01:41:27 INFO mapred.JobClient: File Output Format Counters

15/05/29 01:41:27 INFO mapred.JobClient: Bytes Written=25

15/05/29 01:41:27 INFO mapred.JobClient: FileSystemCounters

15/05/29 01:41:27 INFO mapred.JobClient: FILE_BYTES_READ=68

15/05/29 01:41:27 INFO mapred.JobClient: HDFS_BYTES_READ=254

15/05/29 01:41:27 INFO mapred.JobClient: FILE_BYTES_WRITTEN=165604

15/05/29 01:41:27 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=25

15/05/29 01:41:27 INFO mapred.JobClient: File Input Format Counters

15/05/29 01:41:27 INFO mapred.JobClient: Bytes Read=38

15/05/29 01:41:27 INFO mapred.JobClient: Map-Reduce Framework

15/05/29 01:41:27 INFO mapred.JobClient: Map output materialized bytes=74

15/05/29 01:41:27 INFO mapred.JobClient: Map input records=2

15/05/29 01:41:27 INFO mapred.JobClient: Reduce shuffle bytes=74

15/05/29 01:41:27 INFO mapred.JobClient: Spilled Records=10

15/05/29 01:41:27 INFO mapred.JobClient: Map output bytes=62

15/05/29 01:41:27 INFO mapred.JobClient: CPU time spent (ms)=1960

15/05/29 01:41:27 INFO mapred.JobClient: Total committed heap usage (bytes)=337780736

15/05/29 01:41:27 INFO mapred.JobClient: Combine input records=6

15/05/29 01:41:27 INFO mapred.JobClient: SPLIT_RAW_BYTES=216

15/05/29 01:41:27 INFO mapred.JobClient: Reduce input records=5

15/05/29 01:41:27 INFO mapred.JobClient: Reduce input groups=3

15/05/29 01:41:27 INFO mapred.JobClient: Combine output records=5

15/05/29 01:41:27 INFO mapred.JobClient: Physical memory (bytes) snapshot=324222976

15/05/29 01:41:27 INFO mapred.JobClient: Reduce output records=3

15/05/29 01:41:27 INFO mapred.JobClient: Virtual memory (bytes) snapshot=1128259584

15/05/29 01:41:27 INFO mapred.JobClient: Map output records=6

Note: the hadoop-examples-1.2.1.jar file that ships with Hadoop is located in the Hadoop installation directory.

The output above shows that the MapReduce job ran successfully. Looking at the HDFS file system again, we find it now contains an out directory:

[hadoop@mdw input]$ hadoop dfs -ls

Found 2 items

drwxr-xr-x - hadoop supergroup 0 2015-05-29 01:31 /user/hadoop/in

drwxr-xr-x - hadoop supergroup 0 2015-05-29 01:41 /user/hadoop/out

Run ls to see the contents of the out directory:

[hadoop@mdw input]$ hadoop dfs -ls ./out

Found 3 items

-rw-r--r-- 2 hadoop supergroup 0 2015-05-29 01:41 /user/hadoop/out/_SUCCESS

drwxr-xr-x - hadoop supergroup 0 2015-05-29 01:41 /user/hadoop/out/_logs

-rw-r--r-- 2 hadoop supergroup 25 2015-05-29 01:41 /user/hadoop/out/part-r-00000

The job's output data is stored in the part-r-00000 file. We can cat these files to see the results:

[hadoop@mdw input]$ hadoop dfs -cat ./out/*

hadoop 2

hello 3

world 1

cat: File does not exist: /user/hadoop/out/_logs
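The "File does not exist" message at the end is harmless: the ./out/* glob also matches the _logs directory, which -cat cannot print. To get just the results without the noise, cat the output file directly:

[hadoop@mdw input]$ hadoop dfs -cat ./out/part-r-00000
hadoop 2
hello 3
world 1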

The three count lines above are the MapReduce result: the words in the two text files were tallied correctly (hello 3, hadoop 2, world 1), matching our expectation. This fully confirms that Hadoop has been installed and configured successfully.
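One last tip: if you rerun the job, Hadoop refuses to reuse an existing output directory and fails with a FileAlreadyExistsException, so delete out first (-rmr is hadoop-1.x's recursive delete):

[hadoop@mdw input]$ hadoop dfs -rmr out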