Configuring a Hadoop Application Development Environment Based on Eclipse
2012-04-18 10:42
Summary: configure a Hadoop development environment under Eclipse.
Environment: Ubuntu 8.04.4
Eclipse: Release 3.7.2
Hadoop: hadoop-1.0.2
Reference material:
http://www.cnblogs.com/flyoung2008/archive/2011/12/09/2281400.html
I. Configuration process
1. Start the Hadoop daemons first
root@localhost:/usr/local/hadoop-1.0.2# bin/hadoop namenode -format
root@localhost:/usr/local/hadoop-1.0.2# bin/start-all.sh
At first I did not understand why the file system had to be formatted every time Hadoop started; if I skipped the format step, the NameNode would not start. Why?
Solved: when I originally set up the pseudo-distributed system, I did not specify dfs.data.dir and dfs.name.dir in conf/hadoop-site.xml, so both defaulted to locations under /tmp. Files under /tmp are deleted automatically when the machine reboots, which is why the file system had to be re-formatted every time. Fixing hadoop-site.xml solves it; for details see http://www.cnblogs.com/wly603/archive/2012/04/10/2441336.html (pseudo-distributed system configuration).
dfs.data.dir: the path in the local file system where the DataNode stores its data blocks
dfs.name.dir: the path in the local file system where the NameNode stores the name table
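The fix described above can be sketched as a minimal hadoop-site.xml fragment. The paths below are placeholders of my own choosing; any directories that survive a reboot (i.e. outside /tmp) will do:

```xml
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <!-- example path for the NameNode's name table; must persist across reboots -->
    <value>/usr/local/hadoop-1.0.2/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <!-- example path for the DataNode's data blocks -->
    <value>/usr/local/hadoop-1.0.2/dfs/data</value>
  </property>
</configuration>
```

After changing these properties you must run bin/hadoop namenode -format once more (the old metadata in /tmp is gone anyway); from then on, the format step is no longer needed on each start.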
Note: be sure to start Hadoop from the terminal first. I had not started it at the beginning and kept getting errors.
The console prints the run log:
12/04/18 09:46:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/04/18 09:46:21 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
****hdfs://localhost:9000/tmp/wordcount/word.txt
12/04/18 09:46:21 INFO input.FileInputFormat: Total input paths to process : 1
12/04/18 09:46:21 WARN snappy.LoadSnappy: Snappy native library not loaded
12/04/18 09:46:21 INFO mapred.JobClient: Running job: job_local_0001
12/04/18 09:46:21 INFO util.ProcessTree: setsid exited with exit code 0
12/04/18 09:46:21 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@155c37d
12/04/18 09:46:21 INFO mapred.MapTask: io.sort.mb = 100
12/04/18 09:46:22 INFO mapred.MapTask: data buffer = 79691776/99614720
12/04/18 09:46:22 INFO mapred.MapTask: record buffer = 262144/327680
12/04/18 09:46:22 INFO mapred.JobClient: map 0% reduce 0%
12/04/18 09:46:22 INFO mapred.MapTask: Starting flush of map output
12/04/18 09:46:23 INFO mapred.MapTask: Finished spill 0
12/04/18 09:46:23 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
12/04/18 09:46:24 INFO mapred.LocalJobRunner:
12/04/18 09:46:24 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
12/04/18 09:46:24 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1bc4c37
12/04/18 09:46:24 INFO mapred.LocalJobRunner:
12/04/18 09:46:24 INFO mapred.Merger: Merging 1 sorted segments
12/04/18 09:46:24 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 119 bytes
12/04/18 09:46:24 INFO mapred.LocalJobRunner:
12/04/18 09:46:24 INFO mapred.Task: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
12/04/18 09:46:24 INFO mapred.LocalJobRunner:
12/04/18 09:46:24 INFO mapred.Task: Task attempt_local_0001_r_000000_0 is allowed to commit now
12/04/18 09:46:24 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to hdfs://localhost:9000/tmp/wordcount/out
12/04/18 09:46:25 INFO mapred.JobClient: map 100% reduce 0%
12/04/18 09:46:27 INFO mapred.LocalJobRunner: reduce > reduce
12/04/18 09:46:27 INFO mapred.Task: Task 'attempt_local_0001_r_000000_0' done.
12/04/18 09:46:28 INFO mapred.JobClient: map 100% reduce 100%
12/04/18 09:46:28 INFO mapred.JobClient: Job complete: job_local_0001
12/04/18 09:46:28 INFO mapred.JobClient: Counters: 22
12/04/18 09:46:28 INFO mapred.JobClient: File Output Format Counters
12/04/18 09:46:28 INFO mapred.JobClient: Bytes Written=81
12/04/18 09:46:28 INFO mapred.JobClient: FileSystemCounters
12/04/18 09:46:28 INFO mapred.JobClient: FILE_BYTES_READ=449
12/04/18 09:46:28 INFO mapred.JobClient: HDFS_BYTES_READ=172
12/04/18 09:46:28 INFO mapred.JobClient: FILE_BYTES_WRITTEN=81194
12/04/18 09:46:28 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=81
12/04/18 09:46:28 INFO mapred.JobClient: File Input Format Counters
12/04/18 09:46:28 INFO mapred.JobClient: Bytes Read=86
12/04/18 09:46:28 INFO mapred.JobClient: Map-Reduce Framework
12/04/18 09:46:28 INFO mapred.JobClient: Map output materialized bytes=123
12/04/18 09:46:28 INFO mapred.JobClient: Map input records=4
12/04/18 09:46:28 INFO mapred.JobClient: Reduce shuffle bytes=0
12/04/18 09:46:28 INFO mapred.JobClient: Spilled Records=18
12/04/18 09:46:28 INFO mapred.JobClient: Map output bytes=136
12/04/18 09:46:28 INFO mapred.JobClient: Total committed heap usage (bytes)=321003520
12/04/18 09:46:28 INFO mapred.JobClient: CPU time spent (ms)=0
12/04/18 09:46:28 INFO mapred.JobClient: SPLIT_RAW_BYTES=109
12/04/18 09:46:28 INFO mapred.JobClient: Combine input records=13
12/04/18 09:46:28 INFO mapred.JobClient: Reduce input records=9
12/04/18 09:46:28 INFO mapred.JobClient: Reduce input groups=9
12/04/18 09:46:28 INFO mapred.JobClient: Combine output records=9
12/04/18 09:46:28 INFO mapred.JobClient: Physical memory (bytes) snapshot=0
12/04/18 09:46:28 INFO mapred.JobClient: Reduce output records=9
12/04/18 09:46:28 INFO mapred.JobClient: Virtual memory (bytes) snapshot=0
12/04/18 09:46:28 INFO mapred.JobClient: Map output records=13
After the run finishes, list the example's output with the command: bin/hadoop fs -ls /tmp/wordcount/out
The output is:
Found 2 items
-rw-r--r-- 3 gqy supergroup 0 2012-04-18 09:46 /tmp/wordcount/out/_SUCCESS
-rw-r--r-- 3 gqy supergroup 81 2012-04-18 09:46 /tmp/wordcount/out/part-r-00000
To inspect the result further:
Command: gqy@localhost:/tmp$ hadoop fs -cat /tmp/wordcount/out/part-r-00000
Output:
c 1
c++ 2
hadoop 2
hbase 1
helloworld 1
java 3
javascript 1
mapreduce 1
python 1
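The job run above is the stock WordCount example, and what it computes is simply a per-token count over the input file, with the reducer output sorted by key (which is why part-r-00000 lists the words alphabetically). As a plain-Java illustration of the same logic, outside of MapReduce (the sample text here is hypothetical, since the contents of word.txt are not shown above):

```java
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {
    // Count whitespace-separated tokens, the way the WordCount example's
    // Mapper/Reducer pair does, but in plain Java for illustration.
    public static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>(); // key-sorted, like part-r-00000
        for (String token : text.split("\\s+")) {
            if (token.isEmpty()) continue;
            counts.merge(token, 1, Integer::sum); // increment, or insert 1
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical input; word.txt's actual contents are not reproduced here.
        String text = "java c++ hadoop\njava java c++ hbase\npython mapreduce";
        for (Map.Entry<String, Integer> e : count(text).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

In the real job, the map phase emits (token, 1) pairs and the shuffle groups them by key before the reduce phase sums them; the TreeMap here stands in for that group-and-sum step on a single machine.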