
Setting Up a Hadoop Development Environment

2016-04-11 20:51
I. System Environment

Linux:

Master node (master): CentOS 6.5 x64, 192.168.47.141

JDK: jdk-8u74-linux-x64-demos.rpm

Hadoop: hadoop-2.7.0.tar

Eclipse: eclipse-jee-mars-2-linux-gtk-x86_64.tar.gz

II. Installation and Configuration

1. Extract the packages (the Hadoop and Eclipse archives)



2. Create a desktop shortcut for Eclipse



Right-click an empty area of the desktop and choose Create Launcher...



Set Type to Application, set Name to Eclipse, and for Command click Browse..., then select the eclipse executable inside the extracted Eclipse directory and click OK.



3. Install the Hadoop Eclipse plugin

Double-click to open Eclipse. On first launch you will be prompted to choose a workspace directory, where future projects will be stored.



Start the Hadoop cluster (typically with sbin/start-dfs.sh and sbin/start-yarn.sh from the Hadoop installation directory).



Check that the cluster has started (for example, with the jps command on each node).



The master node is running a desktop session here, so ignore the launcher process in the output.

Copy the Hadoop Eclipse plugin (the hadoop-eclipse-plugin jar) into Eclipse's plugins directory.



Restart Eclipse.



Open Window->Preferences.



In the tree on the left, select Hadoop Map/Reduce; on the right, choose the Hadoop installation directory, then click OK.

Under Window->Show View, select Other...



In the Show View window, select Map/Reduce Locations under MapReduce Tools, then click OK.



Click New Hadoop Location...



Configure the Location: set the Map/Reduce Master and DFS Master host and port to match the cluster (the DFS Master entry should match fs.default.name / fs.defaultFS in core-site.xml).
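As a quick sanity check that the DFS Master settings are right, the same host and port can be tried from a small client program. A minimal sketch, assuming the NameNode listens on hdfs://master:9000 (a hypothetical value; use whatever core-site.xml on your cluster actually specifies):

[code=java]import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsConnectionCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed NameNode address; must match fs.defaultFS (fs.default.name) in core-site.xml.
        FileSystem fs = FileSystem.get(new URI("hdfs://master:9000"), conf);
        // List the HDFS root to confirm the connection works.
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPath());
        }
        fs.close();
    }
}
[/code]

If this lists the root directories, the same host/port values should work in the Eclipse Location dialog.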



View the DFS tree in the Project Explorer.



The HDFS folders are now visible. An Error appeared here at first because the wrong IP address had been entered; entering the hostname directly instead fixed it.

4. Test project

Upload a file to HDFS to serve as the job's input.
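The upload can be done from the DFS view in Eclipse (right-click a directory) or with the hdfs dfs -put command; it can also be done programmatically with the FileSystem API. A minimal sketch, assuming a hypothetical local file /tmp/test.txt and the same hdfs://master:9000 address as above (adjust both to your setup):

[code=java]import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsUpload {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(new URI("hdfs://master:9000"), conf); // assumed NameNode address
        // Create a target directory and copy the local file into it.
        fs.mkdirs(new Path("/input"));
        fs.copyFromLocalFile(new Path("/tmp/test.txt"), new Path("/input/test.txt"));
        fs.close();
    }
}
[/code]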

File->New->Other...



Select Map/Reduce Project and click Next.



Here we test WordCount. The project name is up to you; a custom location is used here. Click Finish.



Yes



Under the WordCount project's src folder, right-click and create a new class.



Fill WordCount.java with the following code:

[code=java]import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    // Mapper: split each input line into tokens and emit (word, 1).
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            System.out.println("key=" + key.toString());
            System.out.println("Value=" + value.toString());
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as combiner): sum the counts for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        System.out.println("url:" + conf.get("fs.default.name"));
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
[/code]
If there are no errors, click the green Run arrow.

Choose Run on Hadoop.



OK

Maximize the Console view and you will find an error, because the program needs two arguments: an input file and an output path. Click the down arrow next to the green Run button and edit the run configuration to add the two paths as Program arguments (note that the output directory must not already exist on HDFS).



Run
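Once the job finishes, the result can be inspected from the DFS view, or read back programmatically. A minimal sketch, again assuming the hypothetical hdfs://master:9000 address and an output directory of /output (adjust to whatever paths you passed as arguments):

[code=java]import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class ReadWordCountOutput {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(new URI("hdfs://master:9000"), conf); // assumed NameNode address
        // A single-reducer WordCount writes its results to part-r-00000 in the output directory.
        FSDataInputStream in = fs.open(new Path("/output/part-r-00000"));
        IOUtils.copyBytes(in, System.out, 4096, false); // print the word counts to the console
        in.close();
        fs.close();
    }
}
[/code]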