Hadoop: A Hands-On MapReduce WordCount in Eclipse (1)
2017-03-03 11:58
(一) Download Eclipse from the official site: http://www.eclipse.org/
(二) Maven: http://www.mvnrepository.com
Pick the artifacts that match your Hadoop version, then copy the corresponding dependency blocks into your pom.xml:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-common</artifactId>
    <version>2.4.1</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>2.4.1</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.4.1</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.4.1</version>
</dependency>
(三) Extract the downloaded Eclipse
(3.1) Right-click the project | Build Path | Configure Build Path
(3.2) Install the Hadoop Eclipse plugin
(a) Close Eclipse, then copy the plugin jar into the D:\eclipse-jee-mars-2-win32\eclipse\plugins directory
(b) Start Eclipse and open Window | Preferences (note: use Browse... to select your Hadoop installation directory)
(c) Window | Show View | Other
(d) Hadoop must be started on the Linux machine first:
[hadoop@master-hadoop hadoop-2.4.1]$ sbin/start-dfs.sh
[hadoop@master-hadoop hadoop-2.4.1]$ sbin/start-yarn.sh
(e) Click the small elephant icon at the bottom right, open its settings, and choose New Hadoop Location...
(f) The resulting view
(3.3) Hadoop's bin directory is missing the Windows native binaries
(a) Copy winutils.exe into the C:\hadoop-2.4.1\hadoop-2.4.1\bin directory
(b) Copy hadoop.dll into the C:\Windows\System32 directory
(3.4) Write the source code
The WordCountMapper class:
package com.hlx.mapreduce.wc;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/**
 * Extends Mapper.
 * LongWritable ==> long, Text ==> String, IntWritable ==> int
 */
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // 1) Get one line of input, e.g. "hello hadoop"
        String line = value.toString();
        // 2) Split the line into words: "hello", "hadoop"
        String[] splits = line.split(" ");
        // 3) Emit (word, 1) for every word:
        //      hello  1
        //      hadoop 1
        for (String str : splits) {
            context.write(new Text(str), new IntWritable(1));
        }
    }
}
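To see what the map step produces without starting a cluster, here is a minimal plain-Java sketch of the same split-and-emit logic using ordinary String/List types instead of Hadoop's Writables (the class name MapStepSketch is just for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the map step: split each line on spaces and emit (word, 1)
// once per token, exactly as WordCountMapper does with Writable types.
public class MapStepSketch {
    // Returns the (word, "1") pairs the mapper would emit for one line.
    static List<String[]> map(String line) {
        List<String[]> pairs = new ArrayList<>();
        for (String word : line.split(" ")) {
            pairs.add(new String[] { word, "1" });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // "hello hadoop hello" emits three pairs, including "hello" twice
        for (String[] p : map("hello hadoop hello")) {
            System.out.println(p[0] + "\t" + p[1]);
        }
    }
}
```

Note that duplicates are not merged here: one (hello, 1) pair is emitted per occurrence, and summing them is the reducer's job.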
The WordCountReduce class:
package com.hlx.mapreduce.wc;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

/**
 * Extends Reducer.
 * Text ==> String, IntWritable ==> int
 * (input (key, value), output (key, value))
 */
public class WordCountReduce extends Reducer<Text, IntWritable, Text, IntWritable> {

    // After the shuffle the input looks like:
    //   hello {1, 1, 1}  ==> summed to hello 3; the {1, 1, 1} part is "values"
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int count = 0; // running total
        for (IntWritable value : values) {
            count += value.get();
        }
        // Write (word, total) to the context
        context.write(key, new IntWritable(count));
    }
}
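The shuffle-then-sum behavior can likewise be sketched in plain Java: group the emitted words by key and add up their 1s. This is only an illustration of what the framework plus WordCountReduce compute (ReduceStepSketch is a made-up name, not a Hadoop class):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of shuffle + reduce: pairs with the same key are grouped together,
// and the reducer sums the 1s for each word, e.g. hello {1, 1} -> hello 2.
public class ReduceStepSketch {
    static Map<String, Integer> reduce(List<String> emittedWords) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : emittedWords) {
            counts.merge(word, 1, Integer::sum); // add 1 to this word's total
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(reduce(List.of("hello", "hadoop", "hello")));
    }
}
```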
The WordCountMapReduce driver class:
package com.hlx.mapreduce.wc;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * Driver class.
 */
public class WordCountMapReduce {

    public static void main(String[] args) throws Exception {
        // Create the configuration object
        Configuration conf = new Configuration();
        // Create the job object
        Job job = Job.getInstance(conf, "wordcount0");
        // Set the main class
        job.setJarByClass(WordCountMapReduce.class);
        // Set the map class
        job.setMapperClass(WordCountMapper.class);
        // Set the reduce class
        job.setReducerClass(WordCountReduce.class);
        // Set the map output (key, value) types
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        // Set the reduce output (key, value) types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Set the input and output paths: /words is the input file, /out3 the output directory
        FileInputFormat.setInputPaths(job, new Path("hdfs://master-hadoop.dragon.org:9000/words"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://master-hadoop.dragon.org:9000/out3"));
        // Submit the job and wait for it to finish
        boolean flag = job.waitForCompletion(true);
        if (!flag) {
            System.out.println("the task has failed!");
        }
    }
}
(3.5) Run the job
The result:
Note: the source code can still be optimized!
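One common WordCount optimization is local aggregation: because WordCountReduce's input and output types match, it can also be registered as a combiner with `job.setCombinerClass(WordCountReduce.class)`, so each map task sums its own (word, 1) pairs before anything is shuffled to the reducers. The effect can be sketched in plain Java (the CombinerSketch class is illustrative only):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of local aggregation: instead of emitting (word, 1) once per token,
// counts are merged per distinct word first, so fewer pairs cross the network.
public class CombinerSketch {
    static Map<String, Integer> localAggregate(String line) {
        Map<String, Integer> local = new LinkedHashMap<>();
        for (String word : line.split(" ")) {
            local.merge(word, 1, Integer::sum);
        }
        // "hello hello hadoop" -> {hello=2, hadoop=1}: 2 pairs instead of 3
        return local;
    }

    public static void main(String[] args) {
        System.out.println(localAggregate("hello hello hadoop"));
    }
}
```

A smaller tweak with the same spirit is to reuse a single Text and IntWritable instance in the mapper instead of allocating new ones on every `context.write` call.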