Running the weather example from Hadoop: The Definitive Guide
2017-10-15 17:25
1. Prepare the code. The source is listed at the end of this post.
My Hadoop installation is at /Users/chenxun/software/hadoop-2.8.1, so I created a directory named myclass under it and put the code there, as the listing shows:
[chenxun@chen.local 17:21 ~/software/hadoop-2.8.1/myclass]$ll
total 64
-rw-r--r-- 1 chenxun staff 1017 10 15 15:36 MaxTemperature.java
-rw-r--r-- 1 chenxun staff 977 10 15 15:39 MaxTemperatureMapper.java
-rw-r--r-- 1 chenxun staff 579 10 15 15:39 MaxTemperatureReducer.java
2. Set the CLASSPATH used to compile the code
Set up the Java environment and add the Hadoop jars that the code needs at compile time.
vim ~/.bash_profile

JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_144.jdk/Contents/Home
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=/Users/chenxun/software/hadoop-2.8.1
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
for f in $HADOOP_HOME/share/hadoop/common/hadoop-*.jar;do
    export CLASSPATH=$CLASSPATH:$f
done
for f in $HADOOP_HOME/share/hadoop/hdfs/hadoop-*.jar;do
    export CLASSPATH=$CLASSPATH:$f
done
for f in $HADOOP_HOME/share/hadoop/mapreduce/hadoop-*.jar;do
    export CLASSPATH=$CLASSPATH:$f
done
for f in $HADOOP_HOME/share/hadoop/yarn/hadoop-*.jar;do
    export CLASSPATH=$CLASSPATH:$f
done
# Third-party dependency jars live under share/hadoop/<module>/lib.
# The trailing /* is a Java classpath wildcard; the quotes keep the shell from expanding it.
export CLASSPATH="$CLASSPATH:$HADOOP_HOME/share/hadoop/common/lib/*:$HADOOP_HOME/share/hadoop/hdfs/lib/*:$HADOOP_HOME/share/hadoop/mapreduce/lib/*:$HADOOP_HOME/share/hadoop/tools/lib/*:$HADOOP_HOME/share/hadoop/yarn/lib/*"

source ~/.bash_profile
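A quick way to confirm that the CLASSPATH is set up before compiling is to try loading a few of the Hadoop classes that the example imports. The following is a minimal sketch, not part of the original example; the class name ClasspathCheck and the helper isOnClasspath are hypothetical names chosen for illustration.

```java
public class ClasspathCheck {

    // Returns true if the named class can be located by this class's loader.
    // The class is not initialized, only looked up.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Classes imported by the three source files in this post.
        String[] required = {
            "org.apache.hadoop.mapreduce.Job",
            "org.apache.hadoop.io.Text",
            "org.apache.hadoop.fs.Path"
        };
        for (String name : required) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

Compile and run it in a new shell after sourcing ~/.bash_profile; any line that reports MISSING suggests that the corresponding jar is not on the CLASSPATH.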
3. Compile the code and package it into a jar
javac *.java
jar -cvf MaxTemperature.jar .

[chenxun@chen.local 17:21 ~/software/hadoop-2.8.1/myclass]$ll
total 64
-rw-r--r-- 1 chenxun staff 1413 10 15 15:40 MaxTemperature.class
-rw-r--r-- 1 chenxun staff 6333 10 15 16:18 MaxTemperature.jar
-rw-r--r-- 1 chenxun staff 1017 10 15 15:36 MaxTemperature.java
-rw-r--r-- 1 chenxun staff 1876 10 15 15:40 MaxTemperatureMapper.class
-rw-r--r-- 1 chenxun staff 977 10 15 15:39 MaxTemperatureMapper.java
-rw-r--r-- 1 chenxun staff 1687 10 15 15:40 MaxTemperatureReducer.class
-rw-r--r-- 1 chenxun staff 579 10 15 15:39 MaxTemperatureReducer.java
4. Prepare the data
Download the Hadoop weather data from: ftp://ftp.ncdc.noaa.gov/pub/data/noaa/2010/
I put the weather data into file.txt. The records look like this:
0029227070999991901122820004+62167+030650FM-12+010299999V0200501N003119999999N0000001N9-01561+99999100061ADDGF108991999999999999999999
0029227070999991901122906004+62167+030650FM-12+010299999V0200901N003119999999N0000001N9-01501+99999100181ADDGF108991999999999999999999
0029227070999991901122913004+62167+030650FM-12+010299999V0200701N002119999999N0000001N9-01561+99999100271ADDGF104991999999999999999999
0029227070999991901122920004+62167+030650FM-12+010299999V0200701N002119999999N0000001N9-02001+99999100501ADDGF107991999999999999999999
0029227070999991901123006004+62167+030650FM-12+010299999V0200701N003119999999N0000001N9-01501+99999100791ADDGF108991999999999999999999
0029227070999991901123013004+62167+030650FM-12+010299999V0200901N003119999999N0000001N9-01331+99999100901ADDGF108991999999999999999999
0029227070999991901123020004+62167+030650FM-12+010299999V0200701N002119999999N0000001N9-01221+99999100831ADDGF108991999999999999999999
0029227070999991901123106004+62167+030650FM-12+010299999V0200701N004119999999N0000001N9-01391+99999100521ADDGF108991999999999999999999
0029227070999991901123113004+62167+030650FM-12+010299999V0200701N003119999999N0000001N9-01391+99999100321ADDGF108991999999999999999999
0029227070999991901123120004+62167+030650FM-12+010299999V0200701N004119999999N0000001N9-01391+99999100281ADDGF108991999999999999999999
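Each record is a fixed-width line, so fields are read by character offset. To make that layout concrete, here is a minimal plain-Java sketch (no Hadoop dependency) that extracts the year, air temperature, and quality code from the first record above, using the same offsets as MaxTemperatureMapper in this post. The class name NcdcRecordDemo and its helper methods are hypothetical names used for illustration only.

```java
public class NcdcRecordDemo {

    // Offsets copied from MaxTemperatureMapper in this post.
    static String year(String line) {
        return line.substring(15, 19);
    }

    static int airTemperature(String line) {
        // Skip a leading '+' the same way the mapper does.
        int start = line.charAt(87) == '+' ? 88 : 87;
        return Integer.parseInt(line.substring(start, 92));
    }

    static String quality(String line) {
        return line.substring(92, 93);
    }

    public static void main(String[] args) {
        String line = "0029227070999991901122820004+62167+030650FM-12+010299999V0200501N003119999999N0000001N9-01561+99999100061ADDGF108991999999999999999999";
        // Prints: 1901 -156 1
        System.out.println(year(line) + " " + airTemperature(line) + " " + quality(line));
    }
}
```

The temperature field is stored in tenths of a degree Celsius, so -156 corresponds to -15.6 °C.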
Create the HDFS input directory:
[chenxun@chen.local 16:42 ~/software/hadoop-2.8.1/myclass]$hadoop fs -mkdir -p /user/chenxun/data
[chenxun@chen.local 16:42 ~/software/hadoop-2.8.1/myclass]$hadoop fs -ls /user/chenxun
Found 3 items
drwxr-xr-x - chenxun supergroup 0 2017-10-15 16:42 /user/chenxun/data
drwxr-xr-x - chenxun supergroup 0 2017-10-14 01:54 /user/chenxun/input
drwxr-xr-x - chenxun supergroup 0 2017-10-14 01:55 /user/chenxun/output
Upload the weather data to the input directory:
[chenxun@chen.local 16:47 ~/software/hadoop-2.8.1/myclass]$hadoop fs -put ./data/file.txt /user/chenxun/data
[chenxun@chen.local 16:47 ~/software/hadoop-2.8.1/myclass]$hadoop fs -ls /user/chenxun/data
Found 1 items
-rw-r--r-- 1 chenxun supergroup 9855 2017-10-15 16:47 /user/chenxun/data/file.txt
Run the job:
[chenxun@chen.local 17:10 ~/software/hadoop-2.8.1/myclass]$hadoop jar MaxTemperature.jar MaxTemperature /user/chenxun/data/file.txt /user/chenxun/dataoutput
...
...
[chenxun@chen.local 17:11 ~/software/hadoop-2.8.1/myclass]$hadoop fs -ls /user/chenxun/dataoutput
Found 2 items
-rw-r--r-- 1 chenxun supergroup 0 2017-10-15 17:11 /user/chenxun/dataoutput/_SUCCESS
-rw-r--r-- 1 chenxun supergroup 9 2017-10-15 17:11 /user/chenxun/dataoutput/part-r-00000
[chenxun@chen.local 17:11 ~/software/hadoop-2.8.1/myclass]$
[chenxun@chen.local 17:11 ~/software/hadoop-2.8.1/myclass]$
[chenxun@chen.local 17:12 ~/software/hadoop-2.8.1/myclass]$hadoop fs -cat /user/chenxun/dataoutput/part-r-00000
1901 -56
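The result line holds the year and the maximum temperature in tenths of a degree Celsius, so -56 means -5.6 °C. As a small illustration, the sketch below converts one such output line to degrees. It assumes the key and value are separated by whitespace (the default text output uses a tab); OutputLineDemo and toCelsius are hypothetical names, not part of the job.

```java
public class OutputLineDemo {

    // Parses a "<year><whitespace><temperature>" line and converts the
    // temperature from tenths of a degree Celsius to degrees Celsius.
    static double toCelsius(String outputLine) {
        String[] fields = outputLine.trim().split("\\s+");
        return Integer.parseInt(fields[1]) / 10.0;
    }

    public static void main(String[] args) {
        // Prints: -5.6
        System.out.println(toCelsius("1901\t-56"));
    }
}
```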
Code:
MaxTemperature.java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTemperature {

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperature <input path> <output path>");
            System.exit(-1);
        }

        // Job.getInstance() replaces the deprecated new Job() constructor
        Job job = Job.getInstance();
        job.setJarByClass(MaxTemperature.class);
        job.setJobName("Max temperature");

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(MaxTemperatureMapper.class);
        job.setReducerClass(MaxTemperatureReducer.class);

        // Types of the final (reducer) output key and value
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
MaxTemperatureMapper.java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final int MISSING = 9999;

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

        String line = value.toString();
        String year = line.substring(15, 19);
        int airTemperature;
        if (line.charAt(87) == '+') { // parseInt doesn't like leading plus signs
            airTemperature = Integer.parseInt(line.substring(88, 92));
        } else {
            airTemperature = Integer.parseInt(line.substring(87, 92));
        }
        String quality = line.substring(92, 93);
        if (airTemperature != MISSING && quality.matches("[01459]")) {
            context.write(new Text(year), new IntWritable(airTemperature));
        }
    }
}
MaxTemperatureReducer.java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MaxTemperatureReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {

        int maxValue = Integer.MIN_VALUE;
        for (IntWritable value : values) {
            maxValue = Math.max(maxValue, value.get());
        }
        context.write(key, new IntWritable(maxValue));
    }
}
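To check the map and reduce logic without a cluster, the same steps can be imitated in plain Java: emit a (year, temperature) pair for every valid record, then keep the maximum per year. The sketch below is only a local approximation of what the job does, not how Hadoop executes it; LocalMaxTemperatureDemo and maxByYear are hypothetical names. It uses three of the sample records shown in this post.

```java
import java.util.Map;
import java.util.TreeMap;

public class LocalMaxTemperatureDemo {

    private static final int MISSING = 9999;

    // "Map" step: parse each record like MaxTemperatureMapper.
    // "Reduce" step: keep the largest temperature per year, like MaxTemperatureReducer.
    static Map<String, Integer> maxByYear(String[] lines) {
        Map<String, Integer> max = new TreeMap<>();
        for (String line : lines) {
            String year = line.substring(15, 19);
            int start = line.charAt(87) == '+' ? 88 : 87;
            int temperature = Integer.parseInt(line.substring(start, 92));
            String quality = line.substring(92, 93);
            if (temperature != MISSING && quality.matches("[01459]")) {
                max.merge(year, temperature, Integer::max);
            }
        }
        return max;
    }

    public static void main(String[] args) {
        String[] sample = {
            "0029227070999991901122820004+62167+030650FM-12+010299999V0200501N003119999999N0000001N9-01561+99999100061ADDGF108991999999999999999999",
            "0029227070999991901122906004+62167+030650FM-12+010299999V0200901N003119999999N0000001N9-01501+99999100181ADDGF108991999999999999999999",
            "0029227070999991901123020004+62167+030650FM-12+010299999V0200701N002119999999N0000001N9-01221+99999100831ADDGF108991999999999999999999"
        };
        // Prints: {1901=-122}
        System.out.println(maxByYear(sample));
    }
}
```

The ten records listed in this post also have a maximum of -122. The job itself reported -56, presumably because the uploaded file.txt (9855 bytes) contains more records than the excerpt shown.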