
Computing Pi with MapReduce

2017-03-11 23:48

Overall Approach

The core idea is to throw random points into the square with vertices (0,0), (0,1), (1,0), and (1,1). The circle inscribed in that square, centered at (0.5, 0.5) with radius 0.5, has area Pi/4, while the square has area 1. So the fraction of points that land inside the circle approximates Pi/4, and multiplying that fraction by 4 gives an approximation of Pi.
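For example, if 785,400 out of 1,000,000 points land inside the circle, the estimate is 4 × 0.7854 = 3.1416. For intuition, here is a minimal plain-Java sketch of the same estimate without MapReduce (the class name LocalPi and the point count are just for illustration):

import java.util.Random;

public class LocalPi {
    public static void main(String[] args) {
        int pointNum = 1_000_000; // how many random points to throw
        Random rd = new Random();
        long inside = 0;
        for (int i = 0; i < pointNum; i++) {
            // Shift to the circle's center so the test becomes x^2 + y^2 <= 0.25.
            double x = rd.nextDouble() - 0.5;
            double y = rd.nextDouble() - 0.5;
            if (x * x + y * y <= 0.25) {
                inside++;
            }
        }
        // inside / pointNum approximates Pi/4, so multiply by 4.
        System.out.println(4.0 * inside / pointNum);
    }
}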

Each line of the input file contains a single number: how many random points to throw for that estimate of Pi. For each line, the Mapper generates that many random points (x, y), with x and y in the range [0, 1), and computes the distance from (x, y) to (0.5, 0.5). If the distance is at most 0.5, the point lies inside the circle and the Mapper emits a 1 for it; otherwise it emits a 0.

Code

The implementation is as follows:

package tech.mrbcy.bigdata.calpi;

import java.io.IOException;
import java.util.Random;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CalPI {
    public static class PiMapper
            extends Mapper<Object, Text, Text, IntWritable> {

        private static Random rd = new Random();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // Each input line holds the number of points to throw.
            int pointNum = Integer.parseInt(value.toString());

            for (int i = 0; i < pointNum; i++) {
                // Draw a random point in the unit square.
                double x = rd.nextDouble();
                double y = rd.nextDouble();
                // A point is inside the inscribed circle if its distance
                // to the center (0.5, 0.5) is at most 0.5.
                x -= 0.5;
                y -= 0.5;
                double distance = Math.sqrt(x * x + y * y);

                IntWritable result = new IntWritable(0);
                if (distance <= 0.5) {
                    result = new IntWritable(1);
                }

                // Emit one (pointNum, 0 or 1) pair per point.
                context.write(value, result);
            }
        }
    }

    public static class PiReducer
            extends Reducer<Text, IntWritable, Text, DoubleWritable> {
        private DoubleWritable result = new DoubleWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            double pointNum = Double.parseDouble(key.toString());
            // Sum the 0/1 flags to count the points inside the circle.
            double sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            // (inside / total) approximates Pi/4, so multiply by 4.
            result.set(sum / pointNum * 4);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "calculate pi");
        job.setJarByClass(CalPI.class);
        job.setMapperClass(PiMapper.class);
        // job.setCombinerClass(PiReducer.class); // would not work; see the note below
        job.setReducerClass(PiReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(DoubleWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
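A note on the commented-out setCombinerClass(PiReducer.class): PiReducer cannot serve as a combiner, because a combiner's output types must match the map output types (Text, IntWritable), while PiReducer emits DoubleWritable values. A combiner that merely sums the 0/1 flags would be type-compatible and would shrink the shuffle dramatically. A sketch of such a class (the name IntSumCombiner is my own, not part of the original code):

public static class IntSumCombiner
        extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Pre-aggregate on the map side: sum the 0/1 flags so the reducer
        // receives a few partial counts instead of one record per point.
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        context.write(key, new IntWritable(sum));
    }
}

It would be registered with job.setCombinerClass(IntSumCombiner.class). PiReducer needs no change, since it only sums its input values, and partial counts sum to the same total.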


Running

First, package the project as a jar and export it. I did this step with MyEclipse.

Then create an input file with the following commands:

mkdir -p /root/homework/week2
vim /root/homework/week2/input.txt


Put the following content into the file:

10000
9999
100000
99999
1000000
999999


Then create the input directory in HDFS, and make sure the output directory does not exist:

hadoop fs -mkdir -p /calpi/input
hadoop fs -rm -r /calpi/output


Upload input.txt to HDFS:

hadoop fs -put /root/homework/week2/input.txt /calpi/input


Run the MapReduce job with the following command:

hadoop jar /root/homework/week2/calcpi.jar tech.mrbcy.bigdata.calpi.CalPI /calpi/input /calpi/output


After the job finishes, view the result with the following command:

hadoop fs -cat /calpi/output/part-r-00000


The output is as follows (the keys are Text, so the lines sort lexicographically rather than numerically, which is why 9999 appears after 1000000):

10000   3.1476
100000  3.14544
1000000 3.144108
9999    3.128712871287129
99999   3.1406314063140632
999999  3.1423831423831423


The precision still didn't feel very high, so I edited input.txt again, changing its content to:

10000
9999
100000
99999
1000000
999999
10000000
9999999
100000000
99999999


Then I reran the job; the output was:

10000           3.1396
100000          3.13704
1000000         3.139504
10000000        3.1414536
100000000       3.14143428
9999            3.1347134713471347
99999           3.1425914259142593
999999          3.1421791421791423
9999999         3.1415555141555513
99999999        3.141412391414124


The approximation now looks reasonably close, so the goal is basically met. That said, this exercise made it clear that MapReduce is really rather slow for this kind of job.
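Most of the cost here comes from the shuffle: the mapper writes one key-value pair per random point, so 100,000,000 points mean 100,000,000 records serialized, sorted, and transferred. Counting inside the mapper and emitting a single partial count per input line avoids this entirely. A sketch of such a map method (my own variant, not the original code; it reuses PiMapper's rd field, and PiReducer works unchanged):

@Override
public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
    int pointNum = Integer.parseInt(value.toString());
    int inside = 0;
    for (int i = 0; i < pointNum; i++) {
        double x = rd.nextDouble() - 0.5;
        double y = rd.nextDouble() - 0.5;
        if (x * x + y * y <= 0.25) {
            inside++;
        }
    }
    // Emit one record per input line instead of one per point.
    context.write(value, new IntWritable(inside));
}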