Spark Operators (Part 1)
2017-07-15 13:30
Point 1: UnionOperator
```java
package com.spark.operator;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.VoidFunction;

public class UnionOperator {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("UnionOperator")
                .setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);

        List<String> names = Arrays
                .asList("xurunyun", "liangyongqi", "wangfei", "yasaka");
        List<String> names1 = Arrays
                .asList("xurunyun", "liangyongqi2", "wangfei3", "yasaka4");
        JavaRDD<String> nameRDD = sc.parallelize(names, 2);
        JavaRDD<String> nameRDD1 = sc.parallelize(names1, 2);

        // union concatenates the two RDDs; duplicates are NOT removed
        nameRDD.union(nameRDD1).foreach(new VoidFunction<String>() {
            private static final long serialVersionUID = 1L;

            @Override
            public void call(String arg0) throws Exception {
                System.out.println(arg0);
            }
        });

        sc.close();
    }
}
```
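Note that `union` does not deduplicate: it simply concatenates the partitions of the two RDDs (here 2 + 2 = 4 partitions), and `xurunyun` appears twice in the output. Chaining `.distinct()` gives set-union semantics. As a minimal sketch of that behavior using plain Java collections (not Spark itself; the class name `UnionSketch` is made up for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

public class UnionSketch {
    public static void main(String[] args) {
        List<String> a = Arrays.asList("xurunyun", "liangyongqi");
        List<String> b = Arrays.asList("xurunyun", "wangfei");

        // Like RDD.union: concatenation, duplicates kept
        List<String> union = new ArrayList<>(a);
        union.addAll(b);
        System.out.println(union); // [xurunyun, liangyongqi, xurunyun, wangfei]

        // Like union().distinct(): set-union semantics
        List<String> deduped = new ArrayList<>(new LinkedHashSet<>(union));
        System.out.println(deduped); // [xurunyun, liangyongqi, wangfei]
    }
}
```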
Point 2: TakeSample
```java
package com.spark.operator;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class TakeSample {
    // takeSample = take + sample: sample n elements and return them to the driver
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("TakeSample")
                .setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);

        List<String> names = Arrays
                .asList("xuruyun", "liangyongqi", "wangfei", "xuruyun");
        JavaRDD<String> nameRDD = sc.parallelize(names, 1);

        // false = sample without replacement, 2 = number of elements to return
        List<String> list = nameRDD.takeSample(false, 2);
        for (String name : list) {
            System.out.println(name);
        }

        sc.close();
    }
}
```
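`takeSample(false, 2)` draws 2 elements without replacement; passing `true` samples with replacement, and an overload accepts a third `seed` argument for reproducible results. Since the sample is collected to the driver, `num` should stay small. A minimal sketch of without-replacement sampling using plain Java collections (not Spark; the class name `TakeSampleSketch` and the seed value are illustrative):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class TakeSampleSketch {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>(
                Arrays.asList("xuruyun", "liangyongqi", "wangfei", "xuruyun"));

        // Without replacement: shuffle a copy, take the first n — each
        // position is used at most once, mirroring takeSample(false, n, seed)
        Collections.shuffle(names, new Random(42L)); // fixed seed for reproducibility
        List<String> sample = names.subList(0, 2);
        System.out.println(sample);
    }
}
```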
Point 3: take
```java
package com.spark.operator;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class TakeOperator {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("TakeOperator")
                .setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // A list of the numbers 1 to 5; take(3) pulls the first three
        // elements back to the driver, like collect() but limited to n.
        List<Integer> numberList = Arrays.asList(1, 2, 3, 4, 5);
        JavaRDD<Integer> numbers = sc.parallelize(numberList);

        List<Integer> top3Numbers = numbers.take(3);
        for (Integer num : top3Numbers) {
            System.out.println(num);
        }

        sc.close();
    }
}
```
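`take(n)` returns the first n elements in partition order without sorting anything; for the n largest elements you would use `top(n)` instead. A minimal sketch of the difference with plain Java lists (not Spark; the class name `TakeSketch` is illustrative):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class TakeSketch {
    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(5, 1, 4, 2, 3);

        // Like take(3): first n elements in encounter order, no sorting
        List<Integer> first3 = numbers.subList(0, 3);
        System.out.println(first3); // [5, 1, 4]

        // Like top(3): the n largest elements, sorted descending
        List<Integer> sorted = new ArrayList<>(numbers);
        sorted.sort(Collections.reverseOrder());
        List<Integer> top3 = sorted.subList(0, 3);
        System.out.println(top3); // [5, 4, 3]
    }
}
```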