
RDD has no reduceByKey method

2015-05-05 16:40
When writing Spark code you will often find that an RDD seems to have no reduceByKey method. This happens on Spark 1.2 and earlier: RDD itself does not define reduceByKey, so the RDD must first be implicitly converted to PairRDDFunctions before the method can be called, which requires adding import org.apache.spark.SparkContext._.
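A minimal word-count sketch for Spark 1.2 and earlier (the app name, master URL, and sample data are illustrative assumptions, not from the original post):

import org.apache.spark.{SparkConf, SparkContext}
// On Spark 1.2 and earlier, this import brings rddToPairRDDFunctions into
// scope; without it, the reduceByKey call below does not compile.
import org.apache.spark.SparkContext._

object WordCount {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("WordCount").setMaster("local[*]"))
    val pairs = sc.parallelize(Seq("spark", "rdd", "spark")).map(word => (word, 1))
    // reduceByKey is defined on PairRDDFunctions, not on RDD itself;
    // the implicit conversion makes it look like an RDD method.
    val counts = pairs.reduceByKey(_ + _)
    counts.collect().foreach(println)
    sc.stop()
  }
}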

From Spark 1.3 on, however, these implicit conversions live in the RDD companion object, where the compiler finds them automatically, so the explicit import is no longer needed. The relevant excerpt from the Spark source:

/**
 * Defines implicit functions that provide extra functionalities on RDDs of specific types.
 * For example, [[RDD.rddToPairRDDFunctions]] converts an RDD into a [[PairRDDFunctions]] for
 * key-value-pair RDDs, and enabling extra functionalities such as [[PairRDDFunctions.reduceByKey]].
 */
object RDD {
  // The following implicit functions were in SparkContext before 1.3 and users had to
  // `import SparkContext._` to enable them. Now we move them here to make the compiler find
  // them automatically. However, we still keep the old functions in SparkContext for backward
  // compatibility and forward to the following functions directly.
  implicit def rddToPairRDDFunctions[K, V](rdd: RDD[(K, V)])
    (implicit kt: ClassTag[K], vt: ClassTag[V], ord: Ordering[K] = null): PairRDDFunctions[K, V] = {
    new PairRDDFunctions(rdd)
  }

  // ... (the other implicit conversions in the companion object are elided here)
}
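In other words, on Spark 1.3 and later the word count above compiles without the extra import; a minimal sketch (same illustrative names as before):

import org.apache.spark.{SparkConf, SparkContext}
// No `import org.apache.spark.SparkContext._` needed on Spark 1.3+:
// the compiler finds rddToPairRDDFunctions in the RDD companion object.
object WordCount13 {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("WordCount13").setMaster("local[*]"))
    val counts = sc.parallelize(Seq("spark", "rdd", "spark")).map(word => (word, 1)).reduceByKey(_ + _)
    counts.collect().foreach(println)
    sc.stop()
  }
}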


As for what an implicit conversion is: simply put, Scala pulls a bait-and-switch behind your back, letting the neighbor next door do the things you can't do yourself.
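A minimal self-contained sketch of that bait-and-switch (the Repeater class and its times method are made-up names for illustration):

// A plain-Scala analogue of rddToPairRDDFunctions: String has no `times`
// method, so the compiler silently converts it to the made-up Repeater class.
import scala.language.implicitConversions

object ImplicitDemo {
  class Repeater(s: String) {
    def times(n: Int): String = s * n
  }

  // The "neighbor next door": the compiler inserts this conversion whenever
  // a String is asked for a method it lacks but Repeater has.
  implicit def stringToRepeater(s: String): Repeater = new Repeater(s)

  def main(args: Array[String]): Unit = {
    println("ab".times(3)) // prints "ababab", via the implicit conversion
  }
}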