您的位置:首页 > 数据库

sparkSql dataframe反射创建

2016-07-12 00:00 288 查看

1.目的

反射方式从RDD创建dataframe

2.素材

text1.txt

1 tom 2 jack friend
2 jack 3 sala friend
3 sala 1 tom friend
4 joy 1 tom friend
1 tom 4 joy friend
1 tom 4 joy friend
2 jack 5 Missing friend


3.代码

/**
* Created by puwenchao on 2016-07-06.
*/
package test
import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

//1. 创建case class
case class person (aid:Int,aname:String,bid:Int,bname:String,rel:String)

object rdd2DF {
def main(args: Array[String]) ={
//屏蔽日志
Logger.getLogger("org.apache.spark").setLevel(Level.WARN)
Logger.getLogger("org.eclipse.jetty.server").setLevel(Level.OFF)
//创建上下文
val sparkConf=new SparkConf().setAppName("rdd2DF").setMaster("local")
val sc=new SparkContext(sparkConf)
val sqlContext=new SQLContext(sc)

//2. 创建case class格式的RDD
val text = sc.textFile("e:\\text1.txt")
.map(_.split(" "))
.map(p=>person(p(0).toInt,p(1),p(2).toInt,p(3),p(4)))

//3. 转换RDD为DataFrame.
import sqlContext.implicits._
val textdf=text.toDF()

//4. 注册dataframe为临时表
textdf.registerTempTable("text")

//执行SQL
val query=sqlContext.sql("select * from text a")

query.show()

sc.stop()
}
}


4.输出

+---+-----+---+-------+------+
|aid|aname|bid| bname| rel|
+---+-----+---+-------+------+
| 1| tom| 2| jack|friend|
| 2| jack| 3| sala|friend|
| 3| sala| 1| tom|friend|
| 4| joy| 1| tom|friend|
| 1| tom| 4| joy|friend|
| 1| tom| 4| joy|friend|
| 2| jack| 5|Missing|friend|
+---+-----+---+-------+------+
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息