DStream操作实战:1.SparkStreaming接受socket数据,实现单词计数WordCount
2018-03-11 22:06
591 查看
package cn.testdemo.dstream.socket
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.{DStream, ReceiverInputDStream}
//todo:利用sparkStreaming接受socket数据实现单词计数
object SparkStreamingSocket {
def main(args: Array[String]): Unit = {
//1、创建sparkConf
val sparkConf: SparkConf = new SparkConf().setAppName("SparkStreamingSocket").setMaster("local[2]")
//2、创建sparkContext
val sc = new SparkContext(sparkConf)
sc.setLogLevel("WARN")
//3、创建streamingContext,它需要2个参数,一个sparkContext,另一个是当前批次的时间间隔
val ssc = new StreamingContext(sc,Seconds(5))
//4、通过streamingcontext获取socket数据
val stream: ReceiverInputDStream[String] = ssc.socketTextStream("192.168.216.121",9999)
//5、切分每一行
val wordsDStream: DStream[String] = stream.flatMap(_.split(" "))
//6、每个单词计为1
val wordAndOneDStream: DStream[(String, Int)] = wordsDStream.map((_,1))
//7、相同单词出现的次数累加
val resultDstream: DStream[(String, Int)] = wordAndOneDStream.reduceByKey(_+_)
//8、打印输出
resultDstream.print()
//开启计算
ssc.start()
//等待退出
ssc.awaitTermination()
}
}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.{DStream, ReceiverInputDStream}
//todo:利用sparkStreaming接受socket数据实现单词计数
object SparkStreamingSocket {
def main(args: Array[String]): Unit = {
//1、创建sparkConf
val sparkConf: SparkConf = new SparkConf().setAppName("SparkStreamingSocket").setMaster("local[2]")
//2、创建sparkContext
val sc = new SparkContext(sparkConf)
sc.setLogLevel("WARN")
//3、创建streamingContext,它需要2个参数,一个sparkContext,另一个是当前批次的时间间隔
val ssc = new StreamingContext(sc,Seconds(5))
//4、通过streamingcontext获取socket数据
val stream: ReceiverInputDStream[String] = ssc.socketTextStream("192.168.216.121",9999)
//5、切分每一行
val wordsDStream: DStream[String] = stream.flatMap(_.split(" "))
//6、每个单词计为1
val wordAndOneDStream: DStream[(String, Int)] = wordsDStream.map((_,1))
//7、相同单词出现的次数累加
val resultDstream: DStream[(String, Int)] = wordAndOneDStream.reduceByKey(_+_)
//8、打印输出
resultDstream.print()
//开启计算
ssc.start()
//等待退出
ssc.awaitTermination()
}
}
相关文章推荐
- DStream操作实战:2.SparkStreaming接受socket数据,实现单词计数累加
- Spark Streaming实现实时WordCount,DStream的使用,updateStateByKey(func)实现累计计算单词出现频率
- DStream操作实战:3.SparkStreaming开窗函数reduceByKeyAndWindow,实现单词计数
- 在idea上用SparkStreaming实现从远程socket读取数据并完成Wordcount
- spark-streaming 编程(二) word count单词计数统计
- Spark实现WordCount单词计数
- SparkStreaming案例:NetworkWordCount--ReceiverInputDstream的compute方法如何取得Socket预先存放在BlockManager中的数据
- 大数据IMF传奇行动绝密课程第103课:动手实战Spark Streaming Broadcast、Accumulator实现在线黑名单过滤和计数
- SparkStreaming案例:NetworkWordCount--ReceiverSupervisorImpl.onStart()如何将Reciver数据写到BlockManager中
- spark streaming 接收 kafka 数据java代码WordCount示例
- Java实现Spark词配对Wordcount计数
- Spark Java 单词计数(WordCount)
- java实现kafka整合spark streaming完成wordCount,updateStateByKey完成实时状态更新
- spark streaming 的wordcount程序,从hdfs上读取文件中的内容并计数
- SparkStreaming实现HDFS的wordCount(java版)
- java8实现spark streaming的wordcount
- 大数据IMF传奇行动绝密课程第94课:SparkStreaming实现广告计费系统中在线黑名单过滤实战
- kafka+sparkstreaming实现每批次的wordcount统计模版
- Spark Java 单词计数(WordCount)
- spark wordCount单词计数及原理解析