Spark Study Notes --- Processing a TCP Data Source with Spark Streaming
2017-03-30 15:58
package demo1

import org.apache.spark._
import org.apache.spark.streaming._
// import org.apache.spark.streaming.StreamingContext._  (not needed since Spark 1.3)

/*
 * Using this context, we can create a DStream that represents streaming data
 * from a TCP source, specified as a hostname (e.g. localhost) and a port. The
 * `lines` DStream represents the stream of data received from the data server.
 * Each record in this DStream is a line of text.
 *
 * Next, we split the lines on space characters into words. flatMap is a
 * one-to-many DStream operation that creates a new DStream by generating
 * multiple new records from each record in the source DStream. Here each line
 * is split into multiple words, and the stream of words is represented as the
 * `words` DStream. Finally, we count these words.
 */
object SparkStreaming {
  def main(args: Array[String]): Unit = {
    // Create a local StreamingContext with two worker threads and a batch
    // interval of 1 second. The master needs at least 2 cores to avoid
    // starvation: one thread runs the receiver, the other processes the data.
    val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Create a DStream that connects to hostname:port, e.g. localhost:9999
    val lines = ssc.socketTextStream("localhost", 9999)

    // Split each line into words
    val words = lines.flatMap(_.split(" "))

    // Count each word within each batch
    val pairs = words.map(word => (word, 1))
    val wordCounts = pairs.reduceByKey(_ + _)

    // Print the first ten elements of each RDD generated in this DStream
    wordCounts.print()

    // Start the computation
    ssc.start()

    // Wait for the computation to terminate
    ssc.awaitTermination()
  }
}
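To see what each one-second batch computes, the flatMap, map, reduceByKey chain can be mimicked on an ordinary in-memory collection, with no Spark dependency. This is only a rough sketch: the object name `BatchWordCountSketch`, the helper `countWords`, and the sample lines are made up for illustration, and `groupBy` plus a sum stands in for `reduceByKey`, which plain Scala collections do not have.

```scala
object BatchWordCountSketch {
  // Hypothetical helper (not part of Spark): counts the words in one batch
  // of lines, following the same steps as the streaming job.
  def countWords(batch: Seq[String]): Map[String, Int] =
    batch
      .flatMap(_.split(" "))        // one line -> many words
      .map(word => (word, 1))       // word -> (word, 1)
      .groupBy(_._1)                // stand-in for reduceByKey: group by word...
      .map { case (word, ones) => (word, ones.map(_._2).sum) } // ...and sum the 1s

  def main(args: Array[String]): Unit = {
    // Assumed sample input: the lines received during one batch interval.
    val batch = Seq("hello spark", "hello streaming")
    // Sort by word so the printed order is stable.
    countWords(batch).toSeq.sortBy(_._1).foreach(println)
    // prints:
    // (hello,2)
    // (spark,1)
    // (streaming,1)
  }
}
```

In the streaming job the same counting is applied independently to every batch, so counts are not carried over from one second to the next. To feed the real job some input, one common approach is to run Netcat as a simple data server in another terminal (`nc -lk 9999`) and type lines into it.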