您的位置:首页 > 运维架构

Spark Streaming 监控HDFS目录

2017-12-19 12:12 274 查看
package org.lm.spark.streaming

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingWordCountOnLine {
def main(args: Array[String]): Unit = {
val conf=new SparkConf().setAppName("Streaming Word Count OnLine").setMaster("spark://192.168.189.128:7077")
val ssc=new StreamingContext(conf,Seconds(10))
val lines=ssc.textFileStream("hdfs://192.168.189.128:9000/user/StreamingText")
val words=lines.flatMap(_.split(" "))
val pairs=words.map(word=>(word,1))
val wordcounts=pairs.reduceByKey(_+_)
wordcounts.print()
ssc.start()
ssc.awaitTermination()
}

}
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: