How Spark Streaming Starts and Executes, and Understanding DStream
2017-07-21 14:52
Contents:
Spark Streaming startup and execution steps
Understanding DStream and RDD
Sending data from Linux
Spark Streaming code
Results
————————————————————————————————————————–
Spark Streaming startup and execution steps:
1. Create a SparkConf and a StreamingContext.
2. Create a DStream receiver: val lines = ssc.socketTextStream("192.168.200.11", 9999)
3. lines is a DStream; each Duration (the batch interval) produces one RDD, generated in a continuous loop.
4. Define the output operation.
5. Start the StreamingContext with ssc.start().
6. Steps 3 and 4 repeat for every batch interval until the context is stopped.
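The loop in steps 3 through 6 can be illustrated with a plain-Python sketch (a conceptual simulation only, not the real Spark API): a continuous stream of records is cut into batches, and the same output function is applied once per batch, just as a DStream yields one RDD per Duration.

```python
def micro_batch(stream, batch_size):
    """Cut a continuous sequence of records into fixed-size batches.

    Each yielded batch plays the role of one RDD in a DStream. In real
    Spark Streaming the cut is by time (the batch interval), not by count.
    """
    batch = []
    for record in stream:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch          # one "RDD" per interval
            batch = []
    if batch:
        yield batch              # final partial batch

# Steps 3-4 repeat for every batch until the stream ends (step 6).
stream = ["1111", "2222", "3333", "4444", "5555"]
batches = list(micro_batch(stream, batch_size=2))
print(batches)  # [['1111', '2222'], ['3333', '4444'], ['5555']]
```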
Understanding DStream and RDD
One data source corresponds to one DStream (lines in this example), and each Duration produces one RDD (10 seconds in this example).
In the code below, these lines appear only once:
println("===========Project Start=================")
println("===========Project End=================")
while the following appears once every 10 seconds:
println("===========RDD Start=================")
rdd.foreach(x => println(x))
println("===========RDD End=================")
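Why do the Project lines print only once while the RDD lines print per batch? foreachRDD merely registers a function to run later on each batch's RDD; the surrounding println calls execute immediately while main() is setting the job up. A minimal Python analogy (hypothetical names, not the Spark API):

```python
registered = []

def foreach_rdd(fn):
    """Like DStream.foreachRDD: store fn; the scheduler runs it once per batch."""
    registered.append(fn)

out = []
out.append("Project Start")          # runs once, at setup time
foreach_rdd(lambda rdd: out.append(f"RDD batch {rdd}"))
out.append("Project End")            # also runs once, before any batch arrives

# After ssc.start(), the scheduler invokes the registered function per interval:
for batch in (["1111"], ["2222"], ["3333"]):
    for fn in registered:
        fn(batch)

print(out)
# ['Project Start', 'Project End', "RDD batch ['1111']",
#  "RDD batch ['2222']", "RDD batch ['3333']"]
```

This mirrors the log output below: Project Start/End appear back-to-back at startup, and only afterwards does RDD Start/End repeat for each batch.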
Sending data from Linux
[root@hadoop11 ~]# nc -lk 9999
1111111111111111111
2222222222222222222
333333333333333333
444444444444444444
55555555555555555555
66666666666666666666
7777777777777777777777777
8888888888888888888888888
9999999999999999999999
Spark Streaming code
package com.donews.spark2druid

import org.apache.spark._
import org.apache.spark.streaming._

/**
  * Created by silentwolf on 2017/4/12.
  */
object SparkStreamTest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
    val ssc = new StreamingContext(conf, Seconds(10))
    val lines = ssc.socketTextStream("192.168.200.11", 9999)
    println("===========Project Start=================")
    lines.foreachRDD(rdd => {
      println("===========RDD Start=================")
      rdd.foreach(x => println(x))
      println("===========RDD End=================")
    })
    println("===========Project End=================")
    ssc.start()             // Start the computation
    ssc.awaitTermination()  // Wait for the computation to terminate
  }
}
Results (log trimmed to the key lines):
17/04/13 13:41:16 INFO Remoting: Starting remoting
17/04/13 13:41:17 INFO SparkUI: Started SparkUI at http://192.168.6.231:4040
17/04/13 13:41:17 INFO Executor: Starting executor ID driver on host localhost
17/04/13 13:41:17 INFO BlockManagerMaster: Registered BlockManager
===========Project Start=================
===========Project End=================
17/04/13 13:41:18 INFO ReceiverTracker: Starting 1 receivers
17/04/13 13:41:18 INFO SocketInputDStream: Slide time = 10000 ms
17/04/13 13:41:18 INFO JobScheduler: Started JobScheduler
17/04/13 13:41:18 INFO StreamingContext: StreamingContext started
17/04/13 13:41:18 INFO SocketReceiver: Connecting to 192.168.200.11:9999
17/04/13 13:41:18 INFO SocketReceiver: Connected to 192.168.200.11:9999
...
17/04/13 13:41:20 INFO JobScheduler: Added jobs for time 1492062080000 ms
===========RDD Start=================
17/04/13 13:41:20 INFO JobScheduler: Starting job streaming job 1492062080000 ms.0 from job set of time 1492062080000 ms
1111111111111111111
2222222222222222222
===========RDD End=================
17/04/13 13:41:20 INFO JobScheduler: Finished job streaming job 1492062080000 ms.0 from job set of time 1492062080000 ms
...
17/04/13 13:41:30 INFO JobScheduler: Added jobs for time 1492062090000 ms
===========RDD Start=================
17/04/13 13:41:30 INFO JobScheduler: Starting job streaming job 1492062090000 ms.0 from job set of time 1492062090000 ms
333333333333333333
444444444444444444
55555555555555555555
66666666666666666666
7777777777777777777777777
===========RDD End=================
17/04/13 13:41:30 INFO JobScheduler: Finished job streaming job 1492062090000 ms.0 from job set of time 1492062090000 ms
...
17/04/13 13:41:40 INFO JobScheduler: Added jobs for time 1492062100000 ms
===========RDD Start=================
17/04/13 13:41:40 INFO JobScheduler: Starting job streaming job 1492062100000 ms.0 from job set of time 1492062100000 ms
8888888888888888888888888
9999999999999999999999
===========RDD End=================
17/04/13 13:41:40 INFO JobScheduler: Finished job streaming job 1492062100000 ms.0 from job set of time 1492062100000 ms
...