Integrating Flume with HDFS
2016-06-08 16:10
1. Flume configuration file: flume_hdfs.conf
# Channel: an in-memory staging area for events in transit
agent1.channels.ch1.type = memory
# Source: tail the application log via an exec command
agent1.sources.tail.type = exec
agent1.sources.tail.channels = ch1
agent1.sources.tail.command = tail -f /usr/local/service/resin8082/log/jvm-default.log
agent1.sources.tail.fileHeader = false
# Sink: write events to HDFS as plain text
agent1.sinks.k1.type = hdfs
agent1.sinks.k1.channel = ch1
agent1.sinks.k1.hdfs.path = hdfs://ip:9000/testlog
agent1.sinks.k1.hdfs.filePrefix = events-
agent1.sinks.k1.hdfs.fileType = DataStream
agent1.sinks.k1.hdfs.writeFormat = Text
# Roll policy: never roll on size, roll every 600000 events or 600 seconds
agent1.sinks.k1.hdfs.rollSize = 0
agent1.sinks.k1.hdfs.rollCount = 600000
agent1.sinks.k1.hdfs.rollInterval = 600
# Declare the agent's components
agent1.sources = tail
agent1.channels = ch1
agent1.sinks = k1
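One thing to watch for in files like this: Flume agent configs follow Java-properties semantics, so assigning the same key twice (easy to do with the three roll settings) produces no warning and the last value silently wins. A quick duplicate-key check can be sketched in shell (the helper name here is made up, not part of Flume):

```shell
# find_duplicate_keys: print each key that is assigned more than once
# in a properties-style config file (comment lines are skipped).
find_duplicate_keys() {
  awk -F= '
    /^[[:space:]]*#/ { next }          # skip comment lines
    NF >= 2 {
      key = $1
      gsub(/[[:space:]]/, "", key)     # trim whitespace around the key
      if (seen[key]++) print key       # print on every repeat assignment
    }' "$1"
}
```

For example, `find_duplicate_keys flume_hdfs.conf` prints nothing when every property is set exactly once.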
Start the agent:
/opt/apache-flume-1.6.0-bin/bin/flume-ng agent -c conf -f /opt/apache-flume-1.6.0-bin/conf/flume_hdfs.conf -n agent1 -Dflume.root.logger=INFO,console
Note: if you run Flume straight out of the downloaded tarball, the agent may start and then show no activity at all. If Hadoop itself is healthy and you can upload files to HDFS from the command line, the problem must be on the Flume side.
How to pin down the problem:
1. Edit conf/log4j.properties, lower the log level from INFO to DEBUG, and start the agent again; the real error now shows up:
2016-06-08 15:33:12,015 (conf-file-poller-0) [ERROR - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:145)] Failed to start agent because dependencies were not found
in classpath. Error follows.
java.lang.NoClassDefFoundError: org/apache/hadoop/io/SequenceFile$CompressionType
at org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:239)
This means some Hadoop jars are missing from Flume's classpath. From the Hadoop distribution, copy hadoop-common-2.6.4.jar from hadoop-2.6.4/share/hadoop/common, plus the dependency jars under its lib/ subdirectory, into Flume's lib directory, then restart and check the logs.
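The copy step can be scripted. A minimal sketch, assuming hypothetical install locations that you should adjust to your own layout:

```shell
# stage_hadoop_jars: copy a Hadoop module's main jar(s) plus the
# dependency jars under its lib/ subdirectory into Flume's lib directory.
# Paths are passed in because install locations vary
# (e.g. /opt/hadoop-2.6.4, /opt/apache-flume-1.6.0-bin).
stage_hadoop_jars() {
  local module_dir="$1"   # e.g. .../share/hadoop/common
  local flume_lib="$2"    # e.g. .../apache-flume-1.6.0-bin/lib
  cp "$module_dir"/hadoop-*.jar "$flume_lib"/ &&
  cp "$module_dir"/lib/*.jar "$flume_lib"/
}

# Example (hypothetical paths):
# stage_hadoop_jars /opt/hadoop-2.6.4/share/hadoop/common /opt/apache-flume-1.6.0-bin/lib
```

Pointing the same helper at share/hadoop/hdfs also covers the hadoop-hdfs jars needed in step 2 below.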
2. After restarting, if you then see:
2016-06-08 16:01:10,477 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:455)] HDFS IO error
java.io.IOException: No FileSystem for scheme: hdfs
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
this again points to missing dependency jars. Going by the failing package, org.apache.hadoop.fs, copy hadoop-hdfs-2.6.4.jar from hadoop-2.6.4/share/hadoop/hdfs, along with its dependencies, into Flume's lib directory, then restart Flume and check again.
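Before restarting, it is worth confirming that both jars actually landed in Flume's lib directory. A minimal check, assuming the 2.6.4 jar names (adjust for other Hadoop versions):

```shell
# check_flume_deps: report whether the Hadoop jars the HDFS sink needs
# are present in a Flume lib directory. Jar names assume Hadoop 2.6.4.
check_flume_deps() {
  local flume_lib="$1" jar
  for jar in hadoop-common-2.6.4.jar hadoop-hdfs-2.6.4.jar; do
    if [ -f "$flume_lib/$jar" ]; then
      echo "present: $jar"
    else
      echo "missing: $jar"
    fi
  done
}

# Example (hypothetical path):
# check_flume_deps /opt/apache-flume-1.6.0-bin/lib
```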
Once no more errors appear, go look at the files in HDFS; if the events are there, the setup is complete.