Example: Writing Data to HDFS with Flume
2013-01-09 00:00
Goal:
Use Flume to poll a folder on the local file system and write its contents to HDFS.
Version Information:
hadoop-0.22.0
apache-flume-1.3.1
Flume Configuration:
Edit file flume-env.sh under /$FLUME_HOME$/conf:
export JAVA_HOME=your Java home
export FLUME_CLASSPATH=your Flume home
export HADOOP_CLASSPATH=your Hadoop home
Edit file flume-conf.properties under /$FLUME_HOME$/conf:
# Configure the agent
agent.sources = spooldirSource
agent.channels = memoryChannel
agent.sinks = hdfsSink

# Configure the source
agent.sources.spooldirSource.type = spooldir
agent.sources.spooldirSource.spoolDir = /tmp/flume/
agent.sources.spooldirSource.channels = memoryChannel

# Configure the sink
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.hdfs.path = hdfs://masternode:9000/flume/events
agent.sinks.hdfsSink.hdfs.filePrefix = events-
agent.sinks.hdfsSink.channel = memoryChannel

# Configure the channel
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 100
Copy Hadoop Jars to Flume lib directory:
Copy hadoop-hdfs-0.22.0.jar and hadoop-common-0.22.0.jar to /$FLUME_HOME$/lib.
Start Flume Agent:
./bin/flume-ng agent -n agent -c conf -f conf/flume-conf.properties
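If the agent fails to start, checking that the files from the earlier steps are in place can narrow the problem down. The following is a minimal sketch of such a check. The function name flume_preflight is hypothetical and not part of Flume; it looks only for the two configuration files and the two Hadoop jars named above.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of Flume): confirm that the files
# set up in the steps above exist under the given Flume home directory.
# Usage: flume_preflight <flume_home>
flume_preflight() {
    home=$1
    missing=0
    for pattern in \
        "conf/flume-env.sh" \
        "conf/flume-conf.properties" \
        "lib/hadoop-hdfs-*.jar" \
        "lib/hadoop-common-*.jar"
    do
        found=0
        # $pattern is left unquoted so the shell expands the jar wildcards.
        for f in "$home"/$pattern; do
            if [ -e "$f" ]; then
                found=1
            fi
        done
        if [ "$found" -eq 0 ]; then
            echo "missing: $home/$pattern" >&2
            missing=1
        fi
    done
    # 0 if everything was found, 1 if at least one file is missing.
    return "$missing"
}
```

For example, flume_preflight "$FLUME_HOME" prints one "missing:" line per absent file and returns a non-zero status if anything is missing.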
Write File:
echo "Hello World">>/tmp/flume/test
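Note that the spooling directory source expects each file to be complete and unchanged once it appears in the spool directory, so appending to the same file repeatedly may cause the source to report an error. One common workaround is to write the file elsewhere and then move it in. The sketch below illustrates this. The function name spool_drop and the staging directory are assumptions for illustration, and the move is an atomic rename only when both directories are on the same file system.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flume): write a file in a staging
# directory, then move it into the spool directory in one step.
# Usage: spool_drop <spool_dir> <staging_dir> <file_name> <content>
spool_drop() {
    spool_dir=$1
    staging_dir=$2
    name=$3
    content=$4

    mkdir -p "$spool_dir" "$staging_dir" || return 1
    # Write the complete file outside the spool directory first.
    printf '%s\n' "$content" > "$staging_dir/$name" || return 1
    # On the same file system, mv is a rename, so the source should never
    # see a partially written file.
    mv "$staging_dir/$name" "$spool_dir/$name"
}

# Example call (the staging path and file name are assumptions):
# spool_drop /tmp/flume /tmp/flume-staging "test-$(date +%s).log" "Hello World"
```

Using a fresh file name for every drop, as in the example call, avoids reusing a name the source has already processed.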
View Logs:
Under /$FLUME_HOME$/logs