Python + Hadoop Streaming MapReduce (how to pass parameters to the map and reduce scripts)
2017-09-10 11:25
Setting parameters

Hadoop Streaming's -cmdenv option exports name=value pairs as environment variables inside each task, so the map and reduce scripts can read them with os.environ.get(). The wrapper script below takes the input path and three filter values as positional arguments and forwards them to the job:
#!/bin/bash
# Usage: sh run.sh <input_path> <card_start> <card_last> <trans_at>
# Remove the previous output directory first, otherwise the job fails.
hadoop fs -rmr trans_record/result
hadoop jar ./hadoop-streaming-2.0.0-mr1-cdh4.7.0.jar \
    -input "$1" \
    -output trans_record/result \
    -file map.py \
    -file reduce.py \
    -mapper "python map.py" \
    -reducer "python reduce.py" \
    -jobconf mapred.reduce.tasks=1 \
    -jobconf mapred.job.name="qianjc_trans_record" \
    -cmdenv "card_start=$2" \
    -cmdenv "card_last=$3" \
    -cmdenv "trans_at=$4"
#!/usr/bin/env python
# vim: set fileencoding=utf-8
import sys
import os

def main():
    # Read the parameters passed in through -cmdenv from the environment.
    card_start = os.environ.get('card_start')
    card_last = os.environ.get('card_last')
    trans_at = float(os.environ.get('trans_at'))
    for line in sys.stdin:
        detail = line.strip().split(',')
        card = detail[0]           # column 0: card number
        money = float(detail[17])  # column 17: transaction amount
        if trans_at == money and card_start == card[1:7] and card_last == card[-4:]:
            print('%s\t%s' % (line.strip(), detail[1]))

if __name__ == '__main__':
    main()
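Because the -cmdenv parameters arrive as ordinary environment variables, the mapper's filtering logic can be exercised locally without a cluster. A minimal sketch, where the card number, column layout, and amounts are made-up sample data for illustration:

```python
import os

# Simulate what Hadoop Streaming does when launching the mapper:
# set the -cmdenv variables before the script logic runs.
os.environ['card_start'] = '622848'
os.environ['card_last'] = '1234'
os.environ['trans_at'] = '100.0'

# One fake CSV record: column 0 is the card number, column 17 the amount.
fields = ['6622848001234'] + ['x'] * 16 + ['100.0']
line = ','.join(fields)

# Same filter as map.py.
card_start = os.environ.get('card_start')
card_last = os.environ.get('card_last')
trans_at = float(os.environ.get('trans_at'))

detail = line.strip().split(',')
card = detail[0]
money = float(detail[17])
matched = (trans_at == money
           and card_start == card[1:7]
           and card_last == card[-4:])
print(matched)  # prints: True
```

The same check can be run against real data by piping a file into the script, e.g. card_start=622848 card_last=1234 trans_at=100.0 python map.py < sample.csv, which mirrors how the task will behave on the cluster.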