Demo of Python "Map Reduce Filter"
2015-07-08 19:03
Here I share a demo of functional programming in Python with map, reduce and filter, written by me (Xiaoqiang).
Assume there are two DB tables: `file_logs`, and `expanded_attrs`, which stores additional columns that extend `file_logs`. For demonstration, assume there is more than one file log for the same (platform_id, client_id) tuple. We need to figure out which one was updated most recently for the tuple (platform_id=1, client_id=1).
Here is the approach:
1. Filter the original file logs down to those for the tuple (platform_id=1, client_id=1).
2. Merge the attributes from the expanded table into those file logs in memory, similar to a join.
3. Reduce the merged file logs to find the one that was updated most recently.
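Step 2 is the least obvious of the three: each row of `expanded_attrs` holds a single attribute name/value pair, so the rows belonging to one file log have to be folded into one dict before they can be merged. A minimal sketch of that fold, written so that it also runs on Python 3 (where `reduce` lives in `functools`). The helper name `fold_attrs` and the two sample rows are only illustrative and are not part of the demo:

```python
from functools import reduce  # builtin on Python 2, must be imported on Python 3


def fold_attrs(rows):
    """Fold attribute rows into one {attr_name: attr_value} dict."""
    # dict(acc, **{k: v}) builds a new dict at each step instead of mutating acc.
    return reduce(
        lambda acc, row: dict(acc, **{row['attr_name']: row['attr_value']}),
        rows,
        {},
    )


# Hypothetical sample rows for a single file log.
rows = [
    {'file_log_id': '1', 'attr_name': 'CLICK', 'attr_value': '100'},
    {'file_log_id': '1', 'attr_name': 'last_updated', 'attr_value': '2014-07-14'},
]
folded = fold_attrs(rows)
print(folded)
```

The resulting dict can then be merged into the matching file log with `update`.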
The demo code is shown below (for Python 2.6+ or 2.7+):
BTW, you are welcome to point out a more efficient approach or any issues you find. Thanks. :)
#!/usr/bin/env python
"""
Requirement:
    platform_id=1 and client_id=1 are known (as pid and cid).
    file_logs and expanded_attrs are arrays of objects;
    expanded_attrs is a table of columns that extends the table file_logs.
    As file_logs contains more than one entry for pid=1, cid=1,
    we need to find out which one was updated most recently.
"""

file_logs = [
    {
        'file_log_id': '1',
        'platform_id': '1',
        'client_id': '1',
        'file': 'path/to/platform/client/j-1/stdout'
    },
    {
        'file_log_id': '2',
        'platform_id': '1',
        'client_id': '1',
        'file': 'path/to/platform/client/j-2/stdout'
    },
    {
        'file_log_id': '3',
        'platform_id': '2',
        'client_id': '3',
        'file': 'path/to/platform/client/j-3/stdout'
    },
]

expanded_attrs = [
    {'file_log_id': '1', 'attr_name': 'CLICK', 'attr_value': '100'},
    {'file_log_id': '1', 'attr_name': 'SUPPRESSION', 'attr_value': '100'},
    {'file_log_id': '1', 'attr_name': 'last_updated', 'attr_value': '2014-07-14'},
    {'file_log_id': '2', 'attr_name': 'CLICK', 'attr_value': '200'},
    {'file_log_id': '2', 'attr_name': 'SUPPRESSION', 'attr_value': '200'},
    {'file_log_id': '2', 'attr_name': 'last_updated', 'attr_value': '2014-07-15'},
    {'file_log_id': '3', 'attr_name': 'CLICK', 'attr_value': '300'},
    {'file_log_id': '3', 'attr_name': 'SUPPRESSION', 'attr_value': '300'},
    {'file_log_id': '3', 'attr_name': 'last_updated', 'attr_value': '2014-07-15'},
]

platform_id = '1'
client_id = '1'

# Step 1: keep only the file logs for the requested (platform_id, client_id).
target_scope_filelogs = filter(
    lambda x: x['platform_id'] == platform_id and x['client_id'] == client_id,
    file_logs)

# Step 2: fold each log's attribute rows into a dict and merge it into the log.
# dict.update() returns None, so "is None and xx" hands the accumulator back.
map(
    lambda x: x.update(reduce(
        lambda xx, xy: xx.update({xy['attr_name']: xy['attr_value']}) is None and xx,
        filter(lambda xx: xx['file_log_id'] == x['file_log_id'], expanded_attrs),
        dict()
    )),
    target_scope_filelogs
)

# Step 3: reduce to the log with the greatest last_updated value.
print reduce(lambda x, y: x['last_updated'] > y['last_updated'] and x or y,
             target_scope_filelogs)
#> {'file_log_id': '2', 'platform_id': '1', 'last_updated': '2014-07-15', 'SUPPRESSION': '200', 'file': 'path/to/platform/client/j-2/stdout', 'client_id': '1', 'CLICK': '200'}
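The demo relies on Python 2 behavior: `map` runs eagerly, `reduce` is a builtin, and `print` is a statement. On Python 3, `map` is lazy, so the `update` side effects inside it would never run unless the result is consumed. One possible alternative, offered as a sketch rather than a replacement for the demo, groups the attributes by `file_log_id` once, builds merged copies, and picks the latest with `max`. The function name `latest_file_log` is made up for illustration, the data is a trimmed copy of the demo's, and the sketch assumes every matching log has a `last_updated` attribute:

```python
from collections import defaultdict


def latest_file_log(file_logs, expanded_attrs, platform_id, client_id):
    """Return the most recently updated merged log for (platform_id, client_id)."""
    # Group attributes by file_log_id once, instead of re-filtering per log.
    attrs_by_id = defaultdict(dict)
    for attr in expanded_attrs:
        attrs_by_id[attr['file_log_id']][attr['attr_name']] = attr['attr_value']

    # Filter and merge into new dicts, leaving the input rows untouched.
    merged = [
        dict(log, **attrs_by_id[log['file_log_id']])
        for log in file_logs
        if log['platform_id'] == platform_id and log['client_id'] == client_id
    ]
    if not merged:
        return None
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    return max(merged, key=lambda log: log['last_updated'])


# Trimmed copy of the demo data (only the fields this sketch needs).
file_logs = [
    {'file_log_id': '1', 'platform_id': '1', 'client_id': '1'},
    {'file_log_id': '2', 'platform_id': '1', 'client_id': '1'},
    {'file_log_id': '3', 'platform_id': '2', 'client_id': '3'},
]
expanded_attrs = [
    {'file_log_id': '1', 'attr_name': 'last_updated', 'attr_value': '2014-07-14'},
    {'file_log_id': '2', 'attr_name': 'last_updated', 'attr_value': '2014-07-15'},
    {'file_log_id': '3', 'attr_name': 'last_updated', 'attr_value': '2014-07-15'},
]

latest = latest_file_log(file_logs, expanded_attrs, '1', '1')
print(latest['file_log_id'], latest['last_updated'])
```

Unlike the demo, this version does not mutate `file_logs`, and it scans `expanded_attrs` once instead of once per matching log. On a tie, `max` returns the first of the equal entries, whereas the `reduce` in the demo keeps the later one.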