Reading Large Files in Python

2016-01-07 15:51, 405 views

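I have recently been using Python to process log analysis data, and some of the files are large, several GB each. Using linecache, or opening the file and calling readlines(), loads the entire file into memory, which can easily use up too much memory and cause the program to stop running.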

After some Googling, I found the following ways to read large files in Python:

I personally recommend the first one; in my own tests the program ran smoothly with it.

1. Reading large files with with

Reading a file inside a with statement is a very Pythonic approach. For example:

with open(filepath) as f:
    for line in f:
        pass  # do something with line

I found this method on Stack Overflow, where the answerer explains it as follows:

The with statement handles opening and closing the file, including if an exception is raised in the inner block. The for line in f treats the file object f as an iterable, which automatically uses buffered IO and memory management so you don't have to worry about large files.

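In short, with closes the file for you even if the block raises an exception, and iterating over f reads the file through a buffer one line at a time, so you do not need to worry about how large the file is.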
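As a concrete illustration of this pattern, here is a minimal sketch that counts matching lines in a log file while keeping only one line in memory at a time. The function name count_matching_lines, the generated sample file, and the "ERROR" keyword are all made up for this example and are not part of the original answer.

```python
import os
import tempfile


def count_matching_lines(filepath, keyword):
    """Count lines containing keyword, reading the file one line at a time."""
    count = 0
    # The with block closes the file even if an exception is raised inside it.
    with open(filepath, encoding="utf-8") as f:
        # Iterating over f yields one line at a time instead of loading the whole file.
        for line in f:
            if keyword in line:
                count += 1
    return count


# Build a small sample log so the sketch runs on its own (hypothetical data).
with tempfile.NamedTemporaryFile(
    "w", suffix=".log", delete=False, encoding="utf-8"
) as tmp:
    for i in range(1000):
        level = "ERROR" if i % 10 == 0 else "INFO"
        tmp.write(f"{level} request {i}\n")
    sample_path = tmp.name

error_count = count_matching_lines(sample_path, "ERROR")
print(error_count)

# Remove the temporary sample file.
os.remove(sample_path)
```

The same loop works unchanged on a multi-GB file, because the loop body only ever sees the current line.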

2. Using the fileinput module

Example code:

import fileinput

for line in fileinput.input(['sum.log']):
    print(line, end='')  # each line already ends with a newline

The first method is more Pythonic, needs no import, and also takes care of closing the file when an exception occurs, so it is the one I recommend.
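If you do want fileinput, for example to treat several log files as a single stream, in Python 3 it can also be used as a context manager so that the files are closed for you. The following is a minimal sketch; the file names a.log and b.log and their contents are invented for illustration.

```python
import fileinput
import os
import tempfile

# Create two small sample logs (hypothetical names and contents).
tmp_dir = tempfile.mkdtemp()
paths = []
for name, lines in [("a.log", ["a1\n", "a2\n"]), ("b.log", ["b1\n"])]:
    path = os.path.join(tmp_dir, name)
    with open(path, "w", encoding="utf-8") as out:
        out.writelines(lines)
    paths.append(path)

seen = []
# fileinput.input() chains the files and yields their lines lazily.
with fileinput.input(files=paths) as stream:
    for line in stream:
        # filename() and filelineno() describe the line that was just read.
        seen.append(
            (
                os.path.basename(stream.filename()),
                stream.filelineno(),
                line.rstrip("\n"),
            )
        )

for entry in seen:
    print(entry)

# Clean up the temporary files and directory.
for path in paths:
    os.remove(path)
os.rmdir(tmp_dir)
```

For a single file, the plain with open(...) loop from the first method remains the simpler choice.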

When a file is read line by line, every line comes with a trailing newline character. Python provides a string method for removing it: strip.

As an English word, strip has these meanings:

vt. to deprive; to peel off; to take off clothes
n. band; strip; striptease
vi. to undress

line = line.strip('\n')
Like this, strip the newline off the line and you are done.
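Note that strip('\n') removes newline characters from both ends of the string, rstrip('\n') only from the end, and strip() with no argument removes all leading and trailing whitespace. A small sketch of the difference, using a made-up sample line:

```python
# A hypothetical raw line with a leading newline and a Windows-style line ending.
raw = "\n  GET /index.html 200  \r\n"

both_ends = raw.strip("\n")   # newlines removed from both ends; spaces and \r kept
tail_only = raw.rstrip("\n")  # only the trailing newline removed
all_space = raw.strip()       # all leading and trailing whitespace removed

print(repr(both_ends))
print(repr(tail_only))
print(repr(all_space))
```

For log lines where leading whitespace matters, rstrip('\n') is usually the safer choice.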
