
Python Full Stack (Part 1) Basics, 16: File Operations

2020-01-11 13:25

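Contents

I. File Overview and Opening Files
1. File overview
2. Opening a file

II. Closing Files

III. Reading Larger Files

IV. Other Ways to Read

1. The readline() method
2. The readlines() method

V. Writing to Files

VI. Reading and Writing Binary Files

VII. File Read Position

VIII. Other File Operations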

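I. File Overview and Opening Files

1. File overview

Reading and writing files (File) is a form of I/O (input/output), the term commonly used in Java.
With Python you can create, delete, modify, and read all kinds of files on the computer.
Working with a file takes three steps:
Open the file;
Perform operations on it, mainly reading and writing;
Close the file.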
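As a rough sketch of these three steps, the snippet below writes a small file and reads it back. The scratch path and the file name steps_demo.txt are assumptions made for illustration, so that no real file is touched.

```python
import os
import tempfile

# Hypothetical scratch file so the example does not touch real data.
path = os.path.join(tempfile.mkdtemp(), 'steps_demo.txt')

# Step 1: open the file (here for writing); step 2: operate; step 3: close.
f = open(path, 'w', encoding='utf-8')
f.write('hello file')
f.close()

# The same three steps again, this time for reading.
f = open(path, encoding='utf-8')
text = f.read()
f.close()

print(text)
```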

2. Opening a file

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
Parameters:
file: the name (path) of the file to open.

file_name = 'test.txt'
open(file_name)

No output (and no exception) means the call succeeded.
Return value: open() returns an object that represents the opened file.

file_name = 'test.txt'
f_obj = open(file_name)
print(f_obj)

Output:

<_io.TextIOWrapper name='test.txt' mode='r' encoding='cp936'>

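If the target file is in the current working directory (usually the directory the script is run from), the file name alone is enough to open it;
if it is somewhere else, pass a relative path that starts from the working directory, or an absolute path.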
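To illustrate how a bare file name is resolved, the sketch below turns a relative path into an absolute one using os.path. It only builds path strings, so the file does not need to exist.

```python
import os

file_name = 'test.txt'  # relative path, same name as in the examples above

# A relative path is resolved against the current working directory.
absolute = os.path.abspath(file_name)
expected = os.path.normpath(os.path.join(os.getcwd(), file_name))

print(absolute)
print(os.path.isabs(file_name), os.path.isabs(absolute))
```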

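II. Closing Files

The read() method reads the contents of the file and returns all of it as a single string;
the close() method closes the file.

file_name = 'test.txt'
f_obj = open(file_name)
content = f_obj.read()
print(content)
# Close the file
f_obj.close()

Output: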

import android.content.Context;
import android.graphics.Point;
import android.os.Build;
import android.view.WindowManager;

Reading from the file again after it has been closed raises an error:

file_name = 'test.txt'
f_obj = open(file_name)
content = f_obj.read()
print(content)
# Close the file
f_obj.close()
f_obj.read()

Output:

import android.content.Context;
import android.graphics.Point;
import android.os.Build;
import android.view.WindowManager;
Traceback (most recent call last):
  File "xxx.py", line 7, in <module>
    f_obj.read()
ValueError: I/O operation on closed file.

That is, a ValueError is raised.
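This shows that opening, using, and closing a file by hand is error-prone: the file can be used after it is closed, or never closed at all if an exception occurs first. The with...as... statement avoids these problems.
Inside a with statement you work with the file object directly, for example:

with open('test.txt') as f_obj:
    print(f_obj.read())

The result is the same as before, but the file is only open inside the with block: as soon as the block ends, even because of an exception, the file is closed automatically.
Exception handling can also be used to catch errors raised while working with the file.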

file_name = 'demo.txt'
try:
    with open(file_name) as file_obj:
        print(file_obj.read())
except FileNotFoundError:
    print(file_name, 'not exists')

Output:

demo.txt not exists

This gives a standard pattern for working with files:

try:
    with open(file_name) as file_obj:
        pass
except FileNotFoundError:
    pass
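To check that a with block really closes the file, one option is to inspect the file object's closed attribute afterwards. The sketch below does this for a normal exit and for an exit caused by an exception; the scratch file with_demo.txt and the simulated RuntimeError are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical scratch file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('data')

# Normal exit: the file is closed once the block ends.
with open(path, encoding='utf-8') as f_obj:
    f_obj.read()
closed_after_block = f_obj.closed

# Exit through an exception: the file is still closed.
try:
    with open(path, encoding='utf-8') as f_err:
        raise RuntimeError('simulated failure')
except RuntimeError:
    pass
closed_after_error = f_err.closed

print(closed_after_block, closed_after_error)
```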

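III. Reading Larger Files

open() opens a file;
read() reads its contents.
Files fall into two types:
Plain text files (text written with an encoding such as utf-8);
Binary files (images, audio, video, and so on).
By default open() opens a file in text mode, and the encoding parameter defaults to None, which means a platform-dependent default encoding is used (cp936 in the output shown earlier).
So when handling text files, and especially files containing Chinese, specify the encoding explicitly so the text is read correctly.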

file_name = 'test2.txt'
try:
    with open(file_name, encoding='utf-8') as file_obj:
        content = file_obj.read()
        print(content)
except FileNotFoundError:
    print(file_name, 'not exists')

Output:

暮晓春来迟
先于百花知
岁岁种桃树
开在断肠时
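The effect of a wrong encoding can be sketched as follows. The example writes Chinese text as UTF-8 and then reads it back twice, once with the matching encoding and once with ascii, which is chosen here only because it is guaranteed to fail on these bytes. The scratch file name is an assumption.

```python
import os
import tempfile

# Hypothetical scratch file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'encoding_demo.txt')

# Write Chinese text as UTF-8.
with open(path, 'w', encoding='utf-8') as f:
    f.write('暮晓春来迟')

# Reading with the matching encoding returns the original text.
with open(path, encoding='utf-8') as f:
    good = f.read()

# Reading with a mismatched encoding fails: ascii cannot decode any of
# the bytes that UTF-8 uses for Chinese characters.
try:
    with open(path, encoding='ascii') as f:
        f.read()
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

print(good, decode_failed)
```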

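Calling read() with no argument reads the whole file at once. When the file is large, its entire content is loaded into memory in one go, which can use a lot of memory or even exhaust it.
Use help() to inspect the method:

file_name = 'test2.txt'
try:
    with open(file_name, encoding='utf-8') as file_obj:
        help(file_obj.read)
except FileNotFoundError:
    print(file_name, 'not exists')

Output: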

Help on built-in function read:

read(size=-1, /) method of _io.TextIOWrapper instance
Read at most n characters from stream.

Read from underlying buffer until we have n characters or we hit EOF.
If n is negative or omitted, read until EOF.

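The signature read(size=-1, /) shows that read() accepts a size argument giving the maximum number of characters to read.
For example:

file_name = 'test2.txt'
try:
    with open(file_name, encoding='utf-8') as file_obj:
        content = file_obj.read(6)
        print(content)
except FileNotFoundError:
    print(file_name, 'not exists')

Output:

暮晓春来迟

The six characters read are the five characters of the first line plus its newline.
If you pass a value for size, read() reads up to that many characters, and each call continues from where the previous one stopped;
if fewer than size characters remain, it reads all the remaining ones, and at the end of the file it returns an empty string.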

file_name = 'test2.txt'
try:
    with open(file_name, encoding='utf-8') as file_obj:
        content1 = file_obj.read(6)
        content2 = file_obj.read(6)
        content3 = file_obj.read(6)
        content4 = file_obj.read(10)
        print(content1)
        print(content2)
        print(content3)
        print(content4)
except FileNotFoundError:
    print(file_name, 'not exists')

Output:

暮晓春来迟

先于百花知

岁岁种桃树

开在断肠时

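You can define a variable for the number of characters to read each time and use a loop to read the file's contents, as follows:

file_name = 'test2.txt'
count = 6
try:
    with open(file_name, encoding='utf-8') as file_obj:
        while True:
            content = file_obj.read(count)
            # Stop the loop when nothing is left to read
            if not content:
                break
            print(content)
except FileNotFoundError:
    print(file_name, 'not exists')

The result is the same as before, and the code is clearly more concise.
You can also define a variable that accumulates the file content, so it can be printed outside the loop.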

file_name = 'test2.txt'
count = 6
file_content = ''
try:
    with open(file_name, encoding='utf-8') as file_obj:
        while True:
            content = file_obj.read(count)
            # Stop the loop when nothing is left to read
            if not content:
                break
            file_content += content
    print(file_content)
except FileNotFoundError:
    print(file_name, 'not exists')

Output:

暮晓春来迟
先于百花知
岁岁种桃树
开在断肠时

Reading a file in batches, for example line by line, keeps memory usage small no matter how large the file is.
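One common way to read line by line is to iterate over the file object itself. The sketch below assumes a small scratch file, lines_demo.txt, created only for the demonstration; it records the length of each line instead of holding the whole file in memory.

```python
import os
import tempfile

# Hypothetical scratch file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'lines_demo.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('line one\nline two\nline three')

line_lengths = []
with open(path, encoding='utf-8') as file_obj:
    # Iterating over the file object yields one line at a time,
    # so only the current line needs to be held in memory.
    for line in file_obj:
        line_lengths.append(len(line.rstrip('\n')))

print(line_lengths)  # [8, 8, 10]
```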

IV. Other Ways to Read

1. The readline() method

readline() reads a single line, for example:

file_name = 'test2.txt'
with open(file_name, encoding='utf-8') as file_obj:
    print(file_obj.readline(), end='')
    print(file_obj.readline(), end='')
    print(file_obj.readline(), end='')

Output:

暮晓春来迟
先于百花知
岁岁种桃树
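When the number of lines is not known in advance, readline() can be called in a loop until it returns an empty string. The sketch below assumes a scratch file with a blank line in the middle, to show that a blank line comes back as '\n' and does not end the loop.

```python
import os
import tempfile

# Hypothetical scratch file with a blank second line.
path = os.path.join(tempfile.mkdtemp(), 'readline_demo.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('first\n\nthird')

lines = []
with open(path, encoding='utf-8') as file_obj:
    while True:
        line = file_obj.readline()
        # An empty string means end of file; a blank line is '\n', not ''.
        if line == '':
            break
        lines.append(line)

print(lines)  # ['first\n', '\n', 'third']
```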

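2. The readlines() method

readlines() reads line by line until the end of the file and returns all the lines in a list.

file_name = 'test2.txt'
with open(file_name, encoding='utf-8') as file_obj:
    content = file_obj.readlines()
    print(content)

Output:

['暮晓春来迟\n', '先于百花知\n', '岁岁种桃树\n', '开在断肠时']

The result is a list in which each element is one line of the file, including its trailing newline.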
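If the trailing newlines are not wanted, they can be stripped after reading. The sketch below shows two possible approaches on an assumed scratch file: a list comprehension over readlines(), and str.splitlines() on the result of read().

```python
import os
import tempfile

# Hypothetical scratch file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'readlines_demo.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('a\nb\nc')

with open(path, encoding='utf-8') as file_obj:
    raw = file_obj.readlines()

# Option 1: strip the newline from each element.
stripped = [line.rstrip('\n') for line in raw]

# Option 2: read everything and split on line boundaries.
with open(path, encoding='utf-8') as file_obj:
    split = file_obj.read().splitlines()

print(raw, stripped, split)
```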

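V. Writing to Files

The write() method writes content to a file.
When working with a text file, write() takes a string as its argument;
in addition, open() must be told what kind of access is needed through mode:
the default is r (read), which is read-only, so to write to the file you must specify mode='w';
w allows writing: if the file does not exist it is created, and if it exists its existing content is overwritten.

file_name = 'test2.txt'
with open(file_name, 'w', encoding='utf-8') as file_obj:
    file_obj.write('迟日江山丽')

test2.txt now contains

迟日江山丽

and the original content has been overwritten.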

file_name = 'test3.txt'
with open(file_name, 'w', encoding='utf-8') as file_obj:
    file_obj.write('迟日江山丽')

This creates a new file, test3.txt.
Only strings can be written in text mode; passing an int, a float, or another non-string value raises an exception.

file_name = 'test3.txt'
with open(file_name, 'w', encoding='utf-8') as file_obj:
    file_obj.write(12345)

Output:

Traceback (most recent call last):
File "xxx.py", line 29, in <module>
file_obj.write(12345)
TypeError: write() argument must be str, not int
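A simple way around this TypeError is to convert each value with str() before writing. The sketch below assumes a scratch file and two sample values chosen for illustration.

```python
import os
import tempfile

# Hypothetical scratch file and sample values.
path = os.path.join(tempfile.mkdtemp(), 'numbers_demo.txt')
values = [12345, 3.14]

with open(path, 'w', encoding='utf-8') as file_obj:
    for value in values:
        # write() only accepts str in text mode, so convert first.
        file_obj.write(str(value) + '\n')

with open(path, encoding='utf-8') as file_obj:
    written = file_obj.read()

print(written)
```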

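To keep the existing content, use mode a, which stands for append:
if the file does not exist it is created, and if it exists the new content is added at the end.

file_name = 'test2.txt'
with open(file_name, 'a', encoding='utf-8') as file_obj:
    file_obj.write('黄河入海流')

test2.txt now contains

迟日江山丽黄河入海流

+ adds a capability to a mode:
r+: readable and writable; the file must already exist, otherwise an error is raised, and the existing content is not cleared;
w+: readable and writable; the file is created if missing and its existing content is cleared.

file_name = 'test2.txt'
with open(file_name, 'r+', encoding='utf-8') as file_obj:
    file_obj.write('欲穷千里目')

test2.txt now contains

欲穷千里目黄河入海流

Writing with r+ starts at the beginning of the file, so the first five characters were overwritten and the rest was kept.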
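The differences between w, a, r+ and w+ can be compared side by side on one file. The sketch below uses an assumed scratch file and short ASCII strings so that the overwritten positions are easy to see.

```python
import os
import tempfile

# Hypothetical scratch file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'modes_demo.txt')

with open(path, 'w', encoding='utf-8') as f:   # create or truncate
    f.write('AAAAA')
with open(path, 'a', encoding='utf-8') as f:   # append at the end
    f.write('BBBBB')
with open(path, 'r+', encoding='utf-8') as f:  # overwrite from the start, no truncation
    f.write('CC')
with open(path, encoding='utf-8') as f:
    after_r_plus = f.read()

with open(path, 'w+', encoding='utf-8') as f:  # truncate first, then read and write
    f.write('DD')
    f.seek(0)                                  # go back to the start before reading
    after_w_plus = f.read()

print(after_r_plus)  # CCAAABBBBB
print(after_w_plus)  # DD
```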

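x creates a file exclusively: if the file does not exist it is created, and if it already exists an error is raised:

file_name = 'test3.txt'
with open(file_name, 'x', encoding='utf-8') as file_obj:
    file_obj.write('欲穷千里目')

Provided test3.txt does not exist yet, this creates it.
Running the code again raises an error: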

Traceback (most recent call last):
File "xxx.py", line 28, in <module>
with open(file_name,'x',encoding='utf-8') as file_obj:
FileExistsError: [Errno 17] File exists: 'test3.txt'
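Mode x is useful when an existing file must never be overwritten. As a minimal sketch, the helper below (create_once is a hypothetical name, not a standard function) catches FileExistsError and reports whether the file was created.

```python
import os
import tempfile


def create_once(path, text):
    """Create path with text; return False if the file already exists."""
    try:
        with open(path, 'x', encoding='utf-8') as file_obj:
            file_obj.write(text)
        return True
    except FileExistsError:
        return False


# Hypothetical scratch file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'exclusive_demo.txt')
first = create_once(path, 'first')
second = create_once(path, 'second')

with open(path, encoding='utf-8') as file_obj:
    content = file_obj.read()

print(first, second, content)  # True False first
```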

VI. Reading and Writing Binary Files

Reading a binary file in the way shown above raises an error:

file_name = 'logo.png'
with open(file_name, 'r', encoding='utf-8') as file_obj:
    print(file_obj.read(100))

Output:

Traceback (most recent call last):
File "xxx.py", line 29, in <module>
print(file_obj.read(100))
File "xxxx\Python\Python37\lib\codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte

Read modes:
t: text mode, the default;
b: binary mode.
When reading a text file, size is counted in characters;
when reading a binary file, size is counted in bytes.

file_name = 'logo.png'
with open(file_name, 'rb') as file_obj:
    print(file_obj.read(100))

Output:

b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00H\x00\x00\x00H\x08\x06\x00\x00\x00U\xed\xb3G\x00\x00\x00\x01sRGB\x00\xae\xce\x1c\xe9\x00\x00\x00\x04gAMA\x00\x00\xb1\x8f\x0b\xfca\x05\x00\x00\x00\tpHYs\x00\x00\x0e\xc4\x00\x00\x0e\xc4\x01\x95+\x0e\x1b\x00\x00\x0f\xf6IDATx^\xed\\\x8b\x9fUU\x15'

Writing the data that was read into a new file:

file_name = 'logo.png'
with open(file_name, 'rb') as file_obj:
    new_png = 'logo2.png'
    # Number of bytes to read each time
    count = 1024 * 100
    with open(new_png, 'wb') as new_obj:
        while True:
            content = file_obj.read(count)
            # Nothing left to read, so stop the loop
            if not content:
                break
            new_obj.write(content)

This produces a new file, logo2.png, whose content is identical to the original file logo.png.
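To check that such a chunked copy is byte-for-byte identical, the data can be compared afterwards. The sketch below wraps the loop in a helper (copy_in_chunks is a hypothetical name) and uses generated bytes in place of a real image, since logo.png is not available here.

```python
import os
import tempfile


def copy_in_chunks(src, dst, chunk_size=1024 * 100):
    """Copy src to dst, reading at most chunk_size bytes at a time."""
    with open(src, 'rb') as src_obj, open(dst, 'wb') as dst_obj:
        while True:
            chunk = src_obj.read(chunk_size)
            if not chunk:
                break
            dst_obj.write(chunk)


# Hypothetical scratch files; 256,000 generated bytes stand in for an image.
folder = tempfile.mkdtemp()
src = os.path.join(folder, 'source.bin')
dst = os.path.join(folder, 'copy.bin')
data = bytes(range(256)) * 1000
with open(src, 'wb') as f:
    f.write(data)

copy_in_chunks(src, dst)

with open(dst, 'rb') as f:
    copied = f.read()

print(copied == data, len(copied))  # True 256000
```

For everyday use, the standard library's shutil.copyfile(src, dst) performs the same kind of copy without a hand-written loop.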

VII. File Read Position

If a text file is read in binary mode, the result is returned as bytes:

with open('test.txt', 'rb') as file_obj:
    print(file_obj.read())

Output:

b'import android.content.Context;\r\nimport android.graphics.Point;\r\nimport android.os.Build;\r\nimport android.view.WindowManager;'

tell() reports the current read position.

with open('test.txt', 'rb') as file_obj:
    print(file_obj.read(5))
    print('Now in', file_obj.tell())

Output:

b'impor'
Now in 5

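seek() changes the current position. Its signature is seek(offset, whence=0), where whence says what offset is measured from:
0 (the default) means from the start of the file;
1 means from the current position;
2 means from the end of the file.

with open('test.txt', 'rb') as file_obj:
    print(file_obj.read(5))
    print('Now in', file_obj.tell())
    file_obj.seek(5)
    print(file_obj.read(5))
    print('Now in', file_obj.tell())

Output:

b'impor'
Now in 5
b't and'
Now in 10

With a whence argument, for example to start one byte before the end:

with open('test.txt', 'rb') as file_obj:
    file_obj.seek(-1, 2)
    print(file_obj.read())
    print('Now in', file_obj.tell())

Output:

b';'
Now in 125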

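When the text is Chinese:

with open('test2.txt', 'r', encoding='utf-8') as file_obj:
    file_obj.seek(3)
    print(file_obj.read())
    print('Now in', file_obj.tell())

Output:

晓春来迟
先于百花知
岁岁种桃树
开在断肠时
Now in 66

The first character was skipped, because in UTF-8 one Chinese character takes 3 bytes. The Python documentation only guarantees text-mode seek() for offsets that came from tell() (or zero), so byte arithmetic like this is safer in binary mode.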
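The three whence values can be tried out on a small binary file whose bytes are easy to recognise. The sketch below assumes a scratch file containing the ten bytes 0123456789.

```python
import os
import tempfile

# Hypothetical scratch file holding ten recognisable bytes.
path = os.path.join(tempfile.mkdtemp(), 'seek_demo.bin')
with open(path, 'wb') as f:
    f.write(b'0123456789')

with open(path, 'rb') as file_obj:
    file_obj.seek(2)                 # whence=0 (default): 2 bytes from the start
    from_start = file_obj.read(3)    # reads bytes 2..4, position is now 5
    file_obj.seek(2, 1)              # whence=1: 2 bytes forward from position 5
    from_current = file_obj.read(2)  # reads bytes 7..8
    file_obj.seek(-3, 2)             # whence=2: 3 bytes before the end
    from_end = file_obj.read()       # reads bytes 7..9
    end_position = file_obj.tell()

print(from_start, from_current, from_end, end_position)
# b'234' b'78' b'789' 10
```

The os module also provides the names os.SEEK_SET, os.SEEK_CUR and os.SEEK_END for the values 0, 1 and 2, which can make such calls easier to read.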

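VIII. Other File Operations

The os module provides operating system functions, including those for working with directories.

import os
l = os.listdir()
print(l)

Output: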

['Demo.py', 'learning_note.md', 'logo.png', 'logo2.png', 'test.txt', 'test2.txt', 'test3.txt']

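listdir() returns the names of the entries in a directory. Its argument defaults to ., the current directory; .. means the parent directory.
os.getcwd() returns the current working directory.

import os
l = os.getcwd()
print(l)

Output: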

C:\Users\xxx\16_File

os.chdir() changes the current working directory (which can also mean switching to another drive):

import os
os.chdir('c:/')
l = os.getcwd()
print(l)

Output:

c:\

os.mkdir() creates a directory; a relative name is created inside the current directory:

import os
os.mkdir('temp')
l = os.listdir()
print(l)

Output:

['Demo.py', 'learning_note.md', 'logo.png', 'logo2.png', 'temp', 'test.txt', 'test2.txt', 'test3.txt']

There is now a newly created folder, temp.
os.rmdir() removes a directory, which must be empty:

import os
os.rmdir('temp')
l = os.listdir()
print(l)

Output:

['Demo.py', 'learning_note.md', 'logo.png', 'logo2.png', 'test.txt', 'test2.txt', 'test3.txt']

The folder that was just created has been removed again.
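The directory functions above can be combined into one self-contained sketch. It works inside a scratch directory created with tempfile (an assumption made so that the real working directory is left alone) and also shows that calling os.mkdir() on an existing directory raises FileExistsError.

```python
import os
import tempfile

# Scratch directory so the real working directory is not modified.
base = tempfile.mkdtemp()
target = os.path.join(base, 'temp')

os.mkdir(target)
created = os.path.isdir(target)

# Creating the same directory again raises FileExistsError.
try:
    os.mkdir(target)
    second_mkdir_failed = False
except FileExistsError:
    second_mkdir_failed = True

listing = os.listdir(base)

os.rmdir(target)  # rmdir only removes empty directories
removed = not os.path.exists(target)

print(created, second_mkdir_failed, listing, removed)
# True True ['temp'] True
```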


Author: cupyter
