Python Study Notes 4: Parsing XML and File Operations

2015-08-20 11:35 603 views

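I have recently been working on ship detection and need a large number of negative samples to train an AdaBoost classifier. I downloaded the Pascal VOC dataset from the web and need to find the images that contain no boats and copy them out.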

I previously wrote a version of this in C#.

Now I am redoing it in Python for practice.

import glob
import os
import shutil
import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9

fatherDir = r'E:\迅雷下载\VOCtrainval_11-May-2012\VOCdevkit\VOC2012'
outDir = r'D:\IP_CV_WorkSpace\Img\NegSample2'
for filePath in glob.glob(os.path.join(fatherDir, 'Annotations', '*.xml')):
    fileName = os.path.splitext(os.path.basename(filePath))[0]  # file name without extension
    hasBoat = False
    # iterparse yields only 'end' events by default
    for event, elem in ET.iterparse(filePath):
        if elem.tag == 'name' and elem.text == 'boat':
            hasBoat = True
            break
        elem.clear()  # discard the element to save memory
    if not hasBoat:
        # copy the image of every annotation that contains no boat
        shutil.copyfile(os.path.join(fatherDir, 'JPEGImages', fileName + '.jpg'),
                        os.path.join(outDir, fileName + '.jpg'))
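
The same filter can also be packaged as two small functions so that it can be tried on any folder. This is only a sketch: the names `contains_label` and `copy_negatives` are made up for illustration, and it assumes the `Annotations` / `JPEGImages` layout used above.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def contains_label(xml_path, label):
    """Return True if any <name> element in the annotation equals `label`."""
    with open(xml_path, 'rb') as f:  # opened explicitly so an early return still closes it
        for _, elem in ET.iterparse(f):  # 'end' events only, by default
            if elem.tag == 'name' and elem.text == label:
                return True
            elem.clear()  # free elements that were already inspected
    return False


def copy_negatives(voc_dir, out_dir, label):
    """Copy every image whose annotation lacks `label`; return the copied file stems."""
    voc_dir, out_dir = Path(voc_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for xml_path in sorted((voc_dir / 'Annotations').glob('*.xml')):
        if contains_label(xml_path, label):
            continue
        image = voc_dir / 'JPEGImages' / (xml_path.stem + '.jpg')
        if image.exists():  # skip annotations that have no matching image
            shutil.copyfile(image, out_dir / image.name)
            copied.append(xml_path.stem)
    return copied
```

Calling `copy_negatives(fatherDir, r'D:\IP_CV_WorkSpace\Img\NegSample2', 'boat')` should then do the same job as the loop above.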

Related exercises

# Test: processing an XML file
import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9
count = 0
for event, elem in ET.iterparse('E:/迅雷下载/VOCtrainval_11-May-2012/VOCdevkit/VOC2012/Annotations/2007_000032.xml'):
    if event == 'end':
        # elem.tag is the bare tag name, so a path such as 'object/name' would never match
        if elem.tag == 'name' and elem.text == 'person':
            count += 1
    elem.clear() # discard the element

print(count)
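
Path expressions such as `object/name` do work with `find` / `findall` on a parsed tree, just not as a value of `elem.tag`. Below is a minimal sketch of counting objects per class that way. The XML string is a made-up sample in the Pascal VOC layout, not a real annotation from the dataset.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up annotation in the Pascal VOC layout, for illustration only.
SAMPLE = """
<annotation>
  <object><name>person</name>
    <part><name>head</name></part>
  </object>
  <object><name>person</name></object>
  <object><name>aeroplane</name></object>
</annotation>
"""

root = ET.fromstring(SAMPLE)
# 'object/name' selects <name> children of <object> children of the root,
# so the <name> nested inside <part> is not counted.
counts = Counter(name.text for name in root.findall('object/name'))
print(counts['person'])  # 2
```

This loads the whole file into memory, which is fine for small annotation files. `iterparse` is the better fit when the files are large.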

Getting a list of files
import os
print(os.listdir('E:/迅雷下载/VOCtrainval_11-May-2012/VOCdevkit/VOC2012/Annotations/'))

import glob,os
for filename in glob.glob(r'E:\迅雷下载\VOCtrainval_11-May-2012\VOCdevkit\VOC2012\Annotations\*.xml'):
    print(os.path.splitext(os.path.basename(filename))[0])  # file name without extension
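
Slicing off the last four characters only works when the extension is exactly three letters long. A small sketch of the difference, using `pathlib` (the paths here are hypothetical and used only for illustration):

```python
from pathlib import PureWindowsPath

# Hypothetical Windows-style path; PureWindowsPath parses it on any OS.
path = r'E:\data\Annotations\2007_000032.xml'
stem = PureWindowsPath(path).stem
print(stem)  # 2007_000032

# Slicing assumes a 3-letter extension; .stem does not.
sliced = 'photo.jpeg'[:-4]
print(sliced)  # photo.
print(PureWindowsPath('photo.jpeg').stem)  # photo
```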

The following snippet copies a file:
import shutil
shutil.copyfile(r'D:\IP_CV_WorkSpace\Img\TestBoatResult2\132264_120.jpg', r'D:\1.jpg')
