
Extract CNN features using Caffe

2015-10-11 21:30
Here we summarize the main steps in extracting CNN features using Caffe: extracting features in LMDB format with the pretrained AlexNet-style reference model (bvlc_reference_caffenet), and converting the resulting LMDB files into .mat files for later manipulation.

1 Extracting LMDB files

This can be done by simply following the instructions in [1]. However, we copy them here for convenience.

1.1 Download models:
scripts/download_model_binary.py models/bvlc_reference_caffenet


1.2 Select data

mkdir examples/_temp
find `pwd`/examples/images -type f -exec echo {} \; > examples/_temp/temp.txt
sed "s/$/ 0/" examples/_temp/temp.txt > examples/_temp/file_list.txt
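
The find/sed pair above lists every image under examples/images and appends a space-separated dummy label 0 to each path. As a minimal sketch of the same step in Python (the function name `write_file_list` and its arguments are our own, not part of Caffe):

```python
import os


def write_file_list(image_dir, list_path, label=0):
    """Write one '<absolute path> <label>' line per regular file under image_dir.

    Hypothetical helper: it mirrors the find/sed commands, except that files
    are sorted by name within each directory so the order is reproducible.
    """
    lines = []
    for root, _dirs, files in os.walk(image_dir):
        for name in sorted(files):
            path = os.path.abspath(os.path.join(root, name))
            lines.append("%s %d" % (path, label))
    with open(list_path, "w") as f:
        for line in lines:
            f.write(line + "\n")
    return lines
```

Calling `write_file_list("examples/images", "examples/_temp/file_list.txt")` from the Caffe root should produce a list in the same "path label" format as the shell commands.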


1.3 Define the Feature Extraction Network Architecture
./data/ilsvrc12/get_ilsvrc_aux.sh
cp examples/feature_extraction/imagenet_val.prototxt examples/_temp


1.4 Extract features
./build/tools/extract_features.bin models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel examples/_temp/imagenet_val.prototxt fc7 examples/_temp/features 100 lmdb


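The second-to-last parameter is the number of mini-batches and needs to be adjusted in practice: the number of images processed equals this value times the batch size set in the data layer of imagenet_val.prototxt.
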
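
To pick the mini-batch count for a given image list, one can round the image count up to a whole number of batches. The sketch below is only an illustration; `num_mini_batches` is a hypothetical helper, and the batch size must be read from your own prototxt.

```python
import math


def num_mini_batches(num_images, batch_size):
    """Smallest number of mini-batches whose forward passes cover num_images."""
    if num_images < 0 or batch_size <= 0:
        raise ValueError("num_images must be >= 0 and batch_size must be > 0")
    return int(math.ceil(num_images / float(batch_size)))
```

For example, 5000 images with a batch size of 50 need 100 mini-batches. When the image count is not a multiple of the batch size, the last batch is padded by the data layer, so the trailing rows of the extracted features should be treated with care.
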
2 Convert LMDB into .mat files


Once the LMDB features are obtained, we can readily convert them into .mat files using the following two Python scripts, which are modified versions of the code in [2].
The first one is a helper module named feat_helper_pb2.py:

# Generated by the protocol buffer compiler.  DO NOT EDIT!

from google.protobuf import descriptor
from google.protobuf import message
from google.protobuf import reflection
from google.protobuf import descriptor_pb2

# @@protoc_insertion_point(imports)

DESCRIPTOR = descriptor.FileDescriptor(
  name='datum.proto',
  package='feat_extract',
  serialized_pb='\n\x0b\x64\x61tum.proto\x12\x0c\x66\x65\x61t_extract\"i\n\x05\x44\x61tum\x12\x10\n\x08\x63hannels\x18\x01 \x01(\x05\x12\x0e\n\x06height\x18\x02 \x01(\x05\x12\r\n\x05width\x18\x03 \x01(\x05\x12\x0c\n\x04\x64\x61ta\x18\x04 \x01(\x0c\x12\r\n\x05label\x18\x05 \x01(\x05\x12\x12\n\nfloat_data\x18\x06 \x03(\x02')

_DATUM = descriptor.Descriptor(
  name='Datum',
  full_name='feat_extract.Datum',
  filename=None,
  file=DESCRIPTOR,
  containing_type=None,
  fields=[
    descriptor.FieldDescriptor(
      name='channels', full_name='feat_extract.Datum.channels', index=0,
      number=1, type=5, cpp_type=1, label=1,
      has_default_value=False, default_value=0,
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      options=None),
    descriptor.FieldDescriptor(
      name='height', full_name='feat_extract.Datum.height', index=1,
      number=2, type=5, cpp_type=1, label=1,
      has_default_value=False, default_value=0,
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      options=None),
    descriptor.FieldDescriptor(
      name='width', full_name='feat_extract.Datum.width', index=2,
      number=3, type=5, cpp_type=1, label=1,
      has_default_value=False, default_value=0,
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      options=None),
    descriptor.FieldDescriptor(
      name='data', full_name='feat_extract.Datum.data', index=3,
      number=4, type=12, cpp_type=9, label=1,
      has_default_value=False, default_value="",
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      options=None),
    descriptor.FieldDescriptor(
      name='label', full_name='feat_extract.Datum.label', index=4,
      number=5, type=5, cpp_type=1, label=1,
      has_default_value=False, default_value=0,
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      options=None),
    descriptor.FieldDescriptor(
      name='float_data', full_name='feat_extract.Datum.float_data', index=5,
      number=6, type=2, cpp_type=6, label=3,
      has_default_value=False, default_value=[],
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      options=None),
  ],
  extensions=[
  ],
  nested_types=[],
  enum_types=[
  ],
  options=None,
  is_extendable=False,
  extension_ranges=[],
  serialized_start=29,
  serialized_end=134,
)

DESCRIPTOR.message_types_by_name['Datum'] = _DATUM

class Datum(message.Message):
  __metaclass__ = reflection.GeneratedProtocolMessageType
  DESCRIPTOR = _DATUM

  # @@protoc_insertion_point(class_scope:feat_extract.Datum)

# @@protoc_insertion_point(module_scope)
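
The helper above was generated by an old protocol buffer compiler and relies on Python 2 conventions such as `__metaclass__`. If it does not load with your protobuf installation, it may be easier to regenerate it. The sketch below writes out a datum.proto reconstructed from the field names, numbers and types in the descriptor; the `syntax` line and the helper name `write_datum_proto` are our own additions, not part of the original.

```python
# Message definition reconstructed from the descriptor fields:
# type 5 = int32, type 12 = bytes, type 2 = float; label 1 = optional, 3 = repeated.
DATUM_PROTO = """\
syntax = "proto2";  // assumption: added so that newer protoc versions accept the file
package feat_extract;

message Datum {
  optional int32 channels = 1;
  optional int32 height = 2;
  optional int32 width = 3;
  optional bytes data = 4;
  optional int32 label = 5;
  repeated float float_data = 6;
}
"""


def write_datum_proto(path="datum.proto"):
    """Write the reconstructed message definition to `path` (hypothetical helper)."""
    with open(path, "w") as f:
        f.write(DATUM_PROTO)
    return path
```

Running `protoc --python_out=. datum.proto` on the result produces a module named datum_pb2.py, so either rename it to feat_helper_pb2.py or adjust the import in the conversion script.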

Then follows the actual conversion script, lmdb2mat.py, placed in the same directory as the helper module:

from __future__ import print_function

import lmdb
import sys
sys.path.append('/usr/lib/python2.7/dist-packages')
import feat_helper_pb2
import numpy as np
import scipy.io as sio
import time


def main(argv):
    # Usage: lmdb2mat.py LMDB BATCH_NUM BATCH_SIZE DIM OUT
    lmdb_name = argv[1]
    print("%s" % lmdb_name)
    batch_num = int(argv[2])
    batch_size = int(argv[3])
    window_num = batch_num * batch_size  # total number of feature vectors
    dim = int(argv[4])                   # length of one feature vector

    start = time.time()
    db = lmdb.open(lmdb_name, readonly=True)
    txn = db.begin()
    cursor = txn.cursor()
    datum = feat_helper_pb2.Datum()

    # Collect the serialized Datum of every record, in key order.
    keys = []
    values = []
    for key, value in cursor.iternext():
        keys.append(key)
        values.append(value)

    # Parse each Datum and copy its float_data into one row of the matrix.
    ft = np.zeros((window_num, dim))
    for im_idx in range(window_num):
        datum.ParseFromString(values[im_idx])
        ft[im_idx, :] = datum.float_data

    print('time 1: %f' % (time.time() - start))
    sio.savemat(argv[5], {'feats': ft})
    print('time 2: %f' % (time.time() - start))
    print('done!')


if __name__ == '__main__':
    main(sys.argv)
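
The script reads five positional arguments: the LMDB path, the number of batches, the batch size, the feature dimension and the output file. As a sketch of how this interface could be made self-documenting, the hypothetical `parse_args` helper below uses argparse; it is not part of the original script.

```python
import argparse


def parse_args(argv):
    """Parse the five positional arguments expected by lmdb2mat.py (sketch)."""
    parser = argparse.ArgumentParser(
        description="Convert extracted LMDB features to a .mat file.")
    parser.add_argument("lmdb_name", help="path to the LMDB directory")
    parser.add_argument("batch_num", type=int, help="number of mini-batches")
    parser.add_argument("batch_size", type=int, help="images per mini-batch")
    parser.add_argument("dim", type=int, help="length of one feature vector")
    parser.add_argument("out", help="path of the .mat file to write")
    args = parser.parse_args(argv)
    # Total number of feature vectors, as computed in the script.
    args.window_num = args.batch_num * args.batch_size
    return args
```

With this helper, `main` would call `parse_args(sys.argv[1:])` and a wrong argument count would produce a usage message instead of an IndexError.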

Finally, we still need a bash script to call the above two scripts, as follows:

#!/usr/bin/env sh

LMDB=_temp/features
BATCHNUM=50
BATCHSIZE=100

# DIM=290400  # conv1

# DIM=43264  # conv5

DIM=4096  # fc7
OUT=_temp/features.mat
python ./lmdb2mat.py "$LMDB" "$BATCHNUM" "$BATCHSIZE" "$DIM" "$OUT"
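
The DIM value must equal the flattened size of the extracted blob, that is channels times height times width. The sketch below reproduces the three values used in the script; the blob shapes listed are our understanding of the CaffeNet reference model and should be checked against your own prototxt.

```python
from functools import reduce
import operator

# Assumed per-image output shapes (channels, height, width) of CaffeNet blobs.
CAFFENET_BLOB_SHAPES = {
    "conv1": (96, 55, 55),
    "conv5": (256, 13, 13),
    "fc7": (4096,),
}


def feature_dim(shape):
    """Flattened length of a blob with the given per-image shape."""
    return reduce(operator.mul, shape, 1)
```

For these shapes, `feature_dim` gives 290400 for conv1, 43264 for conv5 and 4096 for fc7, matching the DIM settings in the script.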


Citations
[1] http://caffe.berkeleyvision.org/gathered/examples/feature_extraction.html
[2] http://blog.csdn.net/lijiancheng0614/article/details/48180331