
Caffe Study Notes (19): Training and Testing on Your Own Dataset

2018-07-30 11:01

Original article: https://www.geek-share.com/detail/2702555726.html

Running Caffe's bundled examples is a good start, but it probably leaves you wanting more: the point of learning Caffe is not to finish a few exercises but to use it in your own projects or research. This post therefore walks through the whole pipeline, from your own raw images, to LMDB data, to training and testing a model.

 

The full code is available on my GitHub at https://github.com/EddyGao/Caffe-taobao_Image-Identification. Download myfile4, place it under your caffe/examples/ directory, and run demo.sh.

1. Prepare the data

1) We borrow a dataset that someone shared online: 1000 images of 10 Taobao product categories, 100 images per class. You can of course collect your own images for whatever classes you want to recognize.

Download link (Baidu Netdisk): https://pan.baidu.com/s/1o77w1wI

Unzip the archive after downloading.

2) Create a folder myfile4 under caffe/examples, and inside it a folder data to hold the dataset. Copy the train and val folders from the download into data, giving:

caffe/examples/myfile4/data/train

caffe/examples/myfile4/data/val
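Before converting anything, it is worth sanity-checking that the images actually landed where the scripts below expect them. A minimal check (assuming the archive unpacked into flat train/ and val/ folders of .jpg files), run from the Caffe root:

[code]# Count the images in each split; train plus val should total the
# 1000 images in the dataset (the exact split depends on the archive).
find examples/myfile4/data/train -name '*.jpg' | wc -l
find examples/myfile4/data/val -name '*.jpg' | wc -l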

2. Convert to LMDB format

1) In the myfile4 folder, create create_filelist.sh:

 

[code]#!/usr/bin/env sh
DATA=examples/myfile4/data
MY=examples/myfile4/data

echo "Create train.txt..."
rm -rf $MY/train.txt

# One line per image: "<path relative to data/> <numeric label>".
# The -name patterns are quoted so the shell does not expand the globs itself.
find $DATA/train -name '15001*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 0/" >> $MY/train.txt
find $DATA/train -name '15059*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 1/" >> $MY/train.txt
find $DATA/train -name '62047*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 2/" >> $MY/train.txt
find $DATA/train -name '68021*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 3/" >> $MY/train.txt
find $DATA/train -name '73018*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 4/" >> $MY/train.txt
find $DATA/train -name '73063*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 5/" >> $MY/train.txt
find $DATA/train -name '80012*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 6/" >> $MY/train.txt
find $DATA/train -name '92002*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 7/" >> $MY/train.txt
find $DATA/train -name '92017*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 8/" >> $MY/train.txt
find $DATA/train -name '95005*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 9/" >> $MY/train.txt

echo "Create val.txt..."
rm -rf $MY/val.txt

find $DATA/val -name '15001*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 0/" >> $MY/val.txt
find $DATA/val -name '15059*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 1/" >> $MY/val.txt
find $DATA/val -name '62047*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 2/" >> $MY/val.txt
find $DATA/val -name '68021*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 3/" >> $MY/val.txt
find $DATA/val -name '73018*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 4/" >> $MY/val.txt
find $DATA/val -name '73063*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 5/" >> $MY/val.txt
find $DATA/val -name '80012*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 6/" >> $MY/val.txt
find $DATA/val -name '92002*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 7/" >> $MY/val.txt
find $DATA/val -name '92017*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 8/" >> $MY/val.txt
find $DATA/val -name '95005*.jpg' | cut -d '/' -f4-5 | sed "s/$/ 9/" >> $MY/val.txt

echo "All done"

Run the script from the Caffe root directory:

 

 

[code]# sh examples/myfile4/create_filelist.sh

When it finishes, train.txt and val.txt appear in examples/myfile4/data.
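Before building the LMDBs, you can eyeball the label files. Each line should hold a path relative to the data folder plus a space and a numeric label; a quick look (the filenames shown in the comment are placeholders, yours depend on the dataset):

[code]# First few entries, e.g. "train/15001xxx.jpg 0", and the line counts,
# which should match the image counts from the earlier check.
head -n 3 examples/myfile4/data/train.txt
wc -l examples/myfile4/data/train.txt examples/myfile4/data/val.txt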

 

2) In myfile4, create create_lmdb.sh:

 

[code]#!/usr/bin/env sh
MY=examples/myfile4

# Adjust these absolute paths to your own Caffe checkout.
TRAIN_DATA_ROOT=/home/ghz/caffe/examples/myfile4/data/
VAL_DATA_ROOT=/home/ghz/caffe/examples/myfile4/data/

echo "Create train lmdb.."
rm -rf $MY/img_train_lmdb
build/tools/convert_imageset \
    --shuffle \
    --resize_height=32 \
    --resize_width=32 \
    $TRAIN_DATA_ROOT \
    $MY/data/train.txt \
    $MY/img_train_lmdb

echo "Create test lmdb.."
rm -rf $MY/img_val_lmdb
build/tools/convert_imageset \
    --shuffle \
    --resize_height=32 \
    --resize_width=32 \
    $VAL_DATA_ROOT \
    $MY/data/val.txt \
    $MY/img_val_lmdb

echo "All Done.."

Run this script from the Caffe root as well:

 

 

[code]# sh examples/myfile4/create_lmdb.sh

Afterwards the folders img_train_lmdb and img_val_lmdb appear under myfile4.
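A quick way to confirm the conversion did something useful: each LMDB is a directory holding a data.mdb and a lock.mdb, and a data.mdb of only a few kilobytes almost always means the paths in train.txt/val.txt did not resolve:

[code]# Both data.mdb files should be a few MB for ~1000 32x32 images.
ls -lh examples/myfile4/img_train_lmdb examples/myfile4/img_val_lmdb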

 

3. Compute and save the image mean

In myfile4, create create_meanfile.sh:

 

[code]#!/usr/bin/env sh
EXAMPLE=examples/myfile4
DATA=examples/myfile4
TOOLS=build/tools

$TOOLS/compute_image_mean $EXAMPLE/img_train_lmdb $DATA/mean.binaryproto

echo "Done."
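Run it from the Caffe root in the same way as the previous scripts (the invocation below simply mirrors their pattern):

[code]# sh examples/myfile4/create_meanfile.sh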

When it finishes, mean.binaryproto appears in examples/myfile4.

 

4. Create the model and write the configuration files

Create myfile4_train_test.prototxt in myfile4. The network below is adapted from Caffe's CIFAR-10 quick example, which is why the TEST data layer still carries the leftover name "cifar":

 

[code]name: "myfile4"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "examples/myfile4/mean.binaryproto"
  }
  data_param {
    source: "examples/myfile4/img_train_lmdb"
    batch_size: 50
    backend: LMDB
  }
}
layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "examples/myfile4/mean.binaryproto"
  }
  data_param {
    source: "examples/myfile4/img_val_lmdb"
    batch_size: 50
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.0001
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "pool1"
  top: "pool1"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 64
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3"
  top: "pool3"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 64
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
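Since all images were resized to 32×32, the shapes are easy to trace through this network: every 5×5 convolution uses pad 2 and stride 1, so it preserves spatial size, and every 3×3/stride-2 pooling roughly halves it (Caffe rounds pooling output up). That gives 32×32 after conv1, 16×16 after pool1, 8×8 after pool2, and 4×4 after pool3, so ip1 maps 64×4×4 = 1024 inputs to 64 outputs and ip2 maps those to the 10 class scores.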

Then create myfile4_solver.prototxt:

 

 

[code]net: "examples/myfile4/myfile4_train_test.prototxt"
test_iter: 2
test_interval: 50
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 400
momentum: 0.9
weight_decay: 0.004
display: 10
max_iter: 2000
snapshot: 2000
snapshot_prefix: "examples/myfile4/my"
solver_mode: CPU
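A couple of these numbers are worth decoding. With test_iter: 2 and the TEST batch_size of 50 from the net definition, each test pass sees 2 × 50 = 100 validation images. With lr_policy: "step", Caffe computes lr = base_lr × gamma^floor(iter / stepsize), so the learning rate starts at 0.001 and drops tenfold every 400 iterations: 1e-4 at iteration 400, 1e-5 at 800, down to 1e-7 over the final stretch of the 2000-iteration run. If you built Caffe with CUDA, switching solver_mode to GPU speeds training up considerably.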

5. Training and testing

 

From the Caffe root, run:

 

[code]# build/tools/caffe train -solver examples/myfile4/myfile4_solver.prototxt

This runs 2000 iterations and snapshots the model at the end; if training completes without errors, my_iter_2000.caffemodel (along with a my_iter_2000.solverstate) appears in examples/myfile4.
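If you want a standalone accuracy number on the validation set afterwards, the same caffe binary has a test mode that reuses the train/test net together with the saved weights; a minimal sketch (2 iterations × batch 50 again covers 100 validation images):

[code]# build/tools/caffe test -model examples/myfile4/myfile4_train_test.prototxt -weights examples/myfile4/my_iter_2000.caffemodel -iterations 2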

 

6. Classifying with the trained model

1) In myfile4, create synset_words.txt. classification.bin reads one class name per line, so the order must match the numeric labels 0-9 assigned in create_filelist.sh:

 

[code]biao
fajia
kuzi
xiangzi
yizi
dianshi
suannai
xiangshui
hufupin
xiezi

2) In myfile4, create deploy.prototxt. It mirrors myfile4_train_test.prototxt, except that the two Data layers become a single Input layer of shape 1×3×32×32, the weight fillers are dropped (the weights come from the trained caffemodel), and the loss/accuracy layers are replaced by a plain Softmax that outputs class probabilities:

 

 

[code]name: "myfile4"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 32 dim: 32 } }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "pool1"
  top: "pool1"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 64
    pad: 2
    kernel_size: 5
    stride: 1
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3"
  top: "pool3"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 64
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "ip2"
  top: "prob"
}

 

 

3) In myfile4, create a folder images and put the picture you want to classify inside; mine is images/111.jpg. Test images can simply be taken from the downloaded dataset.

4) In myfile4, create demo.sh:

 

[code]./build/examples/cpp_classification/classification.bin \
    examples/myfile4/deploy.prototxt \
    examples/myfile4/my_iter_2000.caffemodel \
    examples/myfile4/mean.binaryproto \
    examples/myfile4/synset_words.txt \
    examples/myfile4/images/111.jpg

Execute demo.sh from the Caffe root:

 

 

[code]# sh examples/myfile4/demo.sh

The recognition result is printed to the shell.
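classification.bin prints the top five predictions as probability/label pairs; the output is shaped roughly like the following (labels come from synset_words.txt, and these probabilities are purely illustrative):

[code]---------- Prediction for examples/myfile4/images/111.jpg ----------
0.9420 - "yizi"
0.0311 - "kuzi"
...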

 

That completes training and testing a model on your own dataset with Caffe. Questions and discussion are welcome.
