
Fixing the large gap between training and validation results when doing transfer learning with Keras InceptionV3 and ResNet50 models

2019-01-12 12:54
Copyright notice: this is my original work; please credit the source when reposting: https://blog.csdn.net/zjn295771349/article/details/86355874

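The Kaggle Human Protein Atlas Image Classification competition has wrapped up, so I finally have some free time to write about the pitfalls I ran into along the way.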

The Keras version is 2.2.4.

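Has anyone else run into this? When doing transfer learning with Keras models that contain BN (batch normalization) layers, such as InceptionV3 or ResNet50, the training and validation results differ wildly, for example like this: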

[code]Epoch 1/20
1500/1500 [==============================] - 24s 16ms/step - loss: 2.1168 - binary_accuracy: 0.9169 - f1_keras: 0.0617 - val_loss: 2.2727 - val_binary_accuracy: 0.9258 - val_f1_keras: 0.0377
Epoch 2/20
1500/1500 [==============================] - 19s 13ms/step - loss: 1.1976 - binary_accuracy: 0.9480 - f1_keras: 0.1084 - val_loss: 2.4163 - val_binary_accuracy: 0.9218 - val_f1_keras: 0.0356
Epoch 3/20
1500/1500 [==============================] - 19s 13ms/step - loss: 0.9935 - binary_accuracy: 0.9540 - f1_keras: 0.1608 - val_loss: 2.7485 - val_binary_accuracy: 0.9114 - val_f1_keras: 0.0359
Epoch 4/20
1500/1500 [==============================] - 19s 13ms/step - loss: 0.8294 - binary_accuracy: 0.9572 - f1_keras: 0.1902 - val_loss: 2.9039 - val_binary_accuracy: 0.9166 - val_f1_keras: 0.0402
Epoch 5/20
1500/1500 [==============================] - 19s 13ms/step - loss: 0.7250 - binary_accuracy: 0.9606 - f1_keras: 0.2482 - val_loss: 3.1574 - val_binary_accuracy: 0.9057 - val_f1_keras: 0.0485

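As you can see, the training loss keeps decreasing while the validation loss keeps increasing, and the validation accuracy and F1 score are far from the training values. Some of you may suspect overfitting, and I suspected that too. So I ran another experiment with the training set used as the validation set, meaning both were exactly the same samples. In that setup the training and validation results should match. Instead, I got almost the same results as above.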
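
Identical samples scoring differently in the two phases suggests that some layer computes differently during training and during validation. Batch normalization is one such layer: in training mode it normalizes with the statistics of the current mini-batch, and in inference mode it uses stored moving averages. The toy sketch below is plain Python, not Keras, and every number in it (the activation distribution and the stored statistics) is a made-up assumption rather than a value from my run. It only shows that the same inputs can produce very different outputs in the two modes.

```python
import random
import statistics


def batchnorm(values, mean, var, eps=1e-3):
    """Normalize values with the given mean and variance (gamma=1, beta=0)."""
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


random.seed(0)

# Hypothetical activations produced by the new dataset (mean about 5, std about 2).
batch = [random.gauss(5.0, 2.0) for _ in range(256)]

# Hypothetical moving statistics stored in a pretrained BN layer (assumed values).
moving_mean, moving_var = 0.0, 1.0

# Training mode: normalize with the statistics of the current mini-batch.
train_out = batchnorm(batch, statistics.mean(batch), statistics.pvariance(batch))

# Inference mode: normalize the SAME samples with the stored moving statistics.
infer_out = batchnorm(batch, moving_mean, moving_var)

print("training-mode output mean: ", round(statistics.mean(train_out), 3))
print("inference-mode output mean:", round(statistics.mean(infer_out), 3))
```

The layers after the BN layer see roughly zero-centered inputs in one mode and strongly shifted inputs in the other, so a head trained on the first can do poorly on the second even when the samples are the same.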

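I then repeated the experiment with VGG-19 in place of InceptionV3. VGG-19 and other models without BN layers did not show the problem. So I suspected the BN layers were the culprit, and after digging through references I found that the problem was in the model-building code. Here is the faulty model-building code first (this is only my own view; if I have it wrong, I hope someone more expert will point it out). The code below is the example given in the official Keras documentation, and the results above were produced with this model-building structure (same structure, slightly different details).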

[code]from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K

# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)

# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)

# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)

# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
    layer.trainable = False

# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# train the model on the new data for a few epochs
model.fit_generator(...)
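
To check which BN layers the freeze loop above marked as non-trainable, a small helper along these lines may be handy. `frozen_batchnorm_names` is a hypothetical helper, not part of Keras. It relies only on each layer's `name` and `trainable` attributes and on its class name, so the stub classes below stand in for real Keras layers and the sketch runs without Keras installed.

```python
def frozen_batchnorm_names(layers):
    """Return the names of BatchNormalization layers whose trainable flag is False."""
    return [
        layer.name
        for layer in layers
        if type(layer).__name__ == "BatchNormalization" and not layer.trainable
    ]


# Stand-in classes (assumption: real Keras layers expose the same
# `name` and `trainable` attributes).
class _StubLayer:
    def __init__(self, name, trainable):
        self.name = name
        self.trainable = trainable


class BatchNormalization(_StubLayer):
    pass


class Conv2D(_StubLayer):
    pass


demo_layers = [
    Conv2D("conv2d_28", False),
    BatchNormalization("batch_normalization_28", False),
    BatchNormalization("batch_normalization_29", True),
]
print(frozen_batchnorm_names(demo_layers))
```

With the real model this would be called as `frozen_batchnorm_names(base_model.layers)`.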

Run model.summary() to look at the model structure:

[code]activation_20 (Activation)      (None, None, None, 6 0           batch_normalization_20[0][0]
__________________________________________________________________________________________________
activation_22 (Activation)      (None, None, None, 6 0           batch_normalization_22[0][0]
__________________________________________________________________________________________________
activation_25 (Activation)      (None, None, None, 9 0           batch_normalization_25[0][0]
__________________________________________________________________________________________________
activation_26 (Activation)      (None, None, None, 6 0           batch_normalization_26[0][0]
__________________________________________________________________________________________________
mixed2 (Concatenate)            (None, None, None, 2 0           activation_20[0][0]
                                                                 activation_22[0][0]
                                                                 activation_25[0][0]
                                                                 activation_26[0][0]
__________________________________________________________________________________________________
conv2d_28 (Conv2D)              (None, None, None, 6 18432       mixed2[0][0]
__________________________________________________________________________________________________
batch_normalization_28 (BatchNo (None, None, None, 6 192         conv2d_28[0][0]
__________________________________________________________________________________________________
activation_28 (Activation)      (None, None, None, 6 0           batch_normalization_28[0][0]
__________________________________________________________________________________________________
conv2d_29 (Conv2D)              (None, None, None, 9 55296       activation_28[0][0]
__________________________________________________________________________________________________
batch_normalization_29 (BatchNo (None, None, None, 9 288         conv2d_29[0][0]
__________________________________________________________________________________________________
activation_29 (Activation)      (None, None, None, 9 0           batch_normalization_29[0][0]
__________________________________________________________________________________________________
conv2d_27 (Conv2D)              (None, None, None, 3 995328      mixed2[0][0]
__________________________________________________________________________________________________
conv2d_30 (Conv2D)              (None, None, None, 9 82944       activation_29[0][0]
__________________________________________________________________________________________________
batch_normalization_27 (BatchNo (None, None, None, 3 1152        conv2d_27[0][0]
__________________________________________________________________________________________________
batch_normalization_30 (BatchNo (None, None, None, 9 288         conv2d_30[0][0]
__________________________________________________________________________________________________
activation_27 (Activation)      (None, None, None, 3 0           batch_normalization_27[0][0]
__________________________________________________________________________________________________
activation_30 (Activation)      (None, None, None, 9 0           batch_normalization_30[0][0]
__________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D)  (None, None, None, 2 0           mixed2[0][0]
__________________________________________________________________________________________________
mixed3 (Concatenate)            (None, None, None, 7 0           activation_27[0][0]
                                                                 activation_30[0][0]
                                                                 max_pooling2d_3[0][0]
__________________________________________________________________________________________________
conv2d_35 (Conv2D)              (None, None, None, 1 98304       mixed3[0][0]
__________________________________________________________________________________________________
batch_normalization_35 (BatchNo (None, None, None, 1 384         conv2d_35[0][0]
__________________________________________________________________________________________________
activation_35 (Activation)      (None, None, None, 1 0           batch_normalization_35[0][0]
__________________________________________________________________________________________________
conv2d_36 (Conv2D)              (None, None, None, 1 114688      activation_35[0][0]
__________________________________________________________________________________________________
batch_normalization_36 (BatchNo (None, None, None, 1 384         conv2d_36[0][0]
__________________________________________________________________________________________________
activation_36 (Activation)      (None, None, None, 1 0           batch_normalization_36[0][0]
__________________________________________________________________________________________________
conv2d_32 (Conv2D)              (None, None, None, 1 98304       mixed3[0][0]
__________________________________________________________________________________________________
conv2d_37 (Conv2D)              (None, None, None, 1 114688      activation_36[0][0]
__________________________________________________________________________________________________
batch_normalization_32 (BatchNo (None, None, None, 1 384         conv2d_32[0][0]
__________________________________________________________________________________________________
batch_normalization_37 (BatchNo (None, None, None, 1 384         conv2d_37[0][0]
__________________________________________________________________________________________________
activation_32 (Activation)      (None, None, None, 1 0           batch_normalization_32[0][0]
__________________________________________________________________________________________________
activation_37 (Activation)      (None, None, None, 1 0           batch_normalization_37[0][0]
__________________________________________________________________________________________________
conv2d_33 (Conv2D)              (None, None, None, 1 114688      activation_32[0][0]
__________________________________________________________________________________________________
conv2d_38 (Conv2D)              (None, None, None, 1 114688      activation_37[0][0]
__________________________________________________________________________________________________
batch_normalization_33 (BatchNo (None, None, None, 1 384         conv2d_33[0][0]
__________________________________________________________________________________________________
batch_normalization_38 (BatchNo (None, None, None, 1 384         conv2d_38[0][0]
__________________________________________________________________________________________________
activation_33 (Activation)      (None, None, None, 1 0           batch_normalization_33[0][0]
__________________________________________________________________________________________________
activation_38 (Activation)      (None, None, None, 1 0           batch_normalization_38[0][0]
__________________________________________________________________________________________________
average_pooling2d_4 (AveragePoo (None, None, None, 7 0           mixed3[0][0]
__________________________________________________________________________________________________
conv2d_31 (Conv2D)              (None, None, None, 1 147456      mixed3[0][0]
__________________________________________________________________________________________________
conv2d_34 (Conv2D)              (None, None, None, 1 172032      activation_33[0][0]
__________________________________________________________________________________________________
conv2d_39 (Conv2D)              (None, None, None, 1 172032      activation_38[0][0]
__________________________________________________________________________________________________
conv2d_40 (Conv2D)              (None, None, None, 1 147456      average_pooling2d_4[0][0]
__________________________________________________________________________________________________
batch_normalization_31 (BatchNo (None, None, None, 1 576         conv2d_31[0][0]
__________________________________________________________________________________________________
batch_normalization_34 (BatchNo (None, None, None, 1 576         conv2d_34[0][0]
__________________________________________________________________________________________________
batch_normalization_39 (BatchNo (None, None, None, 1 576         conv2d_39[0][0]
__________________________________________________________________________________________________
batch_normalization_40 (BatchNo (None, None, None, 1 576         conv2d_40[0][0]
__________________________________________________________________________________________________
activation_31 (Activation)      (None, None, None, 1 0           batch_normalization_31[0][0]
__________________________________________________________________________________________________
activation_34 (Activation)      (None, None, None, 1 0           batch_normalization_34[0][0]
__________________________________________________________________________________________________
activation_39 (Activation)      (None, None, None, 1 0           batch_normalization_39[0][0]
__________________________________________________________________________________________________
activation_40 (Activation)      (None, None, None, 1 0           batch_normalization_40[0][0]
__________________________________________________________________________________________________
mixed4 (Concatenate)            (None, None, None, 7 0           activation_31[0][0]
                                                                 activation_34[0][0]
                                                                 activation_39[0][0]
                                                                 activation_40[0][0]
__________________________________________________________________________________________________
conv2d_45 (Conv2D)              (None, None, None, 1 122880      mixed4[0][0]
__________________________________________________________________________________________________
batch_normalization_45 (BatchNo (None, None, None, 1 480         conv2d_45[0][0]
__________________________________________________________________________________________________
activation_45 (Activation)      (None, None, None, 1 0           batch_normalization_45[0][0]
__________________________________________________________________________________________________
conv2d_46 (Conv2D)              (None, None, None, 1 179200      activation_45[0][0]
__________________________________________________________________________________________________
batch_normalization_46 (BatchNo (None, None, None, 1 480         conv2d_46[0][0]
__________________________________________________________________________________________________
activation_46 (Activation)      (None, None, None, 1 0           batch_normalization_46[0][0]
__________________________________________________________________________________________________
conv2d_42 (Conv2D)              (None, None, None, 1 122880      mixed4[0][0]
__________________________________________________________________________________________________
conv2d_47 (Conv2D)              (None, None, None, 1 179200      activation_46[0][0]
__________________________________________________________________________________________________
batch_normalization_42 (BatchNo (None, None, None, 1 480         conv2d_42[0][0]
__________________________________________________________________________________________________
batch_normalization_47 (BatchNo (None, None, None, 1 480         conv2d_47[0][0]
__________________________________________________________________________________________________
activation_42 (Activation)      (None, None, None, 1 0           batch_normalization_42[0][0]
__________________________________________________________________________________________________
activation_47 (Activation)      (None, None, None, 1 0           batch_normalization_47[0][0]
__________________________________________________________________________________________________
conv2d_43 (Conv2D)              (None, None, None, 1 179200      activation_42[0][0]
__________________________________________________________________________________________________
conv2d_48 (Conv2D)              (None, None, None, 1 179200      activation_47[0][0]
__________________________________________________________________________________________________
batch_normalization_43 (BatchNo (None, None, None, 1 480         conv2d_43[0][0]
__________________________________________________________________________________________________
batch_normalization_48 (BatchNo (None, None, None, 1 480         conv2d_48[0][0]
__________________________________________________________________________________________________
activation_43 (Activation)      (None, None, None, 1 0           batch_normalization_43[0][0]
__________________________________________________________________________________________________
activation_48 (Activation)      (None, None, None, 1 0           batch_normalization_48[0][0]
__________________________________________________________________________________________________
average_pooling2d_5 (AveragePoo (None, None, None, 7 0           mixed4[0][0]
__________________________________________________________________________________________________
conv2d_41 (Conv2D)              (None, None, None, 1 147456      mixed4[0][0]
__________________________________________________________________________________________________
conv2d_44 (Conv2D)              (None, None, None, 1 215040      activation_43[0][0]
__________________________________________________________________________________________________
conv2d_49 (Conv2D)              (None, None, None, 1 215040      activation_48[0][0]
__________________________________________________________________________________________________
conv2d_50 (Conv2D)              (None, None, None, 1 147456      average_pooling2d_5[0][0]
__________________________________________________________________________________________________
batch_normalization_41 (BatchNo (None, None, None, 1 576         conv2d_41[0][0]
__________________________________________________________________________________________________
batch_normalization_44 (BatchNo (None, None, None, 1 576         conv2d_44[0][0]
__________________________________________________________________________________________________
batch_normalization_49 (BatchNo (None, None, None, 1 576         conv2d_49[0][0]
__________________________________________________________________________________________________
batch_normalization_50 (BatchNo (None, None, None, 1 576         conv2d_50[0][0]
__________________________________________________________________________________________________
activation_41 (Activation)      (None, None, None, 1 0           batch_normalization_41[0][0]
__________________________________________________________________________________________________
activation_44 (Activation)      (None, None, None, 1 0           batch_normalization_44[0][0]
__________________________________________________________________________________________________
activation_49 (Activation)      (None, None, None, 1 0           batch_normalization_49[0][0]
__________________________________________________________________________________________________
activation_50 (Activation)      (None, None, None, 1 0           batch_normalization_50[0][0]
__________________________________________________________________________________________________
mixed5 (Concatenate)            (None, None, None, 7 0           activation_41[0][0]
                                                                 activation_44[0][0]
                                                                 activation_49[0][0]
                                                                 activation_50[0][0]
__________________________________________________________________________________________________
conv2d_55 (Conv2D)              (None, None, None, 1 122880      mixed5[0][0]
__________________________________________________________________________________________________
batch_normalization_55 (BatchNo (None, None, None, 1 480         conv2d_55[0][0]
__________________________________________________________________________________________________
activation_55 (Activation)      (None, None, None, 1 0           batch_normalization_55[0][0]
__________________________________________________________________________________________________
conv2d_56 (Conv2D)              (None, None, None, 1 179200      activation_55[0][0]
__________________________________________________________________________________________________
batch_normalization_56 (BatchNo (None, None, None, 1 480         conv2d_56[0][0]
__________________________________________________________________________________________________
activation_56 (Activation)      (None, None, None, 1 0           batch_normalization_56[0][0]
__________________________________________________________________________________________________
conv2d_52 (Conv2D)              (None, None, None, 1 122880      mixed5[0][0]
__________________________________________________________________________________________________
conv2d_57 (Conv2D)              (None, None, None, 1 179200      activation_56[0][0]
__________________________________________________________________________________________________
batch_normalization_52 (BatchNo (None, None, None, 1 480         conv2d_52[0][0]
__________________________________________________________________________________________________
batch_normalization_57 (BatchNo (None, None, None, 1 480         conv2d_57[0][0]
__________________________________________________________________________________________________
activation_52 (Activation)      (None, None, None, 1 0           batch_normalization_52[0][0]
__________________________________________________________________________________________________
activation_57 (Activation)      (None, None, None, 1 0           batch_normalization_57[0][0]
__________________________________________________________________________________________________
conv2d_53 (Conv2D)              (None, None, None, 1 179200      activation_52[0][0]
__________________________________________________________________________________________________
conv2d_58 (Conv2D)              (None, None, None, 1 179200      activation_57[0][0]
__________________________________________________________________________________________________
batch_normalization_53 (BatchNo (None, None, None, 1 480         conv2d_53[0][0]
__________________________________________________________________________________________________
batch_normalization_58 (BatchNo (None, None, None, 1 480         conv2d_58[0][0]
__________________________________________________________________________________________________
activation_53 (Activation)      (None, None, None, 1 0           batch_normalization_53[0][0]
__________________________________________________________________________________________________
activation_58 (Activation)      (None, None, None, 1 0           batch_normalization_58[0][0]
__________________________________________________________________________________________________
average_pooling2d_6 (AveragePoo (None, None, None, 7 0           mixed5[0][0]
__________________________________________________________________________________________________
conv2d_51 (Conv2D)              (None, None, None, 1 147456      mixed5[0][0]
__________________________________________________________________________________________________
conv2d_54 (Conv2D)              (None, None, None, 1 215040      activation_53[0][0]
__________________________________________________________________________________________________
conv2d_59 (Conv2D)              (None, None, None, 1 215040      activation_58[0][0]
__________________________________________________________________________________________________
conv2d_60 (Conv2D)              (None, None, None, 1 147456      average_pooling2d_6[0][0]
__________________________________________________________________________________________________
batch_normalization_51 (BatchNo (None, None, None, 1 576         conv2d_51[0][0]
__________________________________________________________________________________________________
batch_normalization_54 (BatchNo (None, None, None, 1 576         conv2d_54[0][0]
__________________________________________________________________________________________________
batch_normalization_59 (BatchNo (None, None, None, 1 576         conv2d_59[0][0]
__________________________________________________________________________________________________
batch_normalization_60 (BatchNo (None, None, None, 1 576         conv2d_60[0][0]
__________________________________________________________________________________________________
activation_51 (Activation)      (None, None, None, 1 0           batch_normalization_51[0][0]
__________________________________________________________________________________________________
activation_54 (Activation)      (None, None, None, 1 0           batch_normalization_54[0][0]
__________________________________________________________________________________________________
activation_59 (Activation)      (None, None, None, 1 0           batch_normalization_59[0][0]
__________________________________________________________________________________________________
activation_60 (Activation)      (None, None, None, 1 0           batch_normalization_60[0][0]
__________________________________________________________________________________________________
mixed6 (Concatenate)            (None, None, None, 7 0           activation_51[0][0]
                                                                 activation_54[0][0]
                                                                 activation_59[0][0]
                                                                 activation_60[0][0]
__________________________________________________________________________________________________
conv2d_65 (Conv2D)              (None, None, None, 1 147456      mixed6[0][0]
__________________________________________________________________________________________________
batch_normalization_65 (BatchNo (None, None, None, 1 576         conv2d_65[0][0]
__________________________________________________________________________________________________
activation_65 (Activation)      (None, None, None, 1 0           batch_normalization_65[0][0]
__________________________________________________________________________________________________
conv2d_66 (Conv2D)              (None, None, None, 1 258048      activation_65[0][0]
__________________________________________________________________________________________________
batch_normalization_66 (BatchNo (None, None, None, 1 576         conv2d_66[0][0]
__________________________________________________________________________________________________
activation_66 (Activation)      (None, None, None, 1 0           batch_normalization_66[0][0]
__________________________________________________________________________________________________
conv2d_62 (Conv2D)              (None, None, None, 1 147456      mixed6[0][0]
__________________________________________________________________________________________________
conv2d_67 (Conv2D)              (None, None, None, 1 258048      activation_66[0][0]
__________________________________________________________________________________________________
batch_normalization_62 (BatchNo (None, None, None, 1 576         conv2d_62[0][0]
__________________________________________________________________________________________________
batch_normalization_67 (BatchNo (None, None, None, 1 576         conv2d_67[0][0]
__________________________________________________________________________________________________
activation_62 (Activation)      (None, None, None, 1 0           batch_normalization_62[0][0]
__________________________________________________________________________________________________
activation_67 (Activation)      (None, None, None, 1 0           batch_normalization_67[0][0]
__________________________________________________________________________________________________
conv2d_63 (Conv2D)              (None, None, None, 1 258048      activation_62[0][0]
__________________________________________________________________________________________________
conv2d_68 (Conv2D)              (None, None, None, 1 258048      activation_67[0][0]
__________________________________________________________________________________________________
batch_normalization_63 (BatchNo (None, None, None, 1 576         conv2d_63[0][0]
__________________________________________________________________________________________________
batch_normalization_68 (BatchNo (None, None, None, 1 576         conv2d_68[0][0]
__________________________________________________________________________________________________
activation_63 (Activation)      (None, None, None, 1 0           batch_normalization_63[0][0]
__________________________________________________________________________________________________
activation_68 (Activation)      (None, None, None, 1 0           batch_normalization_68[0][0]
__________________________________________________________________________________________________
average_pooling2d_7 (AveragePoo (None, None, None, 7 0           mixed6[0][0]
__________________________________________________________________________________________________
conv2d_61 (Conv2D)              (None, None, None, 1 147456      mixed6[0][0]
__________________________________________________________________________________________________
conv2d_64 (Conv2D)              (None, None, None, 1 258048      activation_63[0][0]
__________________________________________________________________________________________________
conv2d_69 (Conv2D)              (None, None, None, 1 258048      activation_68[0][0]
__________________________________________________________________________________________________
conv2d_70 (Conv2D)              (None, None, None, 1 147456      average_pooling2d_7[0][0]
__________________________________________________________________________________________________
batch_normalization_61 (BatchNo (None, None, None, 1 576         conv2d_61[0][0]
__________________________________________________________________________________________________
batch_normalization_64 (BatchNo (None, None, None, 1 576         conv2d_64[0][0]
__________________________________________________________________________________________________
batch_normalization_69 (BatchNo (None, None, None, 1 576         conv2d_69[0][0]
__________________________________________________________________________________________________
batch_normalization_70 (BatchNo (None, None, None, 1 576         conv2d_70[0][0]
__________________________________________________________________________________________________
activation_61 (Activation)      (None, None, None, 1 0           batch_normalization_61[0][0]
__________________________________________________________________________________________________
activation_64 (Activation)      (None, None, None, 1 0           batch_normalization_64[0][0]
__________________________________________________________________________________________________
activation_69 (Activation)      (None, None, None, 1 0           batch_normalization_69[0][0]
__________________________________________________________________________________________________
activation_70 (Activation)      (None, None, None, 1 0           batch_normalization_70[0][0]
__________________________________________________________________________________________________
mixed7 (Concatenate)            (None, None, None, 7 0           activation_61[0][0]
                                                                 activation_64[0][0]
                                                                 activation_69[0][0]
                                                                 activation_70[0][0]
__________________________________________________________________________________________________
conv2d_73 (Conv2D)              (None, None, None, 1 147456      mixed7[0][0]
__________________________________________________________________________________________________
batch_normalization_73 (BatchNo (None, None, None, 1 576         conv2d_73[0][0]
__________________________________________________________________________________________________
activation_73 (Activation)      (None, None, None, 1 0           batch_normalization_73[0][0]
__________________________________________________________________________________________________
conv2d_74 (Conv2D)              (None, None, None, 1 258048      activation_73[0][0]
__________________________________________________________________________________________________
batch_normalization_74 (BatchNo (None, None, None, 1 576         conv2d_74[0][0]
__________________________________________________________________________________________________
activation_74 (Activation)      (None, None, None, 1 0           batch_normalization_74[0][0]
__________________________________________________________________________________________________
conv2d_71 (Conv2D)              (None, None, None, 1 147456      mixed7[0][0]
__________________________________________________________________________________________________
conv2d_75 (Conv2D)              (None, None, None, 1 258048      activation_74[0][0]
__________________________________________________________________________________________________
batch_normalization_71 (BatchNo (None, None, None, 1 576         conv2d_71[0][0]
__________________________________________________________________________________________________
batch_normalization_75 (BatchNo (None, None, None, 1 576         conv2d_75[0][0]
__________________________________________________________________________________________________
activation_71 (Activation)      (None, None, None, 1 0           batch_normalization_71[0][0]
__________________________________________________________________________________________________
activation_75 (Activation)      (None, None, None, 1 0           batch_normalization_75[0][0]
__________________________________________________________________________________________________
conv2d_72 (Conv2D)              (None, None, None, 3 552960      activation_71[0][0]
__________________________________________________________________________________________________
conv2d_76 (Conv2D)              (None, None, None, 1 331776      activation_75[0][0]
__________________________________________________________________________________________________
batch_normalization_72 (BatchNo (None, None, None, 3 960         conv2d_72[0][0]
__________________________________________________________________________________________________
batch_normalization_76 (BatchNo (None, None, None, 1 576         conv2d_76[0][0]
__________________________________________________________________________________________________
activation_72 (Activation)      (None, None, None, 3 0           batch_normalization_72[0][0]
__________________________________________________________________________________________________
activation_76 (Activation)      (None, None, None, 1 0           batch_normalization_76[0][0]
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D)  (None, None, None, 7 0           mixed7[0][0]
__________________________________________________________________________________________________
mixed8 (Concatenate)            (None, None, None, 1 0           activation_72[0][0]
                                                                 activation_76[0][0]
                                                                 max_pooling2d_4[0][0]
__________________________________________________________________________________________________
conv2d_81 (Conv2D)              (None, None, None, 4 573440      mixed8[0][0]
__________________________________________________________________________________________________
batch_normalization_81 (BatchNo (None, None, None, 4 1344        conv2d_81[0][0]
__________________________________________________________________________________________________
activation_81 (Activation)      (None, None, None, 4 0           batch_normalization_81[0][0]
__________________________________________________________________________________________________
conv2d_78 (Conv2D)              (None, None, None, 3 491520      mixed8[0][0]
__________________________________________________________________________________________________
conv2d_82 (Conv2D)              (None, None, None, 3 1548288     activation_81[0][0]
__________________________________________________________________________________________________
batch_normalization_78 (BatchNo (None, None, None, 3 1152        conv2d_78[0][0]
__________________________________________________________________________________________________
batch_normalization_82 (BatchNo (None, None, None, 3 1152        conv2d_82[0][0]
__________________________________________________________________________________________________
activation_78 (Activation)      (None, None, None, 3 0           batch_normalization_78[0][0]
__________________________________________________________________________________________________
activation_82 (Activation)      (None, None, None, 3 0           batch_normalization_82[0][0]
__________________________________________________________________________________________________
conv2d_79 (Conv2D)              (None, None, None, 3 442368      activation_78[0][0]
__________________________________________________________________________________________________
conv2d_80 (Conv2D)              (None, None, None, 3 442368      activation_78[0][0]
__________________________________________________________________________________________________
conv2d_83 (Conv2D)              (None, None, None, 3 442368      activation_82[0][0]
__________________________________________________________________________________________________
conv2d_84 (Conv2D)              (None, None, None, 3 442368      activation_82[0][0]
__________________________________________________________________________________________________
average_pooling2d_8 (AveragePoo (None, None, None, 1 0           mixed8[0][0]
__________________________________________________________________________________________________
conv2d_77 (Conv2D)              (None, None, None, 3 409600      mixed8[0][0]
__________________________________________________________________________________________________
batch_normalization_79 (BatchNo (None, None, None, 3 1152        conv2d_79[0][0]
__________________________________________________________________________________________________
batch_normalization_80 (BatchNo (None, None, None, 3 1152        conv2d_80[0][0]
__________________________________________________________________________________________________
batch_normalization_83 (BatchNo (None, None, None, 3 1152        conv2d_83[0][0]
__________________________________________________________________________________________________
batch_normalization_84 (BatchNo (None, None, None, 3 1152        conv2d_84[0][0]
__________________________________________________________________________________________________
conv2d_85 (Conv2D)              (None, None, None, 1 245760      average_pooling2d_8[0][0]
__________________________________________________________________________________________________
batch_normalization_77 (BatchNo (None, None, None, 3 960         conv2d_77[0][0]
__________________________________________________________________________________________________
activation_79 (Activation)      (None, None, None, 3 0           batch_normalization_79[0][0]
__________________________________________________________________________________________________
activation_80 (Activation)      (None, None, None, 3 0           batch_normalization_80[0][0]
__________________________________________________________________________________________________
activation_83 (Activation)      (None, None, None, 3 0           batch_normalization_83[0][0]
__________________________________________________________________________________________________
activation_84 (Activation)      (None, None, None, 3 0           batch_normalization_84[0][0]
__________________________________________________________________________________________________
batch_normalization_85 (BatchNo (None, None, None, 1 576         conv2d_85[0][0]
__________________________________________________________________________________________________
activation_77 (Activation)      (None, None, None, 3 0           batch_normalization_77[0][0]
__________________________________________________________________________________________________
mixed9_0 (Concatenate)          (None, None, None, 7 0           activation_79[0][0]
                                                                 activation_80[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, None, None, 7 0           activation_83[0][0]
                                                                 activation_84[0][0]
__________________________________________________________________________________________________
activation_85 (Activation)      (None, None, None, 1 0           batch_normalization_85[0][0]
__________________________________________________________________________________________________
mixed9 (Concatenate)            (None, None, None, 2 0           activation_77[0][0]
                                                                 mixed9_0[0][0]
                                                                 concatenate_1[0][0]
                                                                 activation_85[0][0]
__________________________________________________________________________________________________
conv2d_90 (Conv2D)              (None, None, None, 4 917504      mixed9[0][0]
__________________________________________________________________________________________________
batch_normalization_90 (BatchNo (None, None, None, 4 1344        conv2d_90[0][0]
__________________________________________________________________________________________________
activation_90 (Activation)      (None, None, None, 4 0           batch_normalization_90[0][0]
__________________________________________________________________________________________________
conv2d_87 (Conv2D)              (None, None, None, 3 786432      mixed9[0][0]
__________________________________________________________________________________________________
conv2d_91 (Conv2D)              (None, None, None, 3 1548288     activation_90[0][0]
__________________________________________________________________________________________________
batch_normalization_87 (BatchNo (None, None, None, 3 1152        conv2d_87[0][0]
__________________________________________________________________________________________________
batch_normalization_91 (BatchNo (None, None, None, 3 1152        conv2d_91[0][0]
__________________________________________________________________________________________________
activation_87 (Activation)      (None, None, None, 3 0           batch_normalization_87[0][0]
__________________________________________________________________________________________________
activation_91 (Activation)      (None, None, None, 3 0           batch_normalization_91[0][0]
__________________________________________________________________________________________________
conv2d_88 (Conv2D)              (None, None, None, 3 442368      activation_87[0][0]
__________________________________________________________________________________________________
conv2d_89 (Conv2D)              (None, None, None, 3 442368      activation_87[0][0]
__________________________________________________________________________________________________
conv2d_92 (Conv2D)              (None, None, None, 3 442368      activation_91[0][0]
__________________________________________________________________________________________________
conv2d_93 (Conv2D)              (None, None, None, 3 442368      activation_91[0][0]
__________________________________________________________________________________________________
average_pooling2d_9 (AveragePoo (None, None, None, 2 0           mixed9[0][0]
__________________________________________________________________________________________________
conv2d_86 (Conv2D)              (None, None, None, 3 655360      mixed9[0][0]
__________________________________________________________________________________________________
batch_normalization_88 (BatchNo (None, None, None, 3 1152        conv2d_88[0][0]
__________________________________________________________________________________________________
batch_normalization_89 (BatchNo (None, None, None, 3 1152        conv2d_89[0][0]
__________________________________________________________________________________________________
batch_normalization_92 (BatchNo (None, None, None, 3 1152        conv2d_92[0][0]
__________________________________________________________________________________________________
batch_normalization_93 (BatchNo (None, None, None, 3 1152        conv2d_93[0][0]
__________________________________________________________________________________________________
conv2d_94 (Conv2D)              (None, None, None, 1 393216      average_pooling2d_9[0][0]
__________________________________________________________________________________________________
batch_normalization_86 (BatchNo (None, None, None, 3 960         conv2d_86[0][0]
__________________________________________________________________________________________________
activation_88 (Activation)      (None, None, None, 3 0           batch_normalization_88[0][0]
__________________________________________________________________________________________________
activation_89 (Activation)      (None, None, None, 3 0           batch_normalization_89[0][0]
__________________________________________________________________________________________________
activation_92 (Activation)      (None, None, None, 3 0           batch_normalization_92[0][0]
__________________________________________________________________________________________________
activation_93 (Activation)      (None, None, None, 3 0           batch_normalization_93[0][0]
__________________________________________________________________________________________________
batch_normalization_94 (BatchNo (None, None, None, 1 576         conv2d_94[0][0]
__________________________________________________________________________________________________
activation_86 (Activation)      (None, None, None, 3 0           batch_normalization_86[0][0]
__________________________________________________________________________________________________
mixed9_1 (Concatenate)          (None, None, None, 7 0           activation_88[0][0]
                                                                 activation_89[0][0]
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, None, None, 7 0           activation_92[0][0]
                                                                 activation_93[0][0]
__________________________________________________________________________________________________
activation_94 (Activation)      (None, None, None, 1 0           batch_normalization_94[0][0]
__________________________________________________________________________________________________
mixed10 (Concatenate)           (None, None, None, 2 0           activation_86[0][0]
                                                                 mixed9_1[0][0]
                                                                 concatenate_2[0][0]
                                                                 activation_94[0][0]
__________________________________________________________________________________________________
global_average_pooling2d_1 (Glo (None, 2048)         0           mixed10[0][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 1024)         2098176     global_average_pooling2d_1[0][0]
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 200)          205000      dense_1[0][0]
==================================================================================================
Total params: 24,105,960
Trainable params: 2,303,176
Non-trainable params: 21,802,784
__________________________________________________________________________________________________

If your transfer-learning model's structure looks like this, something is wrong.

Changing the code above to the following fixes it:

[code]from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D, Input
from keras import backend as K

# create the base pre-trained model
Inp = Input((224, 224, 3))
base_model = InceptionV3(weights='imagenet', include_top=False, input_shape=(224,224,3))
x = base_model(Inp)
# add a global spatial average pooling layer
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)

# this is the model we will train
model = Model(inputs=Inp, outputs=predictions)

# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
    layer.trainable = False

# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# train the model on the new data for a few epochs
model.fit_generator(...)

Run model.summary() and look at the model structure again:

[code]_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_2 (InputLayer)         (None, 224, 224, 3)       0
_________________________________________________________________
inception_v3 (Model)         (None, 5, 5, 2048)        21802784
_________________________________________________________________
global_average_pooling2d_2 ( (None, 2048)              0
_________________________________________________________________
dense_3 (Dense)              (None, 1024)              2098176
_________________________________________________________________
dense_4 (Dense)              (None, 200)               205000
=================================================================
Total params: 24,105,960
Trainable params: 2,303,176
Non-trainable params: 21,802,784
_________________________________________________________________
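
As a quick sanity check, the parameter counts in this summary can be reproduced by hand. The sketch below is only arithmetic on the numbers printed above; the helper name `dense_params` is made up for illustration and is not part of Keras.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

base_params = 21_802_784            # the inception_v3 row in the summary (all frozen)
dense_3 = dense_params(2048, 1024)  # pooled features -> 1024 units
dense_4 = dense_params(1024, 200)   # 1024 units -> 200 classes

trainable = dense_3 + dense_4       # only the new top layers are trained
total = base_params + trainable

print(dense_3, dense_4)   # 2098176 205000
print(trainable, total)   # 2303176 24105960
```

The trainable count matches the two Dense layers exactly, which confirms that nothing inside the wrapped InceptionV3 model is being trained.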

Now look at the correct results:

[code]Epoch 1/20
1500/1500 [==============================] - 27s 18ms/step - loss: 2.4664 - binary_accuracy: 0.9125 - f1_keras: 0.0521 - val_loss: 1.4697 - val_binary_accuracy: 0.9456 - val_f1_keras: 0.0619
Epoch 2/20
1500/1500 [==============================] - 19s 13ms/step - loss: 1.2806 - binary_accuracy: 0.9467 - f1_keras: 0.0795 - val_loss: 1.2819 - val_binary_accuracy: 0.9466 - val_f1_keras: 0.0839
Epoch 3/20
1500/1500 [==============================] - 19s 13ms/step - loss: 1.0431 - binary_accuracy: 0.9526 - f1_keras: 0.1203 - val_loss: 1.3012 - val_binary_accuracy: 0.9468 - val_f1_keras: 0.0908
Epoch 4/20
1500/1500 [==============================] - 19s 13ms/step - loss: 0.9168 - binary_accuracy: 0.9555 - f1_keras: 0.1493 - val_loss: 1.3257 - val_binary_accuracy: 0.9445 - val_f1_keras: 0.0922
Epoch 5/20
1500/1500 [==============================] - 19s 13ms/step - loss: 0.8281 - binary_accuracy: 0.9577 - f1_keras: 0.1959 - val_loss: 1.3123 - val_binary_accuracy: 0.9468 - val_f1_keras: 0.0969

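As you can see, the validation accuracy is now normal. Careful readers will notice that the validation F1 score still lags behind the training F1 score. That is because I trained on only 1,500 samples to test the model, so some overfitting is to be expected.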
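
For intuition about how a BN layer can score the very same samples differently in training and in validation, here is a minimal NumPy sketch. It assumes the explanation given in the pull-request discussion referenced at the end of this post: during training the layer normalizes with the current mini-batch's statistics, while during validation it uses its stored moving averages. The data and the moving statistics below are invented for illustration, and the learned scale and shift (gamma, beta) are left out.

```python
import numpy as np

def batch_norm(x, mean, var, eps=1e-3):
    """Normalize x with the given per-feature statistics (gamma=1, beta=0)."""
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.RandomState(0)
# Hypothetical activations on the new dataset: mean about 5, std about 2.
x = rng.normal(loc=5.0, scale=2.0, size=(256, 4))

# Hypothetical moving statistics stored in the pretrained layer.
moving_mean = np.zeros(4)
moving_var = np.ones(4)

# Training mode: statistics come from the batch itself.
train_out = batch_norm(x, x.mean(axis=0), x.var(axis=0))
# Inference mode: statistics come from the stored moving averages.
infer_out = batch_norm(x, moving_mean, moving_var)

print(train_out.mean(), train_out.std())   # close to 0 and 1
print(infer_out.mean(), infer_out.std())   # close to 5 and 2
```

The layers after BN see very different inputs in the two modes, so a top classifier fitted on one can do poorly on the other even when the samples are identical.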

If you want to unfreeze the last N layers of base_model, first run the code below to see how many layers there are and what they are:

[code]for i, layer in enumerate(base_model.layers):
    print(i, layer.name)

Then unfreeze the last N layers as needed:

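[code]# N = number of trailing base_model layers to unfreeze (pick it from the listing above)
for layer in base_model.layers[:-N]:
    layer.trainable = False
for layer in base_model.layers[-N:]:
    layer.trainable = True
# recompile the model after changing `trainable` so the change takes effect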
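
One thing to watch with the `[:-N]` / `[-N:]` slicing: with N = 0, `layers[:-0]` is an empty list and `layers[-0:]` is the whole list, so every layer ends up trainable instead of frozen. The sketch below shows one way around that by computing the split index explicitly. `FakeLayer` and `unfreeze_last` are hypothetical names used only for this illustration; `FakeLayer` stands in for a Keras layer and carries nothing but the `trainable` flag.

```python
class FakeLayer:
    """Hypothetical stand-in for a Keras layer; only the trainable flag matters."""
    def __init__(self, name):
        self.name = name
        self.trainable = True

def unfreeze_last(layers, n):
    """Freeze every layer except the last n; n = 0 freezes all of them."""
    split = max(len(layers) - n, 0)   # clamp so n > len(layers) unfreezes everything
    for layer in layers[:split]:
        layer.trainable = False
    for layer in layers[split:]:
        layer.trainable = True

layers = [FakeLayer("layer_%d" % i) for i in range(5)]

unfreeze_last(layers, 2)
print([l.trainable for l in layers])  # [False, False, False, True, True]

unfreeze_last(layers, 0)
print([l.trainable for l in layers])  # [False, False, False, False, False]
```

The same function should work on `base_model.layers`, since it only reads and sets the `trainable` attribute.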

If this solved your problem, leave a like before you go 👍

Reference: https://github.com/keras-team/keras/pull/9965#discussion_r187806860
