
Loan Risk Assessment with a Neural Network (Based on Keras and Python)

2017-08-18 14:35
As my son would put it: one day, the little turtle met the little rabbit...

One day I came across an article online that uses a decision tree for loan decision analysis:

"To Lend or Not to Lend: How Can Python and Machine Learning Help You Decide?"

import pandas as pd
df = pd.read_csv('loans.csv')

#print(df.head())

X = df.drop('safe_loans', axis=1)

y = df.safe_loans

# encode every categorical column as integer codes, one LabelEncoder per column

from sklearn.preprocessing import LabelEncoder
from collections import defaultdict
d = defaultdict(LabelEncoder)
X_trans = X.apply(lambda x: d[x.name].fit_transform(x))
X_trans.head()

#X_trans.to_excel('X_trans.xls')

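# randomly split the data into training and test sets
# (sklearn.cross_validation was removed in scikit-learn 0.20; model_selection provides train_test_split)

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_trans, y, random_state=1)
# fit a decision tree with a maximum depth of 8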
from sklearn import tree
clf = tree.DecisionTreeClassifier(max_depth=8)
clf = clf.fit(X_train, y_train)

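# predict a single test record and compare it with the true label
test_rec = X_test.iloc[1, :]
clf.predict([test_rec])

y_test.iloc[1]
# accuracy on the whole test set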
from sklearn.metrics import accuracy_score
print(accuracy_score(y_test, clf.predict(X_test)))
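
The `defaultdict(LabelEncoder)` idiom in the script above is easy to miss, so here is a minimal sketch of it on a toy frame. The column names and values are made up for illustration and are not taken from `loans.csv`. Because each fitted encoder stays in the dictionary, the integer codes can be mapped back to the original labels later.

```python
from collections import defaultdict

import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for loans.csv; column names and values are hypothetical.
toy = pd.DataFrame({
    "grade": ["B", "A", "C", "A"],
    "home_ownership": ["RENT", "OWN", "RENT", "MORTGAGE"],
})

# One LabelEncoder per column, created lazily the first time a column name is used.
encoders = defaultdict(LabelEncoder)
toy_encoded = toy.apply(lambda col: encoders[col.name].fit_transform(col))

# The fitted encoders are kept in the dict, so the codes can be decoded again.
toy_decoded = toy_encoded.apply(lambda col: encoders[col.name].inverse_transform(col))

print(toy_encoded)
print(toy_decoded)
```

LabelEncoder assigns codes in sorted label order, so the codes carry no meaning beyond identity. A tree can split on them freely, but a neural network will treat them as ordered magnitudes.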


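The article is very well written and I learned a lot from it, but the resulting accuracy is not very high: a decision tree with a maximum depth of 8 only reaches about 0.645.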

0.645480347467
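
To put a number like 0.645 in context, it helps to compare several tree depths against a majority-class baseline. The sketch below does this on synthetic data from `make_classification`, since `loans.csv` is not reproduced here; the helper name `depth_sweep` and the depth values are assumptions chosen for illustration, and the scores it prints say nothing about the loan data itself.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def depth_sweep(X, y, depths=(2, 4, 8, 12), random_state=1):
    """Return {depth: test accuracy}, plus a majority-class score under 'baseline'."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=random_state)

    # Baseline: always predict the most frequent training label.
    baseline = DummyClassifier(strategy="most_frequent").fit(X_tr, y_tr)
    scores = {"baseline": accuracy_score(y_te, baseline.predict(X_te))}

    # Fit one tree per depth and score it on the same held-out split.
    for depth in depths:
        clf = DecisionTreeClassifier(max_depth=depth, random_state=random_state)
        clf.fit(X_tr, y_tr)
        scores[depth] = accuracy_score(y_te, clf.predict(X_te))
    return scores


# Synthetic stand-in with 12 features, matching the feature count used in this post.
X_demo, y_demo = make_classification(n_samples=2000, n_features=12, random_state=1)
for name, acc in depth_sweep(X_demo, y_demo).items():
    print(name, round(acc, 3))
```

With the real data, `X_trans` and `y` from the script above could be passed in place of `X_demo` and `y_demo`.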


I redid the calculation with a neural network:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Thu Aug 17 21:14:08 2017

@author: luogan
"""

#read data
import pandas as pd
df = pd.read_csv('loans.csv')

#print(df.head())

#X = df.drop('safe_loans', axis=1)

X = df.drop(['safe_loans' ],axis=1)
y = df.safe_loans

# encode every categorical column as integer codes, one LabelEncoder per column

from sklearn.preprocessing import LabelEncoder
from collections import defaultdict
d = defaultdict(LabelEncoder)
X_trans = X.apply(lambda x: d[x.name].fit_transform(x))
X_trans.head()

#X_trans.to_excel('X_trans.xls')
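# scale every feature column by its range
data_train = X_trans
data_max = data_train.max()
data_min = data_train.min()
# note: subtracting data_max maps each column to [-1, 0];
# the usual min-max form subtracts data_min and maps to [0, 1]
X_train1 = (data_train - data_max) / (data_max - data_min)

# map the labels from {-1, +1} to {0, 1} to match the sigmoid output
# (assumes safe_loans is coded as -1/+1)
y = 0.5 * (y + 1)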
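# randomly split the data into training and test sets
# (sklearn.cross_validation was removed in scikit-learn 0.20; model_selection provides train_test_split)

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(X_train1, y, random_state=1)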

#x_train.to_excel('xx_trans.xls')

#y_train.to_excel('y_trans.xls')

#call decision tree
#from sklearn import tree
#clf = tree.DecisionTreeClassifier(max_depth=10)
#clf = clf.fit(X_train, y_train)

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation

model = Sequential()  # build the model
model.add(Dense(48, input_dim=12))  # input layer (12 features) -> first hidden layer
model.add(Activation('tanh'))  # tanh activation

model.add(Dense(48))  # hidden layer -> hidden layer
model.add(Activation('relu'))  # ReLU activation
model.add(Dropout(0.2))

model.add(Dense(36))  # hidden layer -> hidden layer
model.add(Activation('relu'))  # ReLU activation
model.add(Dropout(0.2))
model.add(Dense(36))  # hidden layer -> hidden layer
model.add(Activation('relu'))  # ReLU activation

model.add(Dense(12))  # hidden layer -> hidden layer
model.add(Activation('relu'))  # ReLU activation
model.add(Dense(12))  # hidden layer -> hidden layer
model.add(Activation('relu'))  # ReLU activation

model.add(Dense(1))  # hidden layer -> output layer
model.add(Activation('sigmoid'))  # sigmoid activation
# compile the model: mean squared error loss, Adam optimizer
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x_train.values, y_train.values, epochs=70, batch_size=2000)  # train the model

# threshold the sigmoid output at 0.5 to get 0/1 class labels
# (model.predict_classes did this in older Keras releases and has since been removed)
tr = (model.predict(x_test.values).flatten() > 0.5).astype(int)
r = pd.DataFrame(tr)
from sklearn.metrics import accuracy_score
print(accuracy_score(y_test, r))
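
The two preprocessing steps in the script above, range scaling of the features and remapping of the labels, can be checked in isolation. The following is a minimal sketch on made-up values; the helper `min_max_scale` and the toy columns are assumptions for illustration. It shows the usual min-max form, which lands in [0, 1], next to the expression used in the script, which subtracts the column maximum and therefore lands in [-1, 0].

```python
import pandas as pd


def min_max_scale(frame: pd.DataFrame) -> pd.DataFrame:
    """Scale each column to [0, 1]; constant columns are mapped to 0."""
    col_min = frame.min()
    col_range = (frame.max() - col_min).replace(0, 1)  # avoid division by zero
    return (frame - col_min) / col_range


# Hypothetical encoded features and -1/+1 labels.
toy_X = pd.DataFrame({"term": [0, 1, 1, 0], "dti": [5.0, 20.0, 35.0, 10.0]})
toy_y = pd.Series([-1, 1, 1, -1])

scaled = min_max_scale(toy_X)                                    # values in [0, 1]
shifted = (toy_X - toy_X.max()) / (toy_X.max() - toy_X.min())    # values in [-1, 0]
labels = 0.5 * (toy_y + 1)                                       # -1 -> 0.0, +1 -> 1.0

print(scaled)
print(shifted)
print(labels.tolist())
```

Both variants differ only by a constant shift of one per column, so either can be used as long as the same transform is applied to training and test data.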


0.650640749978
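
The accuracy above comes from turning the network's sigmoid outputs into 0/1 labels and comparing them with the true labels. Here is a small NumPy-only sketch of that step, assuming a 0.5 cut-off; the function name `to_class_labels` and the sample numbers are made up for illustration.

```python
import numpy as np


def to_class_labels(probabilities, threshold=0.5):
    """Map sigmoid outputs of shape (n, 1) or (n,) to 0/1 integer labels."""
    return (np.asarray(probabilities).ravel() > threshold).astype(int)


probs = np.array([[0.91], [0.12], [0.50], [0.73]])  # made-up model outputs
truth = np.array([1, 0, 1, 1])                      # made-up true labels

predicted = to_class_labels(probs)
accuracy = float((predicted == truth).mean())  # fraction of matching labels
print(predicted.tolist(), accuracy)
```

An output of exactly 0.5 falls into class 0 here because the comparison is strict.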


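I tried all sorts of optimizations on the neural network, but the accuracy would not go up. The result is even worse than that of my neural network stock prediction code.

I am not happy, not happy at all. An accuracy of 65% is something I cannot tolerate, and what frustrates me most is that the neural network I have always loved turns out to be this powerless here.

It cannot go on like this. Next time we will try the legendary xgboost.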

Code download (Baidu Cloud)

Extraction code: jc5x