XGBoost Example Code (1): Cross-Validation with sklearn
2018-01-11 11:07
```python
#!/usr/bin/python
'''
Created on 1 Apr 2015

@author: Jamie Hall
'''
import pickle
import xgboost as xgb

import numpy as np
from sklearn.model_selection import KFold, train_test_split, GridSearchCV
from sklearn.metrics import confusion_matrix, mean_squared_error
from sklearn.datasets import load_iris, load_digits, load_boston

rng = np.random.RandomState(31337)

print("Zeros and Ones from the Digits dataset: binary classification")
digits = load_digits(n_class=2)
y = digits['target']
X = digits['data']
kf = KFold(n_splits=2, shuffle=True, random_state=rng)
for train_index, test_index in kf.split(X):
    xgb_model = xgb.XGBClassifier().fit(X[train_index], y[train_index])
    predictions = xgb_model.predict(X[test_index])
    actuals = y[test_index]
    print(confusion_matrix(actuals, predictions))

print("Iris: multiclass classification")
iris = load_iris()
y = iris['target']
X = iris['data']
kf = KFold(n_splits=2, shuffle=True, random_state=rng)
for train_index, test_index in kf.split(X):
    xgb_model = xgb.XGBClassifier().fit(X[train_index], y[train_index])
    predictions = xgb_model.predict(X[test_index])
    actuals = y[test_index]
    print(confusion_matrix(actuals, predictions))

print("Boston Housing: regression")
boston = load_boston()  # note: load_boston was removed in scikit-learn 1.2
y = boston['target']
X = boston['data']
kf = KFold(n_splits=2, shuffle=True, random_state=rng)
for train_index, test_index in kf.split(X):
    xgb_model = xgb.XGBRegressor().fit(X[train_index], y[train_index])
    predictions = xgb_model.predict(X[test_index])
    actuals = y[test_index]
    print(mean_squared_error(actuals, predictions))

print("Parameter optimization")
y = boston['target']
X = boston['data']
xgb_model = xgb.XGBRegressor()
clf = GridSearchCV(xgb_model,
                   {'max_depth': [2, 4, 6],
                    'n_estimators': [50, 100, 200]},
                   verbose=1)
clf.fit(X, y)
print(clf.best_score_)
print(clf.best_params_)

# The sklearn API models are picklable
print("Pickling sklearn API models")
# must open in binary format to pickle
pickle.dump(clf, open("best_boston.pkl", "wb"))
clf2 = pickle.load(open("best_boston.pkl", "rb"))
print(np.allclose(clf.predict(X), clf2.predict(X)))

# Early-stopping: evaluate on a held-out set each round and stop when AUC
# stops improving. (Newer xgboost releases move early_stopping_rounds and
# eval_metric to the XGBClassifier constructor.)
X = digits['data']
y = digits['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = xgb.XGBClassifier()
clf.fit(X_train, y_train, early_stopping_rounds=10, eval_metric="auc",
        eval_set=[(X_test, y_test)])
```