
Machine Learning----XGBoost Study Notes

2017-07-13 19:56
1. Using XGBoost for feature combination

1) XGBModel.apply(self, X, ntree_limit=0)

Returns the predicted leaf index of every tree for each sample.

X: the features matrix (here, the training-set features).

ntree_limit: limit on the number of trees used at prediction time; defaults to 0 (use all trees).

def apply(self, X, ntree_limit=0):
    """Return the predicted leaf every tree for each sample.

    Parameters
    ----------
    X : array_like, shape=[n_samples, n_features]
        Input features matrix.

    ntree_limit : int
        Limit number of trees in the prediction; defaults to 0 (use all trees).

    Returns
    -------
    X_leaves : array_like, shape=[n_samples, n_trees]
        For each datapoint x in X and for each tree, return the index of the
        leaf x ends up in. Leaves are numbered within
        ``[0; 2**(self.max_depth+1))``, possibly with gaps in the numbering.
    """
    test_dmatrix = DMatrix(X, missing=self.missing)
    return self.get_booster().predict(test_dmatrix,
                                      pred_leaf=True,
                                      ntree_limit=ntree_limit)
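
As a quick usage sketch (my own example, not from the original post; the toy data and parameter values are assumptions), training a small model and calling apply() gives one leaf index per tree for each sample:

import numpy as np
from xgboost import XGBClassifier

rng = np.random.RandomState(0)
X = rng.rand(100, 5)                       # 100 samples, 5 features
y = (X[:, 0] + X[:, 1] > 1).astype(int)   # toy binary target

model = XGBClassifier(n_estimators=10, max_depth=3)
model.fit(X, y)

leaves = model.apply(X)    # shape = (n_samples, n_trees) = (100, 10)
print(leaves.shape)
print(leaves[0])           # leaf index of sample 0 in each of the 10 trees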


Difference between GBDT and GBDT+LR

My understanding is as follows:

GBDT: each new tree fits the residual between the true values and the sum of the previous trees' predictions. For example, the third tree fits (y - ŷ₁) - ŷ₂, where ŷ₁ and ŷ₂ are the outputs of the first two trees; see the sketch below.
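
To make the residual fitting concrete, here is a minimal sketch (my own illustration, not from the original post) that boosts two regression trees by hand; the dataset and tree depths are arbitrary assumptions:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.rand(200, 3)
y = 3 * X[:, 0] + np.sin(5 * X[:, 1])   # toy regression target

tree1 = DecisionTreeRegressor(max_depth=2).fit(X, y)
r1 = y - tree1.predict(X)               # residual after tree 1: y - ŷ₁

tree2 = DecisionTreeRegressor(max_depth=2).fit(X, r1)
r2 = r1 - tree2.predict(X)              # residual after tree 2: (y - ŷ₁) - ŷ₂

# Tree 3 would be fit on r2, and so on; the final prediction is the sum
# of all trees' outputs. The mean squared residual shrinks each round:
print(np.mean(r1 ** 2), np.mean(r2 ** 2))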

GBDT+LR: take the leaf that each tree of the GBDT assigns to a sample, then recombine these per-tree leaf indicators linearly with a logistic regression, which automatically learns a weight for each leaf.
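
Below is a minimal end-to-end sketch of this combination (my own illustration; the toy dataset and model sizes are assumptions): the leaf indices returned by apply() are one-hot encoded so that the logistic regression learns one weight per leaf.

import numpy as np
from xgboost import XGBClassifier
from sklearn.preprocessing import OneHotEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.rand(1000, 5)
y = (X[:, 0] + X[:, 1] > 1).astype(int)   # toy binary target
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

gbdt = XGBClassifier(n_estimators=30, max_depth=3)
gbdt.fit(X_train, y_train)

# Each sample becomes a vector of leaf indices, one entry per tree ...
leaves_train = gbdt.apply(X_train)
leaves_test = gbdt.apply(X_test)

# ... which the one-hot encoder turns into sparse binary features,
# so the LR learns a separate weight for every leaf of every tree.
enc = OneHotEncoder(handle_unknown="ignore").fit(leaves_train)
lr = LogisticRegression(max_iter=1000)
lr.fit(enc.transform(leaves_train), y_train)
print(lr.score(enc.transform(leaves_test), y_test))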

I am still a beginner; these notes are for my own later review and are revised as I keep learning. If anything here is incorrect, please point it out.