您的位置:首页 > 其它

ML之回归预测:利用十类机器学习算法(线性回归、kNN、SVM、决策树、随机森林、极端随机树、SGD、提升树、LightGBM、XGBoost)对波士顿数据集进行回归预测(房价预测)以及模型评估

2020-09-02 22:43 2381 查看

ML之回归预测:利用十类机器学习算法(线性回归、kNN、SVM、决策树、随机森林、极端随机树、SGD、提升树、LightGBM、XGBoost)对波士顿数据集【13+1,506】进行回归预测(房价预测)以及模型评估

 

 

 

目录

利用十类机器学习算法(线性回归、kNN、SVM、决策树、随机森林、极端随机树、SGD、提升树、LightGBM、XGBoost)对波士顿数据集【13+1,506】进行回归预测(房价预测)以及模型评估

输出数据集

1、线性回归

2、kNN

3、SVM

4、决策树

5、随机森林

6、极端随机树

7、SGD

8、提升树

9、LightGBM

10、XGBoost

模型效果综合比较

 


相关文章
ML之回归预测:利用十类机器学习算法(线性回归、kNN、SVM、决策树、随机森林、极端随机树、SGD、提升树、LightGBM、XGBoost)对波士顿数据集【13+1,506】进行回归预测(房价预测)以及模型评估
ML之回归预测:利用十类机器学习算法(线性回归、kNN、SVM、决策树、随机森林、极端随机树、SGD、提升树、LightGBM、XGBoost)对波士顿数据集【13+1,506】进行回归预测(房价预测)以及模型评估实现

利用十类机器学习算法(线性回归、kNN、SVM、决策树、随机森林、极端随机树、SGD、提升树、LightGBM、XGBoost)对波士顿数据集【13+1,506】进行回归预测(房价预测)以及模型评估

输出数据集

[code]数据集的描述:
.. _boston_dataset:

Boston house prices dataset
---------------------------

**Data Set Characteristics:**

:Number of Instances: 506

:Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.

:Attribute Information (in order):
- CRIM     per capita crime rate by town
- ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
- INDUS    proportion of non-retail business acres per town
- CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
- NOX      nitric oxides concentration (parts per 10 million)
- RM       average number of rooms per dwelling
- AGE      proportion of owner-occupied units built prior to 1940
- DIS      weighted distances to five Boston employment centres
- RAD      index of accessibility to radial highways
- TAX      full-value property-tax rate per $10,000
- PTRATIO  pupil-teacher ratio by town
- B        1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
- LSTAT    % lower status of the population
- MEDV     Median value of owner-occupied homes in $1000's

:Missing Attribute Values: None

:Creator: Harrison, D. and Rubinfeld, D.L.

This is a copy of UCI ML housing dataset.
https://archive.ics.uci.edu/ml/machine-learning-databases/housing/

This dataset was taken from the StatLib library which is maintained at Carnegie Mellon University.

The Boston house-price data of Harrison, D. and Rubinfeld, D.L. 'Hedonic
prices and the demand for clean air', J. Environ. Economics & Management,
vol.5, 81-102, 1978.   Used in Belsley, Kuh & Welsch, 'Regression diagnostics
...', Wiley, 1980.   N.B. Various transformations are used in the table on
pages 244-261 of the latter.

The Boston house-price data has been used in many machine learning papers that address regression
problems.

.. topic:: References

- Belsley, Kuh & Welsch, 'Regression diagnostics: Identifying Influential Data and Sources of Collinearity', Wiley, 1980. 244-261.
- Quinlan,R. (1993). Combining Instance-Based and Model-Based Learning. In Proceedings on the Tenth International Conference of Machine Learning, 236-243, University of Massachusetts, Amherst. Morgan Kaufmann.

数据的初步查验:输出回归目标值target的差异
target_max 50.0
target_min 5.0
target_avg 22.532806324110677

 

1、线性回归

[code]LiR Score value: 0.6757955014529482
LiR R2    value: 0.6757955014529482
LiR MAE   value: 3.5325325437053974
LiR MSE   value: 25.13923652035344

 

2、kNN

 

3、SVM

 

 

4、决策树

 

5、随机森林

 

6、极端随机树

 

7、SGD

 

 

 

8、提升树

 

 

9、LightGBM

 

10、XGBoost

 

 

模型效果综合比较

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: 
相关文章推荐