Something About RMSEP
2016-07-11 19:36
Metrics for evaluating PLS
RMSEC, RMSECV and RMSEP differ in the type of cases that are used to measure them:
RMSEC: calibration error, i.e. the residuals of the calibration data. (R)MSEC measures goodness of fit between your data and the calibration model. Depending on the type of data, model and application, it can be subject to a huge optimistic bias due to overfitting, compared to the (R)MSE observed for real cases when the calibration is applied. If the model is not complex enough (underfitting), calibration error approximates prediction error. But calibration error alone cannot indicate overfitting.
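To make the optimistic bias concrete, here is a minimal sketch. It does not use PLS: `MemorizingModel` is a hypothetical, deliberately overfit stand-in (it returns the reference value of the nearest calibration case), and all data values are invented for illustration.

```python
import math

def rmse(predicted, reference):
    """Root mean squared error between predictions and reference values."""
    if len(predicted) != len(reference):
        raise ValueError("need one prediction per reference value")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))

class MemorizingModel:
    """Toy overfit model (an assumption, not PLS): nearest calibration case wins."""

    def fit(self, xs, ys):
        self.xs, self.ys = list(xs), list(ys)
        return self

    def predict(self, xs):
        def nearest(x):
            return min(range(len(self.xs)), key=lambda i: abs(self.xs[i] - x))
        return [self.ys[nearest(x)] for x in xs]

# Invented calibration data: y is roughly 2*x plus noise.
x_cal = [1.0, 2.0, 3.0, 4.0, 5.0]
y_cal = [2.3, 3.8, 6.4, 7.7, 10.2]
# Invented cases that were not part of the calibration.
x_new = [1.4, 2.6, 3.4, 4.6]
y_new = [2.8, 5.2, 6.8, 9.2]

model = MemorizingModel().fit(x_cal, y_cal)
rmsec = rmse(model.predict(x_cal), y_cal)      # error on the calibration cases
rmse_new = rmse(model.predict(x_new), y_new)   # error on cases the model never saw
print(f"RMSEC = {rmsec:.3f}, RMSE on new cases = {rmse_new:.3f}")
```

The model reproduces every calibration case exactly, so its RMSEC says nothing about how it behaves on the new cases.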
RMSECV: cross validation error, i.e. errors are calculated on train/test splits generated by a cross validation scheme.
If the splitting of the data is done correctly, this gives a good estimate of how the model built on the data set at hand performs for unknown cases. However, due to the resampling nature of the approach, it actually measures performance for unknown cases that were obtained among the calibration cases. I.e. it does not measure how well the model works for cases that are measured months after the calibration is done. For that, you need
MSEP/RMSEP: prediction error, i.e. the error measured on real cases by comparing the predictions to reference values obtained for these cases.
RMSEP can measure e.g. how performance deteriorates over time (e.g. due to instrument drift), but only if the validation experiments have a design that allows these influences to be measured.
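The difference between RMSECV and RMSEP can be sketched with a small simulation. Everything here is an assumption made for illustration: a straight-line fit stands in for the calibration model, `measure` is a hypothetical instrument whose signal offset grows by 0.1 per month, and validation batches are taken at months 0, 3 and 6.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for the calibration model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def rmse(residuals):
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def rmsecv(xs, ys, k=5):
    """Pool the held-out residuals over k interleaved folds, then take the RMS."""
    residuals = []
    for fold in range(k):
        test_idx = set(range(fold, len(xs), k))
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i not in test_idx]
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        residuals += [a + b * xs[i] - ys[i] for i in sorted(test_idx)]
    return rmse(residuals)

rng = random.Random(0)  # fixed seed so the simulation is repeatable

def measure(n, drift):
    """Simulated cases: reference y = 1 + 2*x + noise, reported signal = x + drift."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [1 + 2 * x + rng.gauss(0, 0.3) for x in xs]
    return [x + drift for x in xs], ys

# Calibration at month 0, without drift.
x_cal, y_cal = measure(40, drift=0.0)
a, b = fit_line(x_cal, y_cal)
cv = rmsecv(x_cal, y_cal)
print(f"RMSECV at calibration time: {cv:.2f}")

# Validation batches measured later, each with its own reference values.
rmsep_by_month = {}
for month in (0, 3, 6):
    x_val, y_val = measure(40, drift=0.1 * month)
    rmsep_by_month[month] = rmse([a + b * x - y for x, y in zip(x_val, y_val)])
    print(f"RMSEP, month {month}: {rmsep_by_month[month]:.2f}")
```

In this setup the cross validation only ever sees cases from the calibration batch, so the drift shows up only in the RMSEP of the later batches. A single validation batch taken at month 0 would not reveal it either, which is the design requirement mentioned above.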
General remarks:
I’d recommend reporting in detail, for both cross validation and prediction errors, how the test cases are set apart and for which factors independence was ensured. I regularly meet descriptions of “independent testing” (RMSEP) where actually only a single split of the calibration data was performed. A one-time split of the data obtained for calibration typically yields no better performance estimate than a cross validation. I claim this because in practice, most data leaks occur just as easily for one split as for the many splits in cross validation. Nevertheless, it may be easier to implement a protocol that in practice avoids these errors in a very transparent way for prediction error.
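One common source of such leaks is replicate measurements of the same sample ending up on both sides of a split. The sketch below shows one possible way to keep a grouping factor intact during splitting. The function `grouped_folds` and the sample layout are hypothetical, and which factor needs to be kept independent (sample, patient, batch, measurement day) depends on the application.

```python
from collections import defaultdict

def grouped_folds(groups, k):
    """Yield (train_idx, test_idx) pairs where each group lies entirely in one fold."""
    members = defaultdict(list)
    for idx, g in enumerate(groups):
        members[g].append(idx)
    unique = sorted(members)
    if k > len(unique):
        raise ValueError("more folds than groups")
    for fold in range(k):
        test_groups = set(unique[fold::k])  # every k-th group goes to this fold
        test = [i for g in unique if g in test_groups for i in members[g]]
        train = [i for g in unique if g not in test_groups for i in members[g]]
        yield train, test

# Invented design: 3 replicate measurements for each of 6 samples.
sample_ids = [s for s in "ABCDEF" for _ in range(3)]
splits = list(grouped_folds(sample_ids, k=3))
for train, test in splits:
    train_samples = {sample_ids[i] for i in train}
    test_samples = {sample_ids[i] for i in test}
    shared = train_samples & test_samples  # should be empty for a leak-free split
    print("test samples:", sorted(test_samples), "| shared with training:", sorted(shared))
```

Reporting the grouping factor together with the error estimate lets the reader judge what kind of independence the number actually reflects. The same check applies to a single held-out set as to cross validation folds.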