HOW TO NORMALISE FEATURE VECTORS
2016-04-12 13:22
I was trying to create a sample file for training a neural network and ran into a common problem: the feature values are all over the place. In this example I’m working with real-world demographic values for countries. For example, a feature for GDP per person in a country ranges from 551.27 to 88286.0, whereas estimates for corruption range from -1.56 to 2.42. This can be very confusing for machine learning algorithms, as they can end up treating bigger values as more important signals.
To handle this issue, we want to scale all the feature values into roughly the same range. We can do this by taking each feature value, subtracting the mean of that feature (thereby shifting the mean to 0), and dividing by its standard deviation (thereby scaling the spread to 1).
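As a minimal sketch of that calculation for a single feature, the snippet below computes the mean and standard deviation and returns the scaled values. The function name `standardise` is hypothetical, and using the population standard deviation (dividing by n rather than n - 1) is an assumption; only the two extreme GDP values come from the text, the middle one is made up for illustration.

```python
import math

def standardise(values):
    """Return (x - mean) / std for every x in values."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: population standard deviation (divide by n, not n - 1).
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    if std == 0.0:
        # A constant feature has no spread; map it to zeros instead of dividing by 0.
        return [0.0 for _ in values]
    return [(x - mean) / std for x in values]

# 551.27 and 88286.0 are the extremes quoted above; 12000.0 is an invented value.
gdp = [551.27, 12000.0, 88286.0]
scaled = standardise(gdp)
```

After scaling, the values have a mean of 0 and a standard deviation of 1, so GDP and the corruption estimate end up on comparable scales.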
This is a piece of code I’ve implemented a number of times for various projects, so it’s time to write a nice reusable script. Hopefully it can be helpful for others as well. I chose to do this in Python, as it’s easier to run compared to C++ and Java (doesn’t need to be compiled), but has better support for real-valued numbers compared to bash scripting.
Each line in the input file is assumed to be a feature vector, with values separated by whitespace. The first element is an integer class label that will be left untouched. This is followed by a number of floating point feature values which will be normalised.
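The following is a rough sketch of how such a file could be processed, not the original script. The function name `normalise_lines`, the six-decimal output format, the use of the population standard deviation, and the two sample lines are all assumptions made for illustration; the sample reuses the GDP and corruption extremes mentioned earlier.

```python
import math

def normalise_lines(lines):
    """Normalise each feature column; leave the leading class label untouched."""
    rows = [line.split() for line in lines if line.strip()]
    labels = [row[0] for row in rows]
    features = [[float(v) for v in row[1:]] for row in rows]
    n = len(features)
    # Transpose so that each tuple holds all values of one feature.
    columns = list(zip(*features))
    means = [sum(col) / n for col in columns]
    # Assumption: population standard deviation (divide by n).
    stds = [math.sqrt(sum((x - m) ** 2 for x in col) / n)
            for col, m in zip(columns, means)]
    output = []
    for label, feats in zip(labels, features):
        # Constant columns (std of 0) are mapped to 0.0 to avoid dividing by zero.
        scaled = [(x - m) / s if s > 0 else 0.0
                  for x, m, s in zip(feats, means, stds)]
        output.append(" ".join([label] + ["%.6f" % v for v in scaled]))
    return output

# Hypothetical input: class label, GDP per person, corruption estimate.
sample = ["1 551.27 -1.56", "0 88286.0 2.42"]
result = normalise_lines(sample)
```

The statistics are computed per column over the whole file, so the file is read fully before any line is written out; the class label is carried through as a string so it is never reformatted.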
For example:
To execute it, simply use