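ML/FE: Common processing techniques in feature engineering (missing-value filling, outlier detection, etc.) and the underlying code that implements them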
2020-04-08 07:30
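Contents
Common processing techniques in feature engineering (missing-value filling, outlier detection, etc.) and the underlying code that implements them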
fillna(self, value=None, method=None, axis=None, inplace=False, limit=None, downcast=None, **kwargs)
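Common processing techniques in feature engineering (missing-value filling, outlier detection, etc.) and the underlying code that implements them
Missing-value filling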
[code]
import pandas as pd

df = pd.read_csv('test01.csv')
print(df['feature01'])
# Replace missing values with the sentinel -1, then cast the column to int
df['feature02'] = df['feature01'].fillna(-1).astype(int)
print(df['feature02'])
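The snippet above fills before casting for a reason: a float column that still contains NaN cannot be converted to int. The following is a minimal sketch of that behaviour. Because `test01.csv` is not available here, the small `feature01` Series and its values are hypothetical stand-ins, not the author's data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df['feature01'] read from test01.csv
feature01 = pd.Series([3.0, np.nan, 7.0], name="feature01")

# Casting directly fails: NaN has no integer representation,
# so pandas raises a ValueError (or a subclass of it).
try:
    feature01.astype(int)
    direct_cast_failed = False
except ValueError:
    direct_cast_failed = True

# Filling with a sentinel first makes the cast safe.
feature02 = feature01.fillna(-1).astype(int)

print(direct_cast_failed)
print(feature02.tolist())
```

The sentinel `-1` is only safe when it cannot collide with a real value of the feature; otherwise a separate indicator column may be a better choice.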
fillna(self, value=None, method=None, axis=None, inplace=False, limit=None, downcast=None, **kwargs)
[code]
@Appender(_shared_docs['fillna'] % _shared_doc_kwargs)
def fillna(self, value=None, method=None, axis=None, inplace=False,
           limit=None, downcast=None, **kwargs):
    return super(DataFrame, self).fillna(value=value, method=method,
                                         axis=axis, inplace=inplace,
                                         limit=limit, downcast=downcast,
                                         **kwargs)


# df.fillna()
@Appender(_shared_docs['fillna'] % _shared_doc_kwargs)
def fillna(self, value=None, method=None, axis=None, inplace=False,
           limit=None, downcast=None):
    inplace = validate_bool_kwarg(inplace, 'inplace')
    if isinstance(value, (list, tuple)):
        raise TypeError('"value" parameter must be a scalar or dict, but '
                        'you passed a "{0}"'.format(type(value).__name__))
    self._consolidate_inplace()

    # set the default here, so functions examining the signaure
    # can detect if something was set (e.g. in groupby) (GH9221)
    if axis is None:
        axis = 0
    axis = self._get_axis_number(axis)
    method = missing.clean_fill_method(method)
    from pandas import DataFrame
    if value is None:
        if method is None:
            raise ValueError('must specify a fill method or value')
        if self._is_mixed_type and axis == 1:
            if inplace:
                raise NotImplementedError()
            result = self.T.fillna(method=method, limit=limit).T

            # need to downcast here because of all of the transposes
            result._data = result._data.downcast()

            return result

        # > 3d
        if self.ndim > 3:
            raise NotImplementedError('Cannot fillna with a method for > '
                                      '3dims')

        # 3d
        elif self.ndim == 3:

            # fill in 2d chunks
            result = dict([(col, s.fillna(method=method, value=value))
                           for (col, s) in self.iteritems()])
            new_obj = self._constructor.from_dict(result).__finalize__(self)
            new_data = new_obj._data

        else:
            # 2d or less
            method = missing.clean_fill_method(method)
            new_data = self._data.interpolate(method=method, axis=axis,
                                              limit=limit, inplace=inplace,
                                              coerce=True,
                                              downcast=downcast)
    else:
        if method is not None:
            raise ValueError('cannot specify both a fill method and value')

        if len(self._get_axis(axis)) == 0:
            return self

        if self.ndim == 1:
            if isinstance(value, (dict, ABCSeries)):
                from pandas import Series
                value = Series(value)
            elif not is_list_like(value):
                pass
            else:
                raise ValueError("invalid fill value with a %s" %
                                 type(value))

            new_data = self._data.fillna(value=value, limit=limit,
                                         inplace=inplace,
                                         downcast=downcast)

        elif isinstance(value, (dict, ABCSeries)):
            if axis == 1:
                raise NotImplementedError('Currently only can fill '
                                          'with dict/Series column '
                                          'by column')

            result = self if inplace else self.copy()
            for k, v in compat.iteritems(value):
                if k not in result:
                    continue
                obj = result[k]
                obj.fillna(v, limit=limit, inplace=True, downcast=downcast)
            return result
        elif not is_list_like(value):
            new_data = self._data.fillna(value=value, limit=limit,
                                         inplace=inplace,
                                         downcast=downcast)
        elif isinstance(value, DataFrame) and self.ndim == 2:
            new_data = self.where(self.notnull(), value)
        else:
            raise ValueError("invalid fill value with a %s" % type(value))

    if inplace:
        self._update_inplace(new_data)
    else:
        return self._constructor(new_data).__finalize__(self)
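The branches in the source above (scalar value, dict value, method-based fill, and the up-front rejection of lists and tuples) can be exercised from user code. The following is a rough sketch on a hypothetical two-column frame; the column names `a` and `b` and all values are made up for illustration. It calls `ffill()` rather than `fillna(method='ffill')`, since newer pandas releases deprecate the `method` argument that the listed source still accepts.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; not taken from the article's test01.csv
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": ["x", None, "z"],
})

# Scalar value: every missing cell gets the same replacement.
filled_scalar = df["a"].fillna(0)

# Dict value: filled column by column; keys that are not columns are skipped.
filled_dict = df.fillna({"a": -1, "b": "missing", "not_a_column": 99})

# Method-based fill: propagate the last valid observation forward.
filled_ffill = df["a"].ffill()

# A list (or tuple) as the fill value is rejected with TypeError.
try:
    df.fillna([0, 1])
    list_error = None
except TypeError as exc:
    list_error = str(exc)

print(filled_scalar.tolist())
print(filled_dict)
print(filled_ffill.tolist())
print(list_error)
```

Note that the dict form leaves the frame's columns unchanged: the extra key `not_a_column` is ignored rather than added, which matches the `if k not in result: continue` check in the listing.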