
SettingWithCopyWarning when inserting data into a DataFrame: A value is trying to be set on a copy of a slice from a DataFrame

2018-03-28 10:42
Fixing SettingWithCopyWarning

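Scenario

The problem: after reading a CSV file, I needed to add a new feature column and set its values based on existing features. Modifying the new column triggered SettingWithCopyWarning, and it took me a long time to resolve.

Example: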

import pandas as pd
import numpy as np

aa = np.array([1, 0, 1, 0])
bb = pd.DataFrame(aa.T, columns=['one'])
print(bb)
   one
0    1
1    0
2    1
3    0
bb['two'] = 0
print(bb)
   one  two
0    1    0
1    0    0
2    1    0
3    0    0


Modifying the new column by condition and printing it again produces the warning:

for i in range(bb.shape[0]):
    if bb['one'][i] == 0:
        bb['two'][i] = 1
print(bb)

C:/PycharmProjects/NaiveBayesProduct/pandas/try_index.py:22: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  bb['two'][i] = 1
   one  two
0    1    0
1    0    1
2    1    0
3    0    1
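
The warning is raised because bb['two'][i] = 1 is chained indexing: bb['two'] is evaluated first, and the assignment is then applied to that intermediate object, which pandas cannot guarantee is a view of bb. As a minimal sketch (an alternative, not the approach this post settles on), the same loop can set each cell through a single indexer, DataFrame.at, so the write goes to bb directly:

```python
import numpy as np
import pandas as pd

aa = np.array([1, 0, 1, 0])
bb = pd.DataFrame(aa, columns=["one"])
bb["two"] = 0

for i in range(bb.shape[0]):
    if bb.at[i, "one"] == 0:
        # One indexing call on bb itself, so no intermediate object is involved
        bb.at[i, "two"] = 1

print(bb)
```

This keeps the row-by-row structure of the original loop; it assumes the default integer index, so the loop counter i doubles as the row label.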


Solution

One reliable approach is to build the complete array first and then insert it into the DataFrame. Here is the example above, redone this way.

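import pandas as pd
import numpy as np

aa = np.array([1, 0, 1, 0])
bb = pd.DataFrame(aa.T, columns=['one'])
# Create an ndarray to hold the values to insert
two = np.zeros(bb.shape[0])
# Modify two according to the condition
for i in range(bb.shape[0]):
    if bb['one'][i] == 0:
        two[i] = 1
# When done, insert two into the DataFrame
bb.insert(1, 'two', two)
# insert takes three arguments: the column position, the column name, and the values.
# bb.insert(0, 'two', two) would insert it as the first column instead.
print(bb)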

   one  two
0    1  0.0
1    0  1.0
2    1  0.0
3    0  1.0
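
As a hedged alternative to filling the array in a loop, the condition can also be applied in one step. The sketch below rebuilds the same bb and shows two options: a boolean mask with .loc, and np.where to compute the whole column at once. The column name two_where is only illustrative and does not appear in the original example.

```python
import numpy as np
import pandas as pd

bb = pd.DataFrame({"one": [1, 0, 1, 0]})

# Option A: create the column, then assign through a single .loc call
bb["two"] = 0
bb.loc[bb["one"] == 0, "two"] = 1

# Option B: compute the whole column at once with np.where
bb["two_where"] = np.where(bb["one"] == 0, 1, 0)

print(bb)
```

Both options index bb only once per assignment, so neither goes through an intermediate object the way bb['two'][i] does.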


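My own code

My own case: while classifying reviews with a naive Bayes classifier (positive is defined as 1, negative as 0), inserting the classification results raised the warning.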

comm_data = pd.read_csv("C:\\Users\\lenovo\\Desktop\\comm\\new_data.csv", encoding="utf-8")
# comm_data=new_data
print(comm_data.head(5))
comm_data["classify"] = "#"
for c in range(len(comm_data)):
    classify = testingNB(comm_data["content"][c])
    # print(classify)
    comm_data["classify"][c] = classify
comm_data.to_csv("C:\\Users\\lenovo\\Desktop\\comm\\comm_data.csv")


The warning appears:

D:/office3/python/python_py/compare/score_variance/get_data/web5_data_mg.py:161: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  comm_data["classify"][c]=classify


Solution:

comm_data = pd.read_csv("C:\\Users\\lenovo\\Desktop\\comm\\new_data.csv", encoding="utf-8")
# comm_data=new_data
print(comm_data.head(5))
# comm_data["classify"]="#"
# Build the full result array first
classify = np.zeros(comm_data.shape[0])
for c in range(len(comm_data)):
    classifynb = testingNB(comm_data["content"][c])
    # print(classify)
    # comm_data["classify"][c]=classify
    classify[c] = classifynb
# Insert the finished array as the first column
comm_data.insert(0, 'classify', classify)
comm_data.to_csv("C:\\Users\\lenovo\\Desktop\\comm\\comm_data.csv")


This resolves the problem.