
Support and Confidence (Basic Example): Learning Data Mining with Python (《python数据挖掘入门与实践》)

2018-03-05 19:24
This post works through an example from the book Learning Data Mining with Python (《python数据挖掘入门与实践》).
Third-party Python library: NumPy

Affinity analysis example
1. Load the dataset (a txt data file) with NumPy:

    import numpy as np

    dataset_filename = "affinity_dataset.txt"
    X = np.loadtxt(dataset_filename)
    n_samples, n_features = X.shape
    print("This dataset has {0} samples and {1} features".format(n_samples, n_features))

Result: This dataset has 100 samples and 5 features

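Two lines are missing from the listing in the book.
X.shape returns the number of rows and columns of the data. To inspect the first five rows:

    print(X[:5])
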
[[ 0.  0.  1.  1.  1.]
[ 1.  1.  0.  1.  0.]
[ 1.  0.  1.  1.  0.]
[ 0.  0.  1.  1.  1.]
[ 0.  1.  0.  0.  1.]]
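
If the book's affinity_dataset.txt is not at hand, the loading step can still be tried on a stand-in file in the same layout. The sketch below is only an illustration: the file name affinity_dataset_demo.txt, the seed, and the random 0/1 values are assumptions, not the book's data, so the counts obtained later in this post will not be reproduced with it.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in data: 100 customers x 5 products, random 0/1 purchases.
rng = np.random.default_rng(14)
X_demo = rng.integers(0, 2, size=(100, 5))

# Write whitespace-separated text, the layout np.loadtxt reads by default.
demo_path = os.path.join(tempfile.mkdtemp(), "affinity_dataset_demo.txt")
np.savetxt(demo_path, X_demo, fmt="%d")

# Load it back exactly as in step 1; loadtxt returns floats by default.
X_loaded = np.loadtxt(demo_path)
n_samples, n_features = X_loaded.shape
print("This dataset has {0} samples and {1} features".format(n_samples, n_features))
print(X_loaded[:5])
```

Because np.loadtxt returns a float array by default, the values print as 0. and 1., as in the output above.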
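2. Treat the five columns as five products:

    features = ["bread", "milk", "cheese", "apples", "bananas"]

Each row records one customer's purchases: 0 means the customer did not buy the product, and 1 means they did.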

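3. Use support and confidence to evaluate the rule "if a customer buys apples, they also buy bananas".
First count the customers who bought apples, that is, the rows whose fourth column (index 3) is 1:

    num_apple_purchases = 0
    for sample in X:
        if sample[3] == 1:
            num_apple_purchases += 1
    print("{0} people bought Apples".format(num_apple_purchases))

The loop reads the data one row at a time and checks whether sample[3] is 1 to decide whether that customer bought apples.
In the same way, sample[4] can be used to count the customers who bought bananas.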
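
The same counts can also be obtained without an explicit loop. The following is a minimal sketch, not the book's code: it uses NumPy column slicing and boolean comparison on the five rows printed earlier (the name X_small is introduced here only for illustration).

```python
import numpy as np

# The first five rows shown earlier; columns are
# bread, milk, cheese, apples, bananas.
X_small = np.array([
    [0, 0, 1, 1, 1],
    [1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1],
], dtype=float)

# X_small[:, 3] selects the apples column; comparing with 1 gives a boolean
# array, and summing it counts the True entries.
num_apple_purchases = int((X_small[:, 3] == 1).sum())
num_banana_purchases = int((X_small[:, 4] == 1).sum())

# Summing down each column counts the buyers of every product at once.
purchases_per_product = X_small.sum(axis=0).astype(int)

print("{0} people bought Apples".format(num_apple_purchases))
print("{0} people bought Bananas".format(num_banana_purchases))
print(purchases_per_product)
```

On the full dataset, replacing X_small with X should give the same number as the loop, since both count the rows where column 3 equals 1.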

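4. How many of the customers who bought apples also bought bananas?
In code:

    rule_valid = 0
    rule_invalid = 0
    for sample in X:
        if sample[3] == 1:  # bought apples
            if sample[4] == 1:
                # bought both apples and bananas
                rule_valid += 1
            else:
                # bought apples but not bananas
                rule_invalid += 1
    print("{0} cases of the rule being valid were discovered".format(rule_valid))
    print("{0} cases of the rule being invalid were discovered".format(rule_invalid))

This gives the number of customers who bought both apples and bananas, and the number who bought apples but not bananas:

    21 cases of the rule being valid were discovered
    15 cases of the rule being invalid were discovered

From these counts, the support of the rule "customers who buy apples also buy bananas" is 21, i.e. rule_valid.
Confidence is the number of customers who bought both apples and bananas divided by the number who bought apples, i.e. rule_valid / num_apple_purchases.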
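
The two definitions above can be wrapped in one small function. This is a sketch under stated assumptions: rule_stats and X_small are hypothetical names introduced here, support is taken as a raw count (as in this post, not a fraction of all customers), and the handling of a premise that nobody bought is my own choice.

```python
import numpy as np

def rule_stats(X, premise, conclusion):
    """Return (support, confidence) for the rule 'premise -> conclusion'.

    premise and conclusion are column indices into X.
    """
    bought_premise = X[:, premise] == 1
    bought_both = bought_premise & (X[:, conclusion] == 1)
    support = int(bought_both.sum())        # times the rule holds
    n_premise = int(bought_premise.sum())   # times the rule applies
    # Assumption: if the premise never occurs, confidence is undefined (NaN).
    confidence = support / n_premise if n_premise else float("nan")
    return support, confidence

# The five rows printed earlier: bread, milk, cheese, apples, bananas.
X_small = np.array([
    [0, 0, 1, 1, 1],
    [1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1],
], dtype=float)

support, confidence = rule_stats(X_small, premise=3, conclusion=4)
print("support={0}, confidence={1:.3f}".format(support, confidence))
```

On these five rows, four customers bought apples and two of them also bought bananas, so the function returns a support of 2 and a confidence of 0.5.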

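5. Compute the support and confidence of the rule:

    support = rule_valid  # The support is the number of times the rule is discovered.
    confidence = rule_valid / num_apple_purchases
    print("The support is {0} and the confidence is {1:.3f}.".format(support, confidence))
    # Confidence can be thought of as a percentage using the following:
    print("As a percentage, that is {0:.1f}%.".format(100 * confidence))

The confidence is printed to three decimal places and then as a percentage:

    The support is 21 and the confidence is 0.583.
    As a percentage, that is 58.3%.

In other words, about 58.3% of the customers who bought apples also bought bananas. Confidence therefore gives an indication of customers' purchasing tendencies, which can be used to design sensible promotions and increase profit.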
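
The calculation is not specific to apples and bananas. As a hedged sketch of how it might be extended (this is my own generalization, not code from the post), the block below computes support and confidence for every ordered pair of products and ranks the rules by confidence. The names all_rule_confidences and X_small are hypothetical, and the data is just the five rows printed earlier, so the numbers are for illustration only.

```python
from itertools import permutations

import numpy as np

features = ["bread", "milk", "cheese", "apples", "bananas"]

# The five rows printed earlier, one column per product in `features`.
X_small = np.array([
    [0, 0, 1, 1, 1],
    [1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1],
], dtype=float)

def all_rule_confidences(X):
    """Map (premise, conclusion) -> (support, confidence) for all column pairs."""
    results = {}
    for premise, conclusion in permutations(range(X.shape[1]), 2):
        bought_premise = X[:, premise] == 1
        n_premise = int(bought_premise.sum())
        if n_premise == 0:
            continue  # the rule never applies, so confidence is undefined
        support = int((bought_premise & (X[:, conclusion] == 1)).sum())
        results[(premise, conclusion)] = (support, support / n_premise)
    return results

rule_table = all_rule_confidences(X_small)

# Rank rules from highest to lowest confidence.
ranked = sorted(rule_table.items(), key=lambda item: item[1][1], reverse=True)
for (premise, conclusion), (support, confidence) in ranked[:5]:
    print("If a person buys {0} they will also buy {1}: support={2}, confidence={3:.3f}".format(
        features[premise], features[conclusion], support, confidence))
```

With only five rows, several rules tie at a confidence of 1.0, which is one reason to look at support alongside confidence: a rule that holds in two cases out of two is weaker evidence than one that holds in 21 out of 36.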