kNN algorithm example (Python)
2017-11-17 21:33
Reference link (includes an explanation and the original data)
import csv
import random
import math
import operator


# trainingSet and testSet are filled in place, so the caller must pass its own
# lists (mutable default arguments would be shared between calls).
def loadDataset(filename, split, trainingSet, testSet):
    # Note: opening the file in 'b' (binary) mode fails here, because
    # csv.reader in Python 3 expects a text-mode file.
    with open(filename, 'r') as csvfile:
        dataset = list(csv.reader(csvfile))
    for row in dataset:
        # Skip blank rows, e.g. an empty line at the end of the file.
        if not row:
            continue
        # Convert the four feature columns to float; the label stays a string.
        for y in range(4):
            row[y] = float(row[y])
        # Send roughly `split` of the rows to the training set, the rest to the test set.
        if random.random() < split:
            trainingSet.append(row)
        else:
            testSet.append(row)


def euclideanDistance(instance1, instance2, length):
    # Euclidean distance over the first `length` components.
    distance = 0
    for x in range(length):
        distance += pow(instance1[x] - instance2[x], 2)
    return math.sqrt(distance)

# test for function euclideanDistance
# data1 = [2, 2, 2, 'a']
# data2 = [4, 4, 4, 'b']
# distance = euclideanDistance(data1, data2, 3)
# print(distance)


def getNeighbors(trainingSet, testInstance, k):
    distances = []
    # The last element of testInstance is the class label, so it is excluded.
    length = len(testInstance) - 1
    for x in range(len(trainingSet)):
        dist = euclideanDistance(testInstance, trainingSet[x], length)
        distances.append((trainingSet[x], dist))
    # print(distances)
    # Sort by distance, nearest first.
    distances.sort(key=operator.itemgetter(1))
    # print(distances)
    neighbors = []
    for x in range(k):
        neighbors.append(distances[x][0])
    return neighbors

# test for function getNeighbors
# trainSet = [[2, 2, 2, 'a'], [4, 4, 4, 'b'],[4.5, 4.5, 4.5, 'c']]
# testInstance = [5, 5, 5]
# k = 1
# neighbors = getNeighbors(trainSet, testInstance, 1)
# print(neighbors)


def getResponse(neighbors):
    # Majority vote over the class labels of the neighbors.
    classVotes = {}
    for x in range(len(neighbors)):
        response = neighbors[x][-1]
        if response in classVotes:
            classVotes[response] += 1
        else:
            classVotes[response] = 1
    # Python 3 uses items(); Python 2 used iteritems().
    sortedVotes = sorted(classVotes.items(), key=operator.itemgetter(1), reverse=True)
    return sortedVotes[0][0]

# test for function getResponse
# neighbors = [[1, 1, 1, 'a'], [2, 2, 2, 'a'], [3, 3, 3, 'b']]
# response = getResponse(neighbors)
# print(response)


def getAccuracy(testSet, predictions):
    # Percentage of test rows whose label matches the prediction.
    correct = 0
    for x in range(len(testSet)):
        if testSet[x][-1] == predictions[x]:
            correct += 1
    return (correct/float(len(testSet)))*100.0


def main():
    # prepare data
    trainingSet = []
    testSet = []
    loadDataset('f:/iris.csv', 0.66, trainingSet, testSet)
    print("Train: " + repr(len(trainingSet)))
    print("Test: " + repr(len(testSet)))
    # print(trainingSet)
    # generate predictions
    predictions = []
    k = 3
    for x in range(len(testSet)):
        neighbors = getNeighbors(trainingSet, testSet[x], k)
        # print(neighbors)
        result = getResponse(neighbors)
        predictions.append(result)
        print('> predicted=' + repr(result) + ', actual=' + repr(testSet[x][-1]))
    accuracy = getAccuracy(testSet, predictions)
    print('Accuracy: ' + repr(accuracy) + '%')


if __name__ == '__main__':
    main()
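The listing above needs a local iris.csv file before it will run. As a rough sketch of the same three steps (distance, neighbor selection, majority vote) on a tiny in-memory dataset, the snippet below uses math.dist (Python 3.8+) and collections.Counter from the standard library in place of the hand-written helpers. The function name knn_predict and the toy points are made up for illustration and are not part of the original post.

```python
import math
from collections import Counter


def knn_predict(training_set, query, k):
    """Predict a label for `query` by majority vote among its k nearest rows.

    Each training row is [feature_1, ..., feature_n, label]; `query` holds
    features only. (Hypothetical helper, not part of the original listing.)
    """
    # Sort training rows by Euclidean distance to the query point.
    ranked = sorted(training_set, key=lambda row: math.dist(row[:-1], query))
    # Count the labels of the k closest rows and return the most common one.
    votes = Counter(row[-1] for row in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data invented for this example: two well-separated clusters.
toy_training_set = [
    [1.0, 1.0, 'a'],
    [1.2, 0.8, 'a'],
    [0.9, 1.1, 'a'],
    [5.0, 5.0, 'b'],
    [5.2, 4.9, 'b'],
    [4.8, 5.1, 'b'],
]

print(knn_predict(toy_training_set, [1.1, 1.0], k=3))  # prints: a
print(knn_predict(toy_training_set, [5.1, 5.0], k=3))  # prints: b
```

Unlike getNeighbors in the listing, this sketch expects the query to contain features only, so no label has to be stripped before computing the distance.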