Understanding Gradient Descent and Stochastic Gradient Descent, with a Small Code Sample for a Movie Recommender System (Part 2)

2017-10-22 23:25, 661 views

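This is the second half of the post with this title. I split it mainly because the browser apparently cannot cache that much content: whenever the draft reached a certain length the editor crashed, which nearly drove me mad.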

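For the last part, the teacher gave us a dataset of 800,000 rows to process on our own. Running it through the code above would have been enough, but I also made a 3D plot of my own.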

import numpy as np
import pandas as pd

# Three dimensions: x is the item, y is the rating, z is the number of users who rated the item.

# x axis: the item ids (these stand in for the movie names), sorted so they line up with y and z
x = np.array(sorted(set(Y4.item)))

# y axis: start from the global mean rating.
# Y4 is the original DataFrame (question 3 already used the name Y, so I renamed it
# to make sure the two questions do not interfere with each other).
rating_mean = Y4.rating.mean()

# Center the ratings. Note that y below still starts at the uncentered mean.
Y4['rating'] -= rating_mean

# One rating estimate per movie, indexed 1..number of movies to match the item ids
y = pd.DataFrame(np.full(x.shape[0], rating_mean), index=np.arange(1, x.shape[0] + 1))

# user ids 1..943
users = np.arange(1, 944)

def movie_stochastic_gradient(Y4, y):
    # gradient with the same shape and index as y
    gy = pd.DataFrame(np.zeros(y.shape), index=y.index)
    # pick one user uniformly at random (randint excludes the upper bound)
    random_user = users[np.random.randint(users.shape[0])]
    # keep only this user's rows, which makes the ratings easy to look up
    Y4_newform = Y4[Y4.user == random_user]
    # the same item can appear several times for one user, so deduplicate
    items = list(set(Y4_newform.item))
    for item in items:
        # repeated rows carry the same rating, so the first one is enough
        rating = Y4_newform.rating[Y4_newform.item == item].iloc[0]
        # gradient of the squared error (y - rating)^2 with respect to y
        gy.loc[item, 0] += 2 * (y.loc[item, 0] - rating)
    return gy

learning_rate = 0.01
iterations = 100

for i in range(iterations):
    gy = movie_stochastic_gradient(Y4, y)
    y -= learning_rate * gy
    print('^_^ We have iterated', i + 1, 'times.')
    #print(y)

# z axis: the number of users who rated each film
z = np.zeros(x.shape)  # z[i] is how many times the movie with item id i + 1 was rated

for item in Y4.item:
    z[item - 1] += 1

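# show the 3D scatter plot
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # only needed to register the 3D projection on older matplotlib

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
# y is a one-column DataFrame, so pass its values as a flat array
ax.scatter(x, y[0].to_numpy(), z)
ax.set_xlabel('film')
ax.set_ylabel('rating')
ax.set_zlabel('users_num')
plt.show()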
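As a side note on the z axis above: the per-movie rating count can probably also be computed without an explicit Python loop. The following is only a sketch on a made-up toy table; `ratings`, `n_items`, `counts` and `z_toy` are hypothetical names, and the only assumption carried over from the real data is that the item ids are integers starting at 1.

```python
import numpy as np
import pandas as pd

# Hypothetical toy ratings table with the same column names as Y4.
ratings = pd.DataFrame({
    "user":   [1, 1, 2, 2, 3],
    "item":   [1, 2, 1, 3, 1],
    "rating": [5, 3, 4, 2, 1],
})

# Assumption: item ids run from 1 up to the largest id that occurs.
n_items = ratings["item"].max()

# Count how often each item id occurs; ids that never occur get a count of 0.
counts = (
    ratings["item"]
    .value_counts()
    .reindex(np.arange(1, n_items + 1), fill_value=0)
)

# Position i holds the count for item id i + 1, like z in the loop version.
z_toy = counts.to_numpy()
```

On the full dataset this should give the same numbers as the loop, as long as the ids really are contiguous.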

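I mainly picked three variables: the movie's name (its item id), the number of times the movie was rated, and the movie's rating after it has been adjusted by the users' ratings.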
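To make "the rating after it has been adjusted by the users" a bit more concrete, here is a minimal sketch of the same idea on a tiny made-up table. It is not the exact code from this post: the data, the function name `sgd_item_means` and its parameters are all hypothetical, the ratings are not centered, and the estimate is updated directly instead of going through a separate gradient DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 3 users, 2 movies, same column names as Y4.
toy = pd.DataFrame({
    "user":   [1, 1, 2, 3, 3],
    "item":   [1, 2, 1, 1, 2],
    "rating": [4.0, 2.0, 5.0, 3.0, 1.0],
})

def sgd_item_means(ratings, n_iter=2000, learning_rate=0.01, seed=0):
    """Estimate one value per item by SGD on the squared error (y_i - r_ui)^2."""
    rng = np.random.default_rng(seed)
    item_ids = np.sort(ratings["item"].unique())
    # Start every item at the global mean rating.
    y = pd.Series(ratings["rating"].mean(), index=item_ids, dtype=float)
    user_ids = ratings["user"].unique()
    for _ in range(n_iter):
        # Pick one user at random and use only that user's ratings.
        user = rng.choice(user_ids)
        rows = ratings[ratings["user"] == user]
        for item, rating in zip(rows["item"], rows["rating"]):
            # The gradient of (y_i - r)^2 with respect to y_i is 2 * (y_i - r).
            y.loc[item] -= learning_rate * 2.0 * (y.loc[item] - rating)
    return y

estimates = sgd_item_means(toy)
# Plain per-item averages, for comparison with the SGD estimates.
per_item_mean = toy.groupby("item")["rating"].mean()
```

With enough iterations each estimate should hover around the plain average rating of that movie, because each step pulls the estimate a little toward one user's rating. With a constant learning rate it keeps jittering slightly instead of settling exactly.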

The final plot looks roughly like this:
