Understanding Gradient Descent and Stochastic Gradient Descent, with a Small Code Sample for a Movie Recommender System (Part 2)
2017-10-22 23:25
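This is the second half of the post under this title. I split it mainly because the browser does not seem able to cache that much content: every time I wrote past a certain point it crashed, which nearly drove me mad.
For the last part, the teacher gave us 800,000 rows of data to process on our own. Reusing the code above would have been enough, but I also wrote a 3D plot of my own.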
import numpy as np
import pandas as pd

# Three dimensions: x is the item, y is its rating, z is the number of people who rated it.
# Y4 is the original DataFrame (renamed from Y so that it cannot be confused with question 3).

# Axis x: the item ids, which stand in for the movie names.
x = np.sort(Y4.item.unique())

# Axis y: one rating estimate per item, every one starting at the mean rating.
rating_mean = Y4.rating.mean()
# Note: from here on the ratings are mean-centred, while y below starts at the uncentred mean.
Y4.rating -= rating_mean
y = pd.DataFrame(np.full(x.shape[0], rating_mean), index=x)  # indexed by item id, so no dummy row to drop

# User ids 1..943 (943 users).
users = np.arange(1, 944)

def movie_stochastic_gradient(Y4, y):
    """Gradient of the squared error over the items rated by one randomly chosen user."""
    gy = pd.DataFrame(np.zeros(y.shape), index=y.index)
    # randint excludes its upper bound, so users.shape[0] (not users.shape[0] - 1) covers every user.
    random_user = users[np.random.randint(users.shape[0])]
    # Keep only this user's rows so that the ratings are easy to look up.
    Y4_newform = Y4[Y4.user == random_user]
    for item in set(Y4_newform.item):
        # The same (item, rating) pair can be repeated several times; take the first one.
        rating = Y4_newform.rating[Y4_newform.item == item].iloc[0]
        gy.loc[item, 0] += 2 * (y.loc[item, 0] - rating)
    return gy

learning_rate = 0.01
iterations = 100
for i in range(iterations):
    gy = movie_stochastic_gradient(Y4, y)
    print('^_^ We have iterated', i, 'times.')
    y -= learning_rate * gy

# Axis z: the number of times each film was rated.
z = np.zeros(x.shape)  # z[k] counts the ratings of item k + 1 (item ids are assumed to be 1..n)
for item in Y4.item:
    z[item - 1] += 1

# Show the 3D plot.
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(x, y[0].values, z)
ax.set_xlabel('film')
ax.set_ylabel('rating')
ax.set_zlabel('users_num')
plt.show()
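I mainly picked three variables: the name of the film, the number of times the film was rated, and the film's rating after it has been updated with the users' ratings.
The final result looks roughly like this:
[Figure: 3D scatter plot with axes film, rating, and users_num]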
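To try the idea without the full data set, here is a minimal sketch of the same per-user stochastic gradient update on a tiny made-up table. The `ratings` frame, the function name `sgd_item_scores`, and the seed are all assumptions for illustration; only the column names follow `Y4`. Unlike the code above, this sketch does not mean-centre the ratings, and it averages a user's repeated ratings of one item instead of taking the first.

```python
import numpy as np
import pandas as pd

# Hypothetical toy ratings table; the column names mirror Y4 in the post.
ratings = pd.DataFrame({
    "user":   [1, 1, 2, 2, 3, 3, 3],
    "item":   [1, 2, 1, 3, 1, 2, 3],
    "rating": [5.0, 3.0, 4.0, 2.0, 3.0, 1.0, 4.0],
})

def sgd_item_scores(ratings, learning_rate=0.01, iterations=100, seed=0):
    """Return one score per item, updated from one random user per step."""
    rng = np.random.default_rng(seed)
    users = ratings["user"].unique()
    # Every item starts at the global mean rating.
    scores = pd.Series(ratings["rating"].mean(),
                       index=np.sort(ratings["item"].unique()), dtype=float)
    for _ in range(iterations):
        user = rng.choice(users)
        # Average duplicates so that each item contributes one target rating.
        seen = ratings[ratings["user"] == user].groupby("item")["rating"].mean()
        # Gradient of sum (score - rating)^2 over the items this user rated.
        grad = 2.0 * (scores.loc[seen.index] - seen)
        scores.loc[seen.index] -= learning_rate * grad
    return scores

scores = sgd_item_scores(ratings)                        # the "rating" axis
counts = ratings["item"].value_counts().sort_index()    # the "users_num" axis
print(pd.DataFrame({"score": scores, "count": counts}))
```

Because each step moves a score only 2% of the way towards the sampled user's rating, the scores stay between the smallest and largest rating in the table. `value_counts` also replaces the explicit counting loop and does not depend on the item ids being 1..n.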