
Scraping Taobao Comments

2017-12-21 20:11
import requests

# Send a browser-like User-Agent header with every request.
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36'
}
'''
# Attempt 1 (disabled): fetch a single page of reviews.
url = ('https://rate.tmall.com/list_detail_rate.htm?itemId=529571326323'
       '&spuId=524781228&sellerId=2484777365&order=3&currentPage=2'
       '&append=0&content=1&tagId=&posi=&picture=&ua=098')
res = requests.get(url, headers=headers)
# print(res.text)
'''
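'''
# Attempt 2 (disabled): parse the response from attempt 1 as JSON.
import json

# lstrip()/rstrip() remove any of the listed characters, not a literal prefix;
# this works only as long as the payload itself starts with an opening brace.
jd = json.loads(res.text.lstrip("\r\n\njsonp1510(").rstrip(")"))
# print(jd)

# Total review count.
comment = jd['rateDetail']['rateCount']['total']
# print(comment)
a = jd['rateDetail']['rateList']
# print(a)

# Print the appended (follow-up) comment of each review.
for b in a:
    c = b['appendComment']
    print(c)
'''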
import re
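'''
# Attempt 3 (disabled): pull the comment text out of pages 1-10 with a regex.
for i in range(1, 11):
    url = 'https://rate.tmall.com/list_detail_rate.htm?itemId=529571326323&spuId=524781228&sellerId=2484777365&order=3&currentPage={}&append=0&content=1&tagId=&posi=&picture=&ua=098'.format(i)
    res = requests.get(url, headers=headers)
    # The opening quote sits outside the group so it is not captured with the text.
    pat = 'ontent":"(.*?)","'
    com = re.findall(pat, res.text)
    for a in com:
        print(a)

'''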
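# Final version: append the comments from pages 1-10 to a text file.
# The encoding is set explicitly so the text does not depend on the system default.
with open('C:/Users/Administrator/Desktop/taobao_comments.txt', 'a+', encoding='utf-8') as f:
    for i in range(1, 11):
        try:
            url = 'https://rate.tmall.com/list_detail_rate.htm?itemId=529571326323&spuId=524781228&sellerId=2484777365&order=3&currentPage={}&append=0&content=1&tagId=&posi=&picture=&ua=098'.format(i)
            res = requests.get(url, headers=headers)
            # Capture the quoted value of every key ending in "ontent";
            # the opening quote sits outside the group so it is not captured.
            pat = 'ontent":"(.*?)","'
            com = re.findall(pat, res.text)
            for a in com:
                print(a)
                f.write(a)
                f.write('\n')
        except Exception as exc:
            # Report the failing page and carry on with the next one.
            print('Problem scraping page ' + str(i) + ': ' + str(exc))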
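
The disabled JSON attempt in the script strips the `jsonp1510(` wrapper with `lstrip`/`rstrip`, which remove characters from a set rather than a literal prefix. As a possible alternative, here is a minimal sketch that cuts the payload out between the first `(` and the last `)` before parsing. The helper name `parse_jsonp` is hypothetical. The sample string is made up to mimic the keys the script reads (`rateDetail`, `rateCount`, `rateList`, `appendComment`). The inner `content` field is an assumption suggested by the regex, not a confirmed response format.

```python
import json


def parse_jsonp(text):
    """Return the JSON payload of a response shaped like callback({...}).

    Hypothetical helper: plain JSON without a wrapper is parsed unchanged.
    """
    text = text.strip()
    start = text.find('(')
    end = text.rfind(')')
    # Only unwrap when the text really ends with the wrapper's closing parenthesis.
    if start != -1 and end > start and text.endswith(')'):
        text = text[start + 1:end]
    return json.loads(text)


# Made-up sample mimicking the keys used in the script; not real data.
sample = '\r\n\njsonp1510({"rateDetail": {"rateCount": {"total": 2}, "rateList": [{"appendComment": null}, {"appendComment": {"content": "still works"}}]}})'

jd = parse_jsonp(sample)
total = jd['rateDetail']['rateCount']['total']
appended = [r['appendComment'] for r in jd['rateDetail']['rateList']]
print(total)
print(appended)
```

Reading fields from the parsed dictionary avoids the regex's dependence on the exact order of keys and quotes in the raw text, at the cost of requiring the response to be valid JSON once unwrapped.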
Tags: python