Python Crawler in Practice (9): Scraping Dynamic Web Pages
2017-10-29 22:50
# coding=utf-8
# Script 1: POST to the Sina quotes endpoint and print the returned rows as a table.
import re
import json
import requests
from prettytable import PrettyTable


def getHtml(url):
    # Form fields sent with the POST request.
    data = {
        'page': 1,
        'num': 40,
        'sort': 'symbol',
        'asc': 1,
        'node': 'cyb',
        'symbol': '',
        '_s_r_a': 'page',
    }
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0'}
    try:
        page = requests.post(url, data=data, headers=headers)
        page.encoding = 'gbk'
        return page.text
    except requests.RequestException:
        # Return an empty string on any network error.
        return ""


def getdata(html):
    # Quote the bare keys so that json.loads accepts the text.
    data = html.replace(':', '":')
    data = data.replace(',', ',"')
    data = data.replace('{', '{"')
    data = data.replace('"{', '{')
    # Remove H:M:S time values (such as 15:06:03) mangled by the colon replacement above.
    data = re.sub(r'\d+":\d+":\d+', '', data)
    data = json.loads(data)
    row = PrettyTable()
    row.field_names = ["Code", "Name", "Last", "Change", "Change %", "Bid", "Ask",
                       "Prev Close", "Open", "High", "Low", "Volume (lots)", "Turnover (10k)"]
    for item in data:
        row.add_row((item['symbol'], item['name'], item['trade'], item['pricechange'],
                     item['changepercent'], item['buy'], item['sell'], item['settlement'],
                     item['open'], item['high'], item['low'], item['volume'], item['amount']))
    print(row)


if __name__ == '__main__':
    url = 'http://vip.stock.finance.sina.com.cn/quotes_service/api/json_v2.php/Market_Center.getHQNodeData?'
    html = getHtml(url)
    if html:  # skip parsing when the request failed
        getdata(html)


# coding=utf-8
# Script 2: POST paging parameters to a report page and print the raw HTML.
import requests


def getHtml(url):
    # Paging fields sent with the POST request.
    data = {
        'page.pageNo': 2,
        'tempPageSize': 40,
    }
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0'}
    page = requests.post(url, headers=headers, data=data)
    html = page.text
    print(html)


if __name__ == '__main__':
    url = 'http://datacenter.mep.gov.cn:8099/ths-report/report!list.action?xmlname=1465594312346'
    getHtml(url)
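The getdata function above patches the response with a chain of str.replace calls and then deletes the time field, because replacing every colon also breaks values like 15:06:03. A gentler alternative, sketched below, is to quote only the bare keys with a single regular expression so that the values are left alone. This is a rough sketch rather than a drop-in fix: the SAMPLE payload is made up for illustration (its shape is only inferred from the keys the original code reads), and the names quote_bare_keys and parse_rows are hypothetical helpers, not part of any library.

```python
import json
import re

# Hypothetical payload with unquoted keys, shaped like what getdata appears to expect.
SAMPLE = (
    '[{symbol:"sz300001",name:"Example A",trade:"15.20",ticktime:"15:06:03"},'
    '{symbol:"sz300002",name:"Example B",trade:"8.45",ticktime:"15:06:03"}]'
)

# A bare key is an identifier that directly follows "{" or "," and precedes ":".
BARE_KEY = re.compile(r'([{,])\s*([A-Za-z_]\w*)\s*:')


def quote_bare_keys(text):
    """Wrap each unquoted object key in double quotes, leaving values untouched."""
    return BARE_KEY.sub(r'\1"\2":', text)


def parse_rows(text):
    """Turn the non-standard payload into a list of dicts."""
    return json.loads(quote_bare_keys(text))


if __name__ == '__main__':
    for item in parse_rows(SAMPLE):
        print(item['symbol'], item['name'], item['trade'], item['ticktime'])
```

One caveat: the pattern would also fire inside a string value that happens to contain something like ,word: so it is only safe for payloads whose values are plain numbers, names, and times.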