
R: Simple Data Scraping with the rvest Package

2017-03-29 10:21
This post shows how to use the rvest package to scrape air-quality data from the Tianqihoubao site (tianqihoubao.com).

The code is as follows:

library(rvest)
# Optional reachability check; session() replaces html_session() in rvest 1.0.0 and later
session("http://www.tianqihoubao.com/aqi/chengdu-201612.html")
# Monthly AQI pages for Chengdu, January to December 2016
url1 <- "http://www.tianqihoubao.com/aqi/chengdu-201601.html"
url2 <- "http://www.tianqihoubao.com/aqi/chengdu-201602.html"
url3 <- "http://www.tianqihoubao.com/aqi/chengdu-201603.html"
url4 <- "http://www.tianqihoubao.com/aqi/chengdu-201604.html"
url5 <- "http://www.tianqihoubao.com/aqi/chengdu-201605.html"
url6 <- "http://www.tianqihoubao.com/aqi/chengdu-201606.html"
url7 <- "http://www.tianqihoubao.com/aqi/chengdu-201607.html"
url8 <- "http://www.tianqihoubao.com/aqi/chengdu-201608.html"
url9 <- "http://www.tianqihoubao.com/aqi/chengdu-201609.html"
url10 <- "http://www.tianqihoubao.com/aqi/chengdu-201610.html"
url11 <- "http://www.tianqihoubao.com/aqi/chengdu-201611.html"
url12 <- "http://www.tianqihoubao.com/aqi/chengdu-201612.html"
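# Scrape one monthly page; return its data rows as a character matrix
fun <- function(x) {
  web <- read_html(x, encoding = "gb2312")  # html() is deprecated; use read_html()
  qq <- web %>% html_nodes("td") %>% html_text()  # text of every table cell
  m <- matrix(qq, nrow = 10)  # each table row has 10 cells
  p <- t(m)                   # one matrix row per table row
  # UTF-8 to GBK, presumably for a Chinese-locale Windows console; skip on UTF-8 systems
  p <- iconv(p, "utf-8", "gbk")
  p <- gsub("^\\s+|\\s+$", "", p)  # trim surrounding whitespace
  p[-1, ]  # drop the header row
}
# Stack all twelve months and write the result to a text file
p <- rbind(fun(url1), fun(url2), fun(url3), fun(url4), fun(url5), fun(url6),
           fun(url7), fun(url8), fun(url9), fun(url10), fun(url11), fun(url12))
write.table(p, file = "p.txt")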
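
The reshaping step inside fun() may be easier to follow on a toy input. The sketch below uses base R only, with made-up cell values and 3 columns instead of the page's 10; the names cells, n_col, and data_rows are illustrative and not part of the original post. It shows that filling a matrix with nrow set to the column count and then transposing gives one row per table row, the same as filling by row directly.

```r
# Made-up cell texts, in the order html_text() would return them
# (row by row, header first). Values are invented for illustration.
cells <- c("date",       "aqi", "pm25",
           "2016-01-01", "150", "115",
           "2016-01-02", "98",  "73")
n_col <- 3  # the real page has 10 cells per row

# matrix() fills column by column, so each column of m is one table row
m <- matrix(cells, nrow = n_col)
# Transposing turns those columns back into rows
p <- t(m)

# Equivalent one-step form: fill by row with a fixed column count
p_alt <- matrix(cells, ncol = n_col, byrow = TRUE)
stopifnot(all(p == p_alt))

# Drop the header row, keeping the matrix shape
data_rows <- p[-1, , drop = FALSE]
print(data_rows)
```

Filling with byrow = TRUE is arguably the more direct form, but both produce the same matrix.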


The scraped results are shown in the figure below:



The batch of URLs above can also be generated with paste0() and a loop.
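
A minimal sketch of that idea follows. The names base_url, months, urls, and scrape_all are hypothetical helpers introduced here, not part of the original post. The scraper is passed in as an argument, so the loop can be tried without network access before it is pointed at fun().

```r
# Build the twelve monthly URLs for 2016 instead of typing them out
base_url <- "http://www.tianqihoubao.com/aqi/chengdu-"
months <- sprintf("2016%02d", 1:12)        # "201601", "201602", ..., "201612"
urls <- paste0(base_url, months, ".html")

# Apply a scraper function to every URL and stack the resulting matrices.
# 'scraper' is any function that takes one URL and returns a matrix.
scrape_all <- function(urls, scraper) {
  pages <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    pages[[i]] <- scraper(urls[i])
  }
  do.call(rbind, pages)
}

# Usage with the fun() defined in the post (needs network access):
# p <- scrape_all(urls, fun)
# write.table(p, file = "p.txt")
```

Collecting the pages in a list and calling rbind once at the end avoids growing the matrix on every iteration.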