
Fetching a Website's Carousel Images

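2016-05-29 11:19, 627 views
The method below fetches the carousel (slideshow) images of a website: it loads the page with Jsoup, cuts the "var data = ..." array out of an inline script, and collects the image URL and title of each entry.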

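import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Assumption: JSONArray.fromObject is the json-lib API, so net.sf.json is imported here.
import net.sf.json.JSONArray;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class BBMCCrowler {

    /**
     * Fetches the carousel images from a page.
     *
     * @author Michael
     * @param newsUrl URL of the page that contains the carousel
     * @return one map per image with the keys "picUrl" and "title";
     *         empty if the page could not be fetched or parsed
     */
    public List<Map<String, String>> crawler4Pic(String newsUrl) {
        List<Map<String, String>> picList = new ArrayList<Map<String, String>>();
        try {
            Document newsPageDoc = Jsoup.connect(newsUrl)
                    .header("Content-Type", "text/html; charset=GB2312")
                    .header("Accept-Language", "zh-CN,zh;q=0.8").timeout(3000)
                    .get(); // fetch the DOM of the current page
            // The carousel data sits in the sixth <script> element as "var data = [...];".
            // Both the index 5 and the split on ";" are tied to this page's current markup.
            String picString = newsPageDoc.select("script").eq(5).toString()
                    .split("var data = ")[1].split(";")[0];
            JSONArray picArray = JSONArray.fromObject(picString);
            for (int i = 0; i < picArray.size(); i++) {
                // Split the entry on double quotes once. For an entry with three
                // string fields, indexes 3, 7 and 11 hold the three values.
                String[] parts = picArray.getString(i).split("\"");
                String picUrl = parts[3];
                String title = parts[7] + ":" + parts[11];
                Map<String, String> picMap = new HashMap<String, String>();
                picMap.put("picUrl", picUrl);
                picMap.put("title", title);
                picList.add(picMap);
                System.out.println("picUrl: " + picUrl + "   title:  " + title);
            }
        } catch (Exception e) {
            // Network errors and unexpected markup both end up here.
            e.printStackTrace();
        }
        return picList;
    }

    /**
     * Runs the crawler against http://www.bbmc.edu.cn/.
     *
     * @param args unused
     */
    public static void main(String[] args) {
        BBMCCrowler bbmc = new BBMCCrowler();
        bbmc.crawler4Pic("http://www.bbmc.edu.cn/");
    }
}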
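
One weak spot in the code above is the split on ";": if an image URL or title inside the data contains a semicolon, the array is cut short and parsing fails. The following is only a sketch of a more tolerant extraction, using nothing but the JDK. The class name CarouselData and the method extractArrayLiteral are made up for this illustration and are not part of the original code. It assumes that the script text contains "var data" followed by an array literal, and it scans for the matching closing bracket while skipping over string contents.

```java
import java.util.Optional;

// Hypothetical helper, not part of the original post.
class CarouselData {

    /**
     * Returns the array literal that follows "var data" in the script text,
     * or an empty Optional if no complete literal is found.
     */
    static Optional<String> extractArrayLiteral(String scriptText) {
        int decl = scriptText.indexOf("var data");
        if (decl < 0) {
            return Optional.empty();
        }
        // Assumption: the first '[' after the declaration opens the data array.
        int start = scriptText.indexOf('[', decl);
        if (start < 0) {
            return Optional.empty();
        }
        int depth = 0;
        boolean inString = false;
        char quote = 0;
        for (int i = start; i < scriptText.length(); i++) {
            char c = scriptText.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == quote) {
                    inString = false;
                }
            } else if (c == '"' || c == '\'') {
                inString = true;
                quote = c;
            } else if (c == '[') {
                depth++;
            } else if (c == ']') {
                depth--;
                if (depth == 0) {
                    return Optional.of(scriptText.substring(start, i + 1));
                }
            }
        }
        return Optional.empty(); // brackets never balanced
    }

    public static void main(String[] args) {
        // Sample input with a ';' and a ']' inside string values.
        String script = "var data = [{\"src\":\"a;b.jpg\",\"title\":\"x]y\"}]; var other = 1;";
        System.out.println(extractArrayLiteral(script).orElse("(not found)"));
    }
}
```

In crawler4Pic, the returned string could be passed to JSONArray.fromObject in place of picString. Searching every script element for "var data" instead of using the fixed index 5 would also make the crawler less dependent on the order of scripts in the page.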