Fetching a page through a proxy and obtaining the final redirect URL of the requested address
2012-12-17 16:32
<%@page import="java.net.URI"%>
<%@page import="java.io.IOException"%>
<%@page import="org.apache.http.HttpHost"%>
<%@page import="org.apache.http.HttpResponse"%>
<%@page import="org.apache.http.HttpStatus"%>
<%@page import="org.apache.http.client.ClientProtocolException"%>
<%@page import="org.apache.http.client.methods.HttpGet"%>
<%@page import="org.apache.http.client.methods.HttpUriRequest"%>
<%@page import="org.apache.http.impl.client.DefaultHttpClient"%>
<%@page import="org.apache.http.protocol.BasicHttpContext"%>
<%@page import="org.apache.http.protocol.ExecutionContext"%>
<%@page import="org.apache.http.protocol.HttpContext"%>
<%@page import="org.apache.http.client.utils.URLEncodedUtils"%>
<%@page import="java.net.URLEncoder"%>
<%@page import="java.io.UnsupportedEncodingException"%>
<%@page import="org.apache.http.impl.client.DefaultRedirectHandler"%>
<%@page import="org.apache.http.ProtocolException"%>
<%@page import="org.apache.http.Header"%>
<%@page import="java.net.URISyntaxException"%>
<%@ taglib uri="http://java.sun.com/jstl/core" prefix="c" %>
<%@ taglib uri="http://www.duxiu.com/proxy" prefix="proxy" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<%!
// Redirect handler that rewrites relative Location headers containing "---"
// before HttpClient follows them. The "-----" prefixes are kept exactly as posted.
class CustomRedirectHandler extends DefaultRedirectHandler {
    @Override
    public URI getLocationURI(HttpResponse response, HttpContext context) throws ProtocolException {
        if (isRedirectRequested(response, context)) {
            Header locationHeader = response.getFirstHeader("location");
            // Guard against a redirect response that carries no Location header.
            String location = (locationHeader != null) ? locationHeader.getValue() : null;
            if (location != null && !"".equals(location)
                    && !location.startsWith("http") && location.contains("---")) {
                response.removeHeaders("location");
                response.setHeader("location", "-----" + location);
                // Use one index for both the prefix and the value so that a second
                // "url=" inside the value cannot duplicate part of the string.
                int valueStart = location.indexOf("url=") + 4;
                try {
                    // Encode the value of the url= parameter so the result parses as a URI.
                    return new URI("------" + location.substring(0, valueStart)
                            + URLEncoder.encode(location.substring(valueStart), "UTF-8"));
                } catch (URISyntaxException e) {
                    // Report the failure instead of handing a null URI back to HttpClient.
                    throw new ProtocolException("Invalid redirect location: " + location, e);
                } catch (UnsupportedEncodingException e) {
                    throw new ProtocolException("Invalid redirect location: " + location, e);
                }
            }
        }
        return super.getLocationURI(response, context);
    }
}
%>
<%!
// Requests the URL, follows any redirects, and returns the URL of the last request executed.
public String test1(String url) {
    DefaultHttpClient httpClient = new DefaultHttpClient();
    httpClient.setRedirectHandler(new CustomRedirectHandler());
    HttpGet httpget = new HttpGet(url);
    HttpContext context = new BasicHttpContext();
    try {
        HttpResponse response = httpClient.execute(httpget, context);
        if (response.getStatusLine().getStatusCode() != HttpStatus.SC_OK) {
            // Non-200 responses are only logged, as in the original.
            System.err.println(response.getStatusLine());
        }
        // After execution the context holds the last request and its target host.
        HttpUriRequest currentReq = (HttpUriRequest) context.getAttribute(ExecutionContext.HTTP_REQUEST);
        HttpHost currentHost = (HttpHost) context.getAttribute(ExecutionContext.HTTP_TARGET_HOST);
        return currentReq.getURI().isAbsolute()
                ? currentReq.getURI().toString()
                : currentHost.toURI() + currentReq.getURI();
    } catch (ClientProtocolException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        // Release the connections held by this one-off client.
        httpClient.getConnectionManager().shutdown();
    }
    // The request failed: return the input so the caller sees no redirect.
    return url;
}
%>
<%
String dx = request.getParameter("dx");
if (dx == null || "".equals(dx)) {
    out.println("dx is empty!");
    return;
}
// Obtain the final address
String url = dx;
out.println("url=" + url);
String finalURL = test1(url);
//out.println("finalURL=" + finalURL);
if (!url.equals(finalURL)) {
    // The string below is the author's placeholder, not a real URL.
    response.sendRedirect("final redirect address");
}
%>
The approach: extend DefaultRedirectHandler and override getLocationURI, the method that resolves the URI to redirect to.
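The rewriting step inside the overridden getLocationURI can be tried on its own, without HttpClient. What follows is a minimal sketch, assuming (as the handler above does) that the redirect target carries the real destination in a url= parameter. The class name LocationRewriter, the method encodeUrlParam, and the sample Location value are hypothetical and are not part of the original post.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

class LocationRewriter {

    // Percent-encodes everything after the first "url=" so that the whole
    // string can be parsed by java.net.URI. Returns the input unchanged
    // when there is no "url=" marker.
    public static String encodeUrlParam(String location) throws UnsupportedEncodingException {
        int idx = location.indexOf("url=");
        if (idx < 0) {
            return location;
        }
        int valueStart = idx + 4; // skip past "url="
        return location.substring(0, valueStart)
                + URLEncoder.encode(location.substring(valueStart), "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        // Invented sample: a relative Location whose url= value contains a space.
        String location = "/go---x?url=http://example.com/a b?q=1";

        try {
            new URI(location);
        } catch (URISyntaxException e) {
            // The raw value is rejected because of the unencoded space.
            System.out.println("raw location rejected: " + e.getReason());
        }

        String rewritten = encodeUrlParam(location);
        System.out.println(rewritten);
        // prints /go---x?url=http%3A%2F%2Fexample.com%2Fa+b%3Fq%3D1

        URI uri = new URI(rewritten); // parses without error
        System.out.println(uri.getPath()); // prints /go---x
    }
}
```

Note that URLEncoder applies form encoding, so a space becomes "+" rather than "%20"; whether the receiving server accepts that is an assumption to verify against the actual proxy.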