
Fetching a web page with httpClient and saving it as an MHT file

2011-08-19 11:11, 363 views

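Looking for help from someone experienced.

I have already fetched the HTML of the Baidu home page with httpClient, and now I want to save it as an MHT file using the approach below. Saving it as an .htm file works fine, but saving it as .mht does not. How can I solve this?

This is urgent.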

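import java.io.IOException;
import org.apache.commons.httpclient.DefaultHttpMethodRetryHandler;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.methods.GetMethod;
import org.apache.commons.httpclient.params.HttpMethodParams;

// Create the HttpClient instance
HttpClient client = new HttpClient();
// Create the GET method instance (no stray spaces inside the URL string)
GetMethod getMethod = new GetMethod("http://www.baidu.com");
// Use the default retry handler provided by the library
getMethod.getParams().setParameter(HttpMethodParams.RETRY_HANDLER, new DefaultHttpMethodRetryHandler());
try {
    // Execute the GET request
    int statusCode = client.executeMethod(getMethod);
    if (statusCode != HttpStatus.SC_OK) {
        System.err.println("Method failed: " + getMethod.getStatusLine());
    }
    // Read the response body
    byte[] responseBody = getMethod.getResponseBody();
    // Print the HTML markup
    // System.out.println(new String(responseBody));
    // Write the page to an .htm file
    // FileOutputStream fos = new FileOutputStream("c:/Users/wenjiao/Desktop/1.htm");
    // fos.write(responseBody);
    // fos.flush();
} catch (IOException e) {
    // HttpException extends IOException, so this covers protocol and transport errors
    e.printStackTrace();
} finally {
    // Always release the connection
    getMethod.releaseConnection();
}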
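
A possible reason the .mht attempt fails: MHT (MHTML) is a MIME multipart/related container, not plain HTML, so writing the raw responseBody bytes to a file named 1.mht is probably not enough. The sketch below wraps fetched HTML bytes in a minimal MIME envelope using only the JDK (Java 8+). The class name MhtSketch, the method toMht, and the boundary string are made up for illustration, the charset is an assumption about the page, and images or stylesheets are not embedded. Whether a particular browser opens the result has not been verified here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of HttpClient: wraps HTML in a minimal MHTML envelope.
class MhtSketch {
    // Arbitrary boundary chosen for this sketch; it must not occur in the encoded body.
    static final String BOUNDARY = "----=_MhtSketch_Part_0";
    private static final String CRLF = "\r\n";

    // html: the fetched page bytes; url: where they came from;
    // charset: the page encoding (an assumption the caller must supply).
    static String toMht(byte[] html, String url, String charset) {
        StringBuilder sb = new StringBuilder();
        // Top-level MIME headers declaring a multipart/related message
        sb.append("MIME-Version: 1.0").append(CRLF);
        sb.append("Content-Type: multipart/related; type=\"text/html\"; boundary=\"")
          .append(BOUNDARY).append("\"").append(CRLF);
        sb.append(CRLF);
        // Single part holding the HTML document, base64-encoded
        sb.append("--").append(BOUNDARY).append(CRLF);
        sb.append("Content-Type: text/html; charset=\"").append(charset).append("\"").append(CRLF);
        sb.append("Content-Transfer-Encoding: base64").append(CRLF);
        sb.append("Content-Location: ").append(url).append(CRLF);
        sb.append(CRLF);
        // The MIME encoder breaks the output into 76-character lines separated by CRLF
        sb.append(Base64.getMimeEncoder().encodeToString(html)).append(CRLF);
        sb.append(CRLF);
        // Closing boundary ends the message
        sb.append("--").append(BOUNDARY).append("--").append(CRLF);
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] html = "<html><body>hello</body></html>".getBytes(StandardCharsets.UTF_8);
        System.out.print(toMht(html, "http://www.baidu.com", "utf-8"));
    }
}
```

With the code from the question, the result could be written out with something like fos.write(MhtSketch.toMht(responseBody, "http://www.baidu.com", "utf-8").getBytes(StandardCharsets.US_ASCII)), since the envelope produced here contains only ASCII characters.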
