
Reading a stream with HttpWebRequest and downloading files with WebClient

2017-07-12 11:08

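Yesterday, while writing a scraper, I fell into a trap: when reading a network stream I called stream.Length to get the length of the stream, and the call threw an exception. The documentation explains why: some streams cannot report how much data they hold (a stream that does not support seeking throws NotSupportedException from Length), so the length cannot be read directly. Anyone who works with streams regularly would avoid this mistake. The fix is simply to read in a do-while loop until no data is left, as follows: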

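byte[] buffer = new byte[1024];   // allocate the buffer once, outside the loop
int i;
do
{
    // Read may return fewer bytes than requested; it returns 0 only at the end of the stream.
    i = stream.Read(buffer, 0, buffer.Length);

    // Write only the i bytes that were actually read.
    fs.Write(buffer, 0, i);
} while (i > 0);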

The condition after while must be i > 0; it cannot be written as i >= 1024. The reason is given in the MSDN documentation for Stream.Read:


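Read returns 0 only when there is no more data in the stream and no more is expected (such as a closed socket or end of file). An implementation is free to return fewer bytes than requested even if the end of the stream has not been reached.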
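To make the difference between the two loop conditions concrete, here is a minimal sketch. TrickleStream, ReadLoopDemo and the 100-byte chunk size are hypothetical names and values invented for this illustration, not part of the original code; the wrapper merely imitates a network stream that delivers data in small pieces and cannot report its length.

```csharp
using System;
using System.IO;

// Hypothetical wrapper: returns at most 100 bytes per Read call,
// imitating a network stream that hands out data in small chunks.
class TrickleStream : Stream
{
    private readonly Stream inner;
    public TrickleStream(Stream inner) { this.inner = inner; }

    public override int Read(byte[] buffer, int offset, int count)
    {
        return inner.Read(buffer, offset, Math.Min(count, 100));
    }

    // Not seekable, so Length and Position are unsupported.
    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}

static class ReadLoopDemo
{
    // Copies until Read returns 0, the only reliable end-of-stream signal.
    public static long CopyUntilZero(Stream source, Stream target)
    {
        byte[] buffer = new byte[1024];
        long total = 0;
        int n;
        while ((n = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            target.Write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    // Incorrect variant: treats any short read as the end of the stream.
    public static long CopyWhileFull(Stream source, Stream target)
    {
        byte[] buffer = new byte[1024];
        long total = 0;
        int n;
        do
        {
            n = source.Read(buffer, 0, buffer.Length);
            target.Write(buffer, 0, n);
            total += n;
        } while (n >= 1024);
        return total;
    }

    static void Main()
    {
        byte[] data = new byte[3000];
        Console.WriteLine(CopyUntilZero(new TrickleStream(new MemoryStream(data)), new MemoryStream()));
        Console.WriteLine(CopyWhileFull(new TrickleStream(new MemoryStream(data)), new MemoryStream()));
    }
}
```

With 3000 bytes of input, the first call copies all 3000 bytes, while the second stops after the first 100-byte read because 100 is less than 1024. A real response stream may behave the same way whenever the data arrives in pieces.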


Below is short code for fetching data with HttpWebRequest and with WebClient:

HttpWebRequest

// Requires: using System; using System.IO; using System.Net;

/// <summary>
/// Downloads the resource at url with HttpWebRequest and saves it to filePath.
/// </summary>
/// <param name="url">URL to fetch</param>
/// <param name="filePath">Name of the file to save to</param>
/// <param name="oldurl">Referring URL, sent as the Referer header</param>
/// <returns>true on success, false if any exception occurs</returns>
public static bool HttpDown(string url, string filePath, string oldurl)
{
    try
    {
        HttpWebRequest req = WebRequest.Create(url) as HttpWebRequest;

        req.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8";
        req.Referer = oldurl;
        req.UserAgent = "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.154 Safari/537.36";
        req.ContentType = "application/octet-stream";

        // The using blocks close the response, the network stream and the file
        // even when an exception is thrown part-way through.
        using (HttpWebResponse response = req.GetResponse() as HttpWebResponse)
        using (Stream stream = response.GetResponseStream())
        using (FileStream fs = File.Create(filePath))
        {
            byte[] buffer = new byte[1024];
            int i;
            do
            {
                // Read returns 0 only when the response has been fully consumed.
                i = stream.Read(buffer, 0, buffer.Length);
                fs.Write(buffer, 0, i);
            } while (i > 0);
        }

        return true;
    }
    catch (Exception)
    {
        return false;
    }
}
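As an aside, on .NET Framework 4 or later the manual read loop in HttpDown can usually be replaced by Stream.CopyTo, which reads until the source is exhausted. The following is only a sketch under that assumption; CopyToSketch and SaveStream are hypothetical names that do not appear in the original code.

```csharp
using System.IO;

static class CopyToSketch
{
    // Hypothetical helper: writes everything that can be read from source into filePath.
    public static void SaveStream(Stream source, string filePath)
    {
        using (FileStream fs = File.Create(filePath))
        {
            // CopyTo keeps reading until the source reports end of stream,
            // so it never needs source.Length.
            source.CopyTo(fs);
        }
    }
}
```

Inside HttpDown this could be called as CopyToSketch.SaveStream(stream, filePath) in place of the File.Create call and the do-while loop.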

WebClient

// Requires: using System; using System.Net;
public static bool Down(string url, string desc, string oldurl)
{
    try
    {
        // WebClient is IDisposable, so release it once the download has finished.
        using (WebClient wc = new WebClient())
        {
            wc.Headers.Add(HttpRequestHeader.Accept, "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
            wc.Headers.Add(HttpRequestHeader.Referer, oldurl);
            wc.Headers.Add(HttpRequestHeader.UserAgent, "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.154 Safari/537.36");
            wc.Headers.Add(HttpRequestHeader.ContentType, "application/octet-stream");

            // DownloadFile fetches the resource and writes it to the local file desc.
            wc.DownloadFile(new Uri(url), desc);
        }

        Console.WriteLine(url);
        Console.WriteLine("    " + desc + "   yes!");
        return true;
    }
    catch (Exception)
    {
        return false;
    }
}