Scraping Web Page Information with PHP
2017-06-14 14:39
<?php
/*
// Simpler alternative: fetch with file_get_contents() and convert GBK to UTF-8.
header("content-type:text/html;charset=utf-8");
set_time_limit(0);
$url = "http://china.lottedfs.com/handler/ProductDetail-Start?productId=10000039734";
$str = file_get_contents($url);
$str = mb_convert_encoding($str, "utf-8", "GBK");
print_r($str);
*/

// Fetch a page with cURL, keeping cookies between requests.
// Returns the response body as a string, or false if the request failed.
function getPage($url)
{
    $useragent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.89 Safari/537.36';
    $timeout = 120;

    // One cookie file per client; create the directory if it does not exist yet.
    $dir = dirname(__FILE__) . '/cookies';
    if (!is_dir($dir)) {
        mkdir($dir, 0755, true);
    }
    // REMOTE_ADDR is not set when the script runs from the command line.
    $client = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : 'cli';
    $cookie_file = $dir . '/' . md5($client) . '.txt';

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);      // treat HTTP status >= 400 as an error
    curl_setopt($ch, CURLOPT_HEADER, 0);              // do not include response headers in the output
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_ENCODING, "");           // accept every encoding cURL supports
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return the body instead of printing it
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
    curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
    curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com/');

    $content = curl_exec($ch);
    if (curl_errno($ch)) {
        echo 'error:' . curl_error($ch);
        $content = false;
    }
    // Always release the handle, on success and on failure.
    curl_close($ch);
    return $content;
}

$url = "http://china.lottedfs.com/handler/ProductDetail-Start?productId=06042903112&viewCategoryId=5000110003&tracking=";
$getdata = getPage($url);
var_dump($getdata);

// Superseded patterns:
//$preg='/<meta property="rb:itemName" content="+([\s\S]*?)||(.*?)+"\/>/';
//$preg='/\<span[\s]*id\=\"id_product_nm\"\>([\s\S]*?)\<\/span\>/sim';

// Capture the content attribute of <meta property="rb:itemName" content="..." />.
$preg = '/\<meta[\s]*property\=\"rb:itemName\"[\s]*content\=\"(.*?)\" \/\>/sim';
if ($getdata !== false) {
    $str = preg_match($preg, $getdata, $matches);
    print_r($matches);
}
?>
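The regular expression above only matches when the `<meta>` tag is written exactly as expected: same attribute order, same spacing before `/>`. As a hedged alternative, the sketch below reads the same `rb:itemName` property with PHP's built-in `DOMDocument` and `DOMXPath`. The helper name `extractMetaContent` and the sample HTML are hypothetical and not part of the original post. It assumes PHP 7.1 or later with the DOM extension enabled.

```php
<?php
// Hypothetical helper (not from the original post): return the content
// attribute of <meta property="$property">, or null if no such tag exists.
function extractMetaContent(string $html, string $property): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Real pages are rarely valid HTML; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//meta[@property]') as $meta) {
        if ($meta->getAttribute('property') === $property) {
            return $meta->getAttribute('content');
        }
    }
    return null;
}

// Made-up sample input standing in for the page returned by getPage().
$sample = '<html><head><meta property="rb:itemName" content="Sample Item"/></head><body></body></html>';
var_dump(extractMetaContent($sample, 'rb:itemName'));
```

With the script above, the call would presumably look like `extractMetaContent($getdata, 'rb:itemName')`. If the page is served as GBK, it may need to be converted with `mb_convert_encoding()` first, as in the commented-out block at the top of the script, because `loadHTML()` otherwise guesses the encoding.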