
Sending HTTP requests with PHP

2015-08-18 07:28

1. HTTP

HTTP message:

Start line

Headers

Entity body

Request message:

<method> <request-URL> <version>

<headers>

<entity-body>

Response message:

<version> <status> <reason-phrase>

<headers>

<entity-body>
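As a minimal sketch of the request layout above, the helper below joins a start line, the headers, and an optional entity body using CRLF line endings. The function name `build_request_message` and the host `www.example.com` are assumptions made for illustration, and the code assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: assemble a raw HTTP request message from its three parts.
function build_request_message(string $method, string $url, array $headers, string $body = ''): string
{
    // Start line: <method> <request-URL> <version>
    $message = "$method $url HTTP/1.1\r\n";
    // Headers: one "Name: value" pair per line
    foreach ($headers as $name => $value) {
        $message .= "$name: $value\r\n";
    }
    // A blank line separates the headers from the entity body
    $message .= "\r\n";
    return $message . $body;
}

$message = build_request_message('GET', '/index.html', ['Host' => 'www.example.com']);
echo $message;
```

A string built this way is the kind of text the fsockopen-based methods later in this article write to the socket.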

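2. HTTP requests: curl, file_get_contents, fopen, fsockopen

Sockets are for general-purpose communication, for example talking to a mail server.

curl emulates most of what a browser can do and is somewhat more efficient; it caches DNS lookups.

file_get_contents fetches a file.

file_get_contents: GET or POST (with stream_context_create)

fopen: GET (with fgets)

fsockopen: GET or POST, with fgets

curl: GET or POST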

Method 1: fetch content with file_get_contents using GET

<?php
$url = 'http://www.domain.com/';
$html = file_get_contents($url);
echo $html;
?>

Method 3: send a POST request to a URL with file_get_contents

<?php
$data = array('foo' => 'bar');
$data = http_build_query($data); // URL-encode the form fields
$opts = array(
    'http' => array(
        'method'  => 'POST',
        'header'  => "Content-type: application/x-www-form-urlencoded\r\n" .
                     "Content-Length: " . strlen($data) . "\r\n",
        'content' => $data
    )
);
$context = stream_context_create($opts);
$html = file_get_contents('http://localhost/e/admin/test.html', false, $context);
echo $html;
?>

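Method 2: open a URL with fopen and fetch content using GET

<?php
$url = 'http://www.domain.com/';
$fp = fopen($url, 'r'); // requires allow_url_fopen to be enabled
if ($fp === false) {
    die("failed to open $url");
}
// For the http wrapper, $meta['wrapper_data'] holds the response headers
$meta = stream_get_meta_data($fp);
$result = '';
while (!feof($fp)) {
    $result .= fgets($fp, 1024);
}
echo "url body: $result";
fclose($fp);
?>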

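Method 4: open a URL with fsockopen and fetch the complete response, headers and body, using GET

<?php
// Fetch a URL over a raw socket; returns the full response (headers and body) or false.
// Note: an HTTP/1.1 server may answer with Transfer-Encoding: chunked, which this code does not decode.
function get_url($url, $cookie = false)
{
    $url = parse_url($url);
    $path = isset($url['path']) ? $url['path'] : '/';
    $query = isset($url['query']) ? $path . '?' . $url['query'] : $path;
    echo "Query:" . $query;
    $port = isset($url['port']) ? $url['port'] : 80;
    $fp = fsockopen($url['host'], $port, $errno, $errstr, 30);
    if (!$fp) {
        return false;
    }
    $request = "GET $query HTTP/1.1\r\n";
    $request .= "Host: {$url['host']}\r\n";
    $request .= "Connection: Close\r\n";
    if ($cookie) {
        $request .= "Cookie: $cookie\r\n";
    }
    $request .= "\r\n";
    fwrite($fp, $request);
    $result = '';
    while (!feof($fp)) {
        $result .= fgets($fp, 1024);
    }
    fclose($fp);
    return $result;
}

// Return only the HTML part of the response, with the headers removed
function GetUrlHTML($url, $cookie = false)
{
    $rowdata = get_url($url, $cookie);
    if ($rowdata) {
        // The body starts after the first blank line (CRLF CRLF)
        $body = strstr($rowdata, "\r\n\r\n");
        $body = substr($body, 4);
        return $body;
    }
    return false;
}
?>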
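GetUrlHTML above drops the headers by cutting the response at the first blank line. As a rough sketch of the same idea taken one step further, the helper below also keeps the status line and turns the header lines into an array. The name `parse_http_response` and the sample response text are made up for illustration; the code assumes PHP 7 or later and an unchunked response.

```php
<?php
// Hypothetical helper: split a raw HTTP response into status line, headers, and body.
function parse_http_response(string $raw): array
{
    // The headers end at the first blank line (CRLF CRLF)
    $parts = explode("\r\n\r\n", $raw, 2);
    $headerLines = explode("\r\n", $parts[0]);
    $statusLine = array_shift($headerLines);
    $headers = [];
    foreach ($headerLines as $line) {
        // Each header line has the form "Name: value"
        $pair = explode(':', $line, 2);
        if (count($pair) === 2) {
            $headers[strtolower(trim($pair[0]))] = trim($pair[1]);
        }
    }
    return [
        'status'  => $statusLine,
        'headers' => $headers,
        'body'    => isset($parts[1]) ? $parts[1] : '',
    ];
}

// Made-up response text, used only for illustration
$raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello";
$response = parse_http_response($raw);
echo $response['status'], "\n";
echo $response['headers']['content-type'], "\n";
echo $response['body'], "\n";
```

Header names are lowercased because HTTP header names are case-insensitive. A header that appears more than once, such as Set-Cookie, keeps only its last value in this sketch.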

Method 5: open a URL with fsockopen and fetch the complete response, headers and body, using POST

<?php
// POST $data (an associative array) to $URL over a raw socket; returns headers and body, or false.
function HTTP_Post($URL, $data, $cookie, $referrer = "")
{
    // parsing the given URL
    $URL_Info = parse_url($URL);
    // Building referrer
    if ($referrer == "") { // if not given, fall back to a placeholder value
        $referrer = "111";
    }
    // making string from $data
    $values = array();
    foreach ($data as $key => $value) {
        $values[] = "$key=" . urlencode($value);
    }
    $data_string = implode("&", $values);
    // Find out which port is needed; if not given use standard (=80)
    if (!isset($URL_Info["port"])) {
        $URL_Info["port"] = 80;
    }
    // building POST-request:
    $request  = "POST " . $URL_Info["path"] . " HTTP/1.1\r\n";
    $request .= "Host: " . $URL_Info["host"] . "\r\n";
    $request .= "Referer: $referrer\r\n";
    $request .= "Content-type: application/x-www-form-urlencoded\r\n";
    $request .= "Content-length: " . strlen($data_string) . "\r\n";
    $request .= "Connection: close\r\n";
    $request .= "Cookie: $cookie\r\n";
    $request .= "\r\n";
    $request .= $data_string;
    $fp = fsockopen($URL_Info["host"], $URL_Info["port"]);
    if (!$fp) {
        return false;
    }
    fputs($fp, $request);
    $result = '';
    while (!feof($fp)) {
        $result .= fgets($fp, 1024);
    }
    fclose($fp);
    return $result;
}
?>

Method 6: use curl for GET or POST

<?php
function MyCurl($url, $postfield = '', $proxy = '', $timeout = 3, $format = 0, $host = '')
{
    $proxy = trim($proxy);
    $user_agent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)';
    $ch = curl_init(); // initialize the curl handle
    if (!empty($proxy)) {
        curl_setopt($ch, CURLOPT_PROXY, $proxy); // set the proxy server
    }
    // Certificate checks are switched off here, which leaves HTTPS connections unverified
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_URL, $url); // set the request URL
    //curl_setopt($ch, CURLOPT_FAILONERROR, 1); // fail on HTTP status codes >= 400; by default the code is ignored and the page is returned
    //curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow any "Location:" header the server returns, recursively
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the curl_exec() result as a string instead of printing it
    if ($postfield) {
        curl_setopt($ch, CURLOPT_POST, 1); // send a POST request
        curl_setopt($ch, CURLOPT_POSTFIELDS, $postfield); // the POST data
    }
    //curl_setopt($ch, CURLOPT_PORT, 80); // set the port
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout); // overall timeout in seconds
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // connection timeout in seconds
    curl_setopt($ch, CURLOPT_USERAGENT, $user_agent); // the User-Agent: request header
    //curl_setopt($ch, CURLOPT_HEADER, 1); // include the response headers in the output
    //$fp = fopen('example_homepage.txt', 'w'); // output file
    //curl_setopt($ch, CURLOPT_FILE, $fp); // where to write the output; a stream resource, STDOUT (the browser) by default
    $http_header = array(
        'Accept-Language: zh-cn',
        'Connection: Keep-Alive',
        'Cache-Control: no-cache'
    );
    if ($host != false) {
        $http_header[] = 'Host: ' . $host;
    }
    curl_setopt($ch, CURLOPT_HTTPHEADER, $http_header); // set the HTTP request headers
    $data = curl_exec($ch); // run the request
    $info = curl_getinfo($ch); // details about the transfer
    $httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $errorNo = curl_errno($ch);
    $errorMsg = curl_error($ch);
    curl_close($ch);
    if (0 == $format) {
        return $data;
    } elseif (1 == $format) {
        return array(
            'data'     => $data,
            'info'     => $info,
            'errorNo'  => $errorNo,
            'errorMsg' => $errorMsg,
            'httpCode' => $httpCode
        );
    }
    // Any other $format: return the body when the status is not 200, and the status code when it is
    if ($httpCode != '200') {
        return $data;
    } else {
        return $httpCode;
    }
}
?>

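3. header

Using header()

The header() function sends a raw HTTP header to the client.

A header is a string the server sends over HTTP before it passes the HTML data to the browser; a blank line must separate the headers from the HTML document. For a detailed description of HTTP, see the official RFC 2068 document (http://www.w3.org/Protocols/rfc2068/rfc2068).

In PHP, all headers must be sent before any HTML data is returned.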

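Examples

Example 1: redirect the browser to the official PHP website.

<?php
header("Location: http://www.php.net");
exit; // always call exit after a redirect so the script does not keep running after the header is sent
?>

<?php
header("refresh:3;url=http://axgle.za.net");
print('Loading, please wait...<br>You will be redirected in three seconds.');
// A header redirect is equivalent to typing the URL into the address bar on the user's behalf.
?>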

Example 2: prevent the page from being cached in IE

To make sure users always get the latest data rather than a copy from a proxy or cache, send the following headers:

<?php
header('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
header('Last-Modified: ' . gmdate('D, d M Y H:i:s') . ' GMT');
header('Cache-Control: no-store, no-cache, must-revalidate');
header('Cache-Control: post-check=0, pre-check=0', false);
header('Pragma: no-cache'); // for compatibility with HTTP/1.0 and HTTPS
?>

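Cache-Control = no-cache

Pragma = no-cache

Expires = -1

Expires is a handy header. If a page on the server changes often, set it to -1 so the page expires immediately. If a page is updated at 1 a.m. every day, set Expires to 1 a.m. the next day.

When an HTTP/1.1 server specifies Cache-Control: no-cache, the browser will not reuse its cached copy of the page without checking with the server first (no-store is the directive that forbids storing the response at all).

Legacy HTTP/1.0 servers cannot use the Cache-Control header. For backward compatibility with HTTP/1.0 servers, IE gives the Pragma: no-cache header special support.

If the client talks to the server over a secure connection (https://) and the server returns a Pragma: no-cache header in the response, Internet Explorer does not cache the response.

Note: Pragma: no-cache prevents caching only when it is used over a secure connection. On a non-secure page it is handled like Expires: -1: the page is cached but marked as expired immediately.

http-equiv meta tags:

An HTML page can use http-equiv meta tags to specify HTTP headers. Older versions of IE may not support the HTML meta tag, so it is best to disable caching with real HTTP headers.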
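The Expires advice above (a page refreshed at 1 a.m. every day should expire at the next 1 a.m.) could be sketched as follows. The helper names `http_date` and `next_one_am_utc` are made up for this illustration, and the sketch assumes the 1 a.m. update time is expressed in UTC and that PHP 7 or later is used.

```php
<?php
// Hypothetical helper: format a Unix timestamp as an HTTP date.
function http_date(int $timestamp): string
{
    return gmdate('D, d M Y H:i:s', $timestamp) . ' GMT';
}

// Hypothetical helper: the next 01:00 UTC strictly after $now.
function next_one_am_utc(int $now): int
{
    $today = gmmktime(1, 0, 0, (int) gmdate('n', $now), (int) gmdate('j', $now), (int) gmdate('Y', $now));
    // UTC has no daylight saving time, so adding 86400 seconds is exactly one day
    return $today > $now ? $today : $today + 86400;
}

$expires = 'Expires: ' . http_date(next_one_am_utc(time()));
echo $expires, "\n"; // in a real page, pass the same string to header()
```

If the update time is defined in the server's local time zone instead, the same idea applies with mktime() and date(), at the cost of having to handle daylight saving changes.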

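Example 3: make the user's browser show a "file not found" message.

Many online resources say that PHP's header() function can send a Status header to the browser, for example header("Status: 404 Not Found").

But I found that the response the browser actually received was:

HTTP/1.x 200 OK
Date: Thu, 03 Aug 2006 07:49:11 GMT
Server: Apache/2.0.55 (Win32) PHP/5.0.5
X-Powered-By: PHP/5.0.5
Status: 404 Not Found
Content-Length: 0
Keep-Alive: timeout=15, max=98
Connection: Keep-Alive
Content-Type: text/html

After some digging, the correct form turned out to be:

header("HTTP/1.1 404 Not Found");

The first part is the HTTP protocol version (HTTP-Version), the second is the status code (Status), and the third is the reason phrase (Reason-Phrase).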
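To make the three-part structure concrete, here is a small sketch that builds a status line from a numeric code. The function name `status_line` is an assumption made for illustration, and the lookup table covers only the status codes listed in this article. It assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: build "<version> <status> <reason-phrase>" from a status code.
function status_line(int $code, string $version = 'HTTP/1.1'): string
{
    // Reason phrases for the status codes listed in this article
    $reasons = [
        200 => 'OK',
        301 => 'Moved Permanently',
        302 => 'Found',
        304 => 'Not Modified',
        401 => 'Unauthorized',
        403 => 'Forbidden',
        404 => 'Not Found',
        502 => 'Bad Gateway',
        504 => 'Gateway Timeout',
    ];
    if (!isset($reasons[$code])) {
        throw new InvalidArgumentException("Unknown status code: $code");
    }
    return "$version $code {$reasons[$code]}";
}

echo status_line(404), "\n";
```

In a page, the result would be passed to header(), as in header(status_line(404)). Since PHP 5.4 the built-in http_response_code(404) sets the status without writing the line by hand.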

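Example 4: let the user download a file (hiding the file's location)

An ordinary HTML link is enough for a normal file download. To keep a file private, though, you cannot hand out its link; the header() function can deliver the download instead.

<?php
header("Content-type: application/x-gzip");
header("Content-Disposition: attachment; filename=FILENAME"); // FILENAME stands for the name offered to the user
header("Content-Description: PHP3 Generated Data");
// then output the file contents
?>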

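Example 5: output before the header() call

In general no HTML may be output before header() is called. The same goes for setcookie() and the session functions, because they all need to add headers to the output stream. If an echo or similar statement runs before header(), the later header() call fails with "Warning: Cannot modify header information - headers already sent by ...". In other words, no text, blank lines, or line breaks may come before these functions, and it is best to call exit() after header(). The following incorrect example has a blank line between two PHP code blocks:

<?php
//some code here
?>

<?php
header("HTTP/1.1 403 Forbidden");
exit();
?>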

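The reason: when a PHP script starts running, it can send both HTTP header information and body information. Headers (from header() or setcookie()) are not sent immediately; they are saved in a list. This lets you modify headers, including the default ones such as Content-Type. But once the script sends any non-header output (for example HTML or a print() call), PHP must first send all the headers and end the HTTP header section, and then go on to send the body data. From that point on, any attempt to add or modify headers is not allowed and triggers the error message above.

Solution:

Enable output buffering in php.ini (output_buffering), or use the buffering functions ob_start(), ob_end_flush(), and so on in the program. How it works: with output_buffering enabled, PHP does not send the HTTP headers when the script produces output. Instead it pipes the output into a dynamically growing buffer (available from PHP 4.0 on, which has a centralized output mechanism). You can still modify or add headers or set cookies, because the headers have not actually been sent. When the whole script finishes, PHP automatically sends the HTTP headers to the browser and then the contents of the output buffer.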
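The buffering approach described above can be sketched with ob_start() and ob_get_clean(). The helper name `capture_output` is made up for this illustration; the sketch assumes PHP 7 or later and only shows that buffered output is held back until the script decides to send it.

```php
<?php
// Hypothetical helper: run $render with output buffering on and return what it printed.
function capture_output(callable $render): string
{
    ob_start();            // collect output instead of sending it
    $render();
    return ob_get_clean(); // return the buffer contents and turn this buffer off
}

$html = capture_output(function () {
    echo "<p>some early output</p>";
});
// The echo above went into the buffer rather than to the client,
// so this is the point where header() or setcookie() calls would still be accepted.
echo $html;
```

Calling ob_start() once at the top of a script has a similar effect for the whole page: the output is kept back until the buffer is flushed or the script ends.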

=================================================================

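Examples from the PHP manual

1: You can use the header() function to force the browser to fetch fresh content (no caching).

You can also append a unique number to the URL so that fresh content is read every time and the cache is bypassed.

example:

<?php
print "<img src='yourfile.jpg'>"; // usually served from the cache
?>

<?php
print "<img src='yourfile.jpg?" . time() . "'>"; // the unique number makes the browser request the file again
//print "<img src='yourfile.jpg?" . rand(100, 999) . "'>";
?>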
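The unique-number trick above can be wrapped in a small helper. This is only a sketch: the function name `cache_busted_url` and the parameter name `v` are assumptions made for illustration, and the code assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: append a version parameter so the browser treats the URL as new.
function cache_busted_url(string $url, $version): string
{
    // Use "&" when the URL already carries a query string, "?" otherwise
    $separator = strpos($url, '?') === false ? '?' : '&';
    return $url . $separator . 'v=' . rawurlencode((string) $version);
}

echo '<img src="' . htmlspecialchars(cache_busted_url('yourfile.jpg', time())) . '">', "\n";
```

Passing time() forces a new request on every page view. Passing the file's modification time, for example filemtime('yourfile.jpg'), would change the URL only when the file itself changes, so the browser cache stays useful in between.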

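2: The following handy function sends an image to the browser for display.

<?php
function PE_img_by_path($PE_imgpath = "")
{
    if (file_exists($PE_imgpath)) {
        $PE_imgarray = pathinfo($PE_imgpath);
        $iconcontent = file_get_contents($PE_imgpath);
        // The image subtype comes from the file extension; "jpg" files use the MIME type image/jpeg
        $PE_imgtype = strtolower($PE_imgarray["extension"]);
        if ($PE_imgtype == "jpg") {
            $PE_imgtype = "jpeg";
        }
        header("Content-type: image/" . $PE_imgtype);
        header('Content-length: ' . strlen($iconcontent));
        echo $iconcontent;
        die(0);
    }
    return false;
}
?>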

More examples:

<?php
// Each snippet below is independent; they are not meant to run together as one script.

// ok
header('HTTP/1.1 200 OK');

// send a 404 status:
header('HTTP/1.1 404 Not Found');

// the address has moved permanently
header('HTTP/1.1 301 Moved Permanently');

// redirect to a new address
header('Location: http://www.example.org/');

// delayed redirect:
header('Refresh: 10; url=http://www.example.org/');
print 'You will be redirected in 10 seconds';

// the same can of course be done in HTML
// <meta http-equiv="refresh" content="10;url=http://www.example.org/" />

// override X-Powered-By: PHP:
header('X-Powered-By: PHP/4.4.0');
header('X-Powered-By: Brain/0.6b');

// document language
header('Content-language: en');

// tell the browser when the document was last modified
$time = time() - 60; // or filemtime($fn), etc
header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $time) . ' GMT');

// tell the browser the document has not changed
header('HTTP/1.1 304 Not Modified');

// set the content length
header('Content-Length: 1234');

// mark the response as a download
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="example.zip"');
header('Content-Transfer-Encoding: binary');
// load the file to send:
readfile('example.zip');

// disable caching for the current document
header('Cache-Control: no-cache, no-store, max-age=0, must-revalidate');
header('Expires: Mon, 26 Jul 1997 05:00:00 GMT'); // Date in the past
header('Pragma: no-cache');

// set the content type:
header('Content-Type: text/html; charset=iso-8859-1');
header('Content-Type: text/html; charset=utf-8');
header('Content-Type: text/plain'); // plain text
header('Content-Type: image/jpeg'); // JPG image
header('Content-Type: application/zip'); // ZIP file
header('Content-Type: application/pdf'); // PDF file
header('Content-Type: audio/mpeg'); // audio file
header('Content-Type: application/x-shockwave-flash'); // Flash animation

// show a login dialog
header('HTTP/1.1 401 Unauthorized');
header('WWW-Authenticate: Basic realm="Top Secret"');
print 'Text that will be displayed if the user hits cancel or ';
print 'enters wrong login data';
?>


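HTTP message details:

-- Start line

<method>: GET POST PUT HEAD DELETE TRACE OPTIONS

<request-URL>: the URL

<version>: the major and minor numbers are compared separately, so 2.22 > 2.3 because 22 > 3

<status>: 200 301 302 304 401 403 404 502 504

<reason-phrase>: OK, Moved Permanently, Found, Not Modified, Unauthorized, Forbidden, Not Found, Bad Gateway, Gateway Timeout (in the same order as the status codes above)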
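The version rule above (2.22 is higher than 2.3 because 22 is greater than 3) can be checked with a short comparison function. This is a sketch only: the name `compare_http_versions` is made up for illustration, and the code assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: compare two HTTP versions; returns -1, 0, or 1.
function compare_http_versions(string $a, string $b): int
{
    // Split "HTTP/1.1" or "1.1" into integer major and minor parts
    $parse = function (string $v): array {
        $v = preg_replace('#^HTTP/#i', '', $v);
        $parts = explode('.', $v, 2);
        return [(int) $parts[0], isset($parts[1]) ? (int) $parts[1] : 0];
    };
    [$aMajor, $aMinor] = $parse($a);
    [$bMajor, $bMinor] = $parse($b);
    // Compare the major numbers first, then the minor numbers, both as integers
    if ($aMajor !== $bMajor) {
        return $aMajor <=> $bMajor;
    }
    return $aMinor <=> $bMinor;
}

var_dump(compare_http_versions('HTTP/2.22', 'HTTP/2.3'));
```

Comparing the two versions as decimal numbers would give the opposite answer, since 2.22 is less than 2.3 as a decimal, which is why the parts are compared as integers.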

-- Headers

<headers>

General: informational (Connection Date MIME-Version Trailer Transfer-Encoding Upgrade Via) + caching (Cache-Control Pragma)

Request:

+ informational (Client-IP From Host Referer UA-Color UA-CPU UA-Disp UA-OS UA-Pixels User-Agent)

+ Accept (Accept Accept-Charset Accept-Encoding Accept-Language TE)

+ conditional (Expect If-Match If-Modified-Since If-Range If-Unmodified-Since Range)

+ security (Authorization Cookie Cookie2)

+ proxy (Max-Forwards Proxy-Authorization Proxy-Connection)

Response:

+ informational (Age Public Retry-After Server Title Warning)

+ negotiation (Accept-Ranges Vary)

+ security (Proxy-Authenticate Set-Cookie Set-Cookie2 WWW-Authenticate)

+ entity (Allow Location)

+ content (Content-Base Content-Encoding Content-Language Content-Length Content-Location Content-MD5 Content-Range Content-Type)

+ entity caching (ETag Expires Last-Modified)

+ extension headers

-- Entity: digital data of any kind, such as images, video, HTML documents, software applications, credit card transactions, and e-mail

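Case study: Baidu

1. Request message:

GET / HTTP/1.1 // the request method is GET and the HTTP version is 1.1

Host: www.baidu.com // the host being requested is www.baidu.com

User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:19.0) Gecko/20100101 Firefox/19.0 // the user agent, i.e. the browser, with detailed information about it

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 // media types the client accepts from the server; text/html is an HTML document, see the documentation for the rest

Accept-Language: zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3 // languages the client accepts from the server; zh-cn is Chinese, see the documentation for the rest

Accept-Encoding: gzip, deflate // content encodings the client accepts, such as gzip; the browser cannot decode a response sent in an encoding it does not support

Cookie: BAIDUID=AF6C346B14E94898933E5F858C63F889:FG=1; BDREFER=%7Burl%3A%22http%3A//news.baidu.com/%22%2Cword%3A%22%22%7D; H_PS_PSSID=2097_1464_2133_1944_1788 // cookies are information the server stores on the client; every request sends the cookies the server has stored on the client back to the server

Connection: keep-alive // keep the connection open

Cache-Control: max-age=0 // caching directive sent with the message: when max-age > 0 the resource is taken straight from the browser cache; when max-age <= 0 the browser asks the server whether the resource has changed, and gets 200 if it has or 304 if it has not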
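The q values in the Accept and Accept-Language headers above rank the client's preferences, with 1 as the default when q is omitted. A minimal sketch of reading such a header on the server side follows; the function name `parse_quality_list` is an assumption made for illustration, and the code assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: parse a header such as Accept-Language into [value => q], best first.
function parse_quality_list(string $header): array
{
    $result = [];
    foreach (explode(',', $header) as $item) {
        $params = explode(';', trim($item));
        $value = trim(array_shift($params));
        if ($value === '') {
            continue;
        }
        $q = 1.0; // q defaults to 1 when it is omitted
        foreach ($params as $param) {
            $param = trim($param);
            if (stripos($param, 'q=') === 0) {
                $q = (float) substr($param, 2);
            }
        }
        $result[$value] = $q;
    }
    arsort($result); // highest preference first
    return $result;
}

$langs = parse_quality_list('zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3');
print_r($langs);
```

For the Accept-Language value from the Baidu request, the most preferred language is zh-cn, followed by zh, en-us, and en.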

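2. Response message:

HTTP/1.1 200 OK // HTTP version 1.1, status code 200, reason phrase OK

Date: Tue, 02 Apr 2013 04:27:50 GMT // date and time of the response

Server: BWS/1.0 // name and version of the server software, BWS/1.0

Content-Length: 4271 // the response body is 4271 bytes long

Content-Type: text/html;charset=utf-8 // the response is an HTML document encoded in utf-8

Cache-Control: private // caching directive

Expires: Tue, 02 Apr 2013 04:27:50 GMT // date and time after which the entity is no longer valid and must be fetched again from the origin

Content-Encoding: gzip // the body is encoded with gzip

Set-Cookie: H_PS_PSSID=2097_1464_2133_1944_1788; path=/; domain=.baidu.com // sets a cookie; path and domain are cookie attributes (its scope and so on)

Connection: Keep-Alive // the connection is kept open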
