
Using CURL to download a remote file from a valid URL in c++

2012-12-28 13:12, 441 views




Introduction

Curl is an open source solution that compiles and runs under a wide variety of operating systems. It's used for transferring files with URL syntax, supporting FTP, FTPS, HTTP, HTTPS, SCP, SFTP, TFTP, TELNET, DICT, LDAP, LDAPS and FILE.

Curl supports SSL certificates, HTTP POST, HTTP PUT, FTP uploading, HTTP form based upload, proxies, cookies, user+password authentication (Basic, Digest, NTLM, Negotiate, kerberos...), file transfer resume, proxy tunneling and a busload of other useful tricks.

This article assumes that you use Ubuntu as your OS and Eclipse as your IDE. To install Curl, open "Synaptic Package Manager" (System → Administration) and install the following packages:



After creating a new Eclipse C++ project, go to "Project properties -> C/C++ Build -> Settings -> Tool Settings tab". Under "GCC C++ Linker", add "curl" to the "Libraries (-l)" list. This is all that is needed to link with libcurl.




Using Curl to Download a remote file from a valid URL

To use libcurl, include "curl/curl.h" in your project. The header must be reachable through the "Includes" section of your Eclipse project (it is probably installed in "/usr/include/curl").

Before calling any function from libcurl you must call the following function:

CURLcode curl_global_init(long flags);


This function sets up the program environment that libcurl needs. Think of it as an extension of the library loader. The flags argument is a bit pattern that tells libcurl exactly which features to initialize. Set the desired bits by ORing the values together.

In normal operation, you must specify CURL_GLOBAL_ALL. Don't use any other value unless you are familiar with it and mean to control internal operations of libcurl.

After you call curl_global_init, you must create a Curl handle by calling CURL *curl_easy_init( ). At the end of your application, or whenever you want to release the handle, call void curl_easy_cleanup(CURL * handle );


After you have created your handle, you must tell libcurl how to behave. By passing the appropriate options to curl_easy_setopt, you can change libcurl's behavior.

All options are set with the option followed by a parameter. That parameter can be a long, a function pointer, an object pointer or a curl_off_t, depending on what the specific option expects. Read the libcurl manual carefully, as bad input values may cause libcurl to behave badly!

You can only set one option in each function call. A typical application uses many curl_easy_setopt() calls in the setup phase.

To perform the file transfer, call CURLcode curl_easy_perform(CURL * handle );


This function is called after the init and all the curl_easy_setopt() calls are made, and will perform the transfer as described in the options.

You must never call this function simultaneously from two places using the same handle. Let the function return first before invoking it another time. If you want parallel transfers, you must use several curl handles.


The source code

The following source code can be used to download a file from a valid URL. Optionally, it can also retrieve the server headers.

In our application we must define the following structure:

    #include <string>

    struct DATA
    {
        std::string* pstr;   // buffer that receives the downloaded bytes
        bool bGrab;          // false: send the request but do not keep the content
    };

std::string* pstr
- used as the buffer. The URL content will be stored in this member.

bool bGrab
- indicates whether we want to grab the content or just send a request to the server without downloading the content.

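    // Write callback shared by the body and the header transfers.
    // stream points to the DATA instance registered for that transfer.
    static size_t writefunction( void *ptr , size_t size , size_t nmemb , void *stream )
    {
        DATA* data = static_cast<DATA*>(stream);

        // Not grabbing: return a size that cannot match, so libcurl aborts the transfer.
        if ( !data->bGrab )
            return static_cast<size_t>(-1);

        if ( size * nmemb )
            data->pstr->append(static_cast<const char*>(ptr), size * nmemb);

        return nmemb * size;
    }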

This is a callback function that libcurl calls as soon as it has received data that needs to be saved. The size of the data pointed to by ptr is size multiplied by nmemb; the data is not zero terminated. Return the number of bytes actually taken care of. If that amount differs from the amount passed to your function, this signals an error to the library, which aborts the transfer and returns CURLE_WRITE_ERROR.
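The return value contract can be exercised without any network access, because the callback is an ordinary function. The sketch below is a variation on the callback above that refuses to store more than a fixed number of bytes; CappedBuffer and capped_write are hypothetical names, and the size limit is an assumption added for the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sink: collects at most `limit` bytes.
struct CappedBuffer {
    std::string bytes;
    std::size_t limit;
};

// Same signature that libcurl expects for CURLOPT_WRITEFUNCTION.
static std::size_t capped_write(void* ptr, std::size_t size,
                                std::size_t nmemb, void* userdata) {
    CappedBuffer* sink = static_cast<CappedBuffer*>(userdata);
    const std::size_t total = size * nmemb;

    // Returning a value different from `total` tells libcurl to abort.
    if (sink->bytes.size() + total > sink->limit)
        return 0;

    // The data is not zero terminated, so always pass the length.
    sink->bytes.append(static_cast<const char*>(ptr), total);
    return total;
}
```

Calling capped_write by hand with a few small buffers is an easy way to unit test the logic before registering it with a handle.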

The callback function is passed as much data as possible on each invocation, but you cannot make any assumptions about the amount: it may be one byte, it may be thousands. The maximum amount of data that can be passed to the write callback is defined in the curl.h header file as CURL_MAX_WRITE_SIZE.

    static bool DownloadURLContent( std::string strUrl , std::string & strContent,
                                    std::string &headers, bool grabHeaders = true,
                                    bool grabUrl = true )
    {
        CURL *curl_handle;
        DATA data = { &strContent, grabUrl };
        DATA headers_data = { &headers , grabHeaders };

        // Declared before the first goto so that no jump skips its initialization.
        char stdError[CURL_ERROR_SIZE] = { '\0' };

        if ( curl_global_init(CURL_GLOBAL_ALL) != CURLE_OK )
            return false;

        if ( (curl_handle = curl_easy_init()) == NULL )
        {
            curl_global_cleanup();
            return false;
        }

    #if 0
        // enable only for debugging
        if ( curl_easy_setopt(curl_handle, CURLOPT_VERBOSE, 1L) != CURLE_OK )
            goto clean_up;
        if ( curl_easy_setopt(curl_handle, CURLOPT_STDERR, stdout) != CURLE_OK )
            goto clean_up;
    #endif

        if ( curl_easy_setopt(curl_handle, CURLOPT_ERRORBUFFER , stdError) != CURLE_OK )
            goto clean_up;

        if ( curl_easy_setopt(curl_handle, CURLOPT_URL, strUrl.c_str()) != CURLE_OK )
            goto clean_up;

        if ( curl_easy_setopt(curl_handle, CURLOPT_WRITEFUNCTION, writefunction) != CURLE_OK )
            goto clean_up;

        if ( grabHeaders )
        {
            if ( curl_easy_setopt(curl_handle, CURLOPT_HEADERFUNCTION, writefunction) != CURLE_OK )
                goto clean_up;

            // CURLOPT_WRITEHEADER is the older name of CURLOPT_HEADERDATA.
            if ( curl_easy_setopt(curl_handle, CURLOPT_WRITEHEADER, (void *)&headers_data) != CURLE_OK )
                goto clean_up;
        }

        if ( curl_easy_setopt(curl_handle, CURLOPT_WRITEDATA, (void *)&data) != CURLE_OK )
            goto clean_up;

        // MY_USR_AGENT must be defined elsewhere in the project; its value is not shown in this article.
        if ( curl_easy_setopt(curl_handle, CURLOPT_USERAGENT, MY_USR_AGENT) != CURLE_OK )
            goto clean_up;

        // When grabUrl is false the callback aborts the transfer on purpose,
        // so a failed perform counts as an error only if the body was requested.
        if ( curl_easy_perform(curl_handle) != CURLE_OK && grabUrl )
            goto clean_up;

        curl_easy_cleanup(curl_handle);
        curl_global_cleanup();
        return true;

    clean_up:
        printf("(%s %d) error: %s\n", __FILE__, __LINE__, stdError);
        curl_easy_cleanup(curl_handle);
        curl_global_cleanup();
        return false;
    }

CURL *curl_handle;
- our curl handle.

DATA data = { &strContent, grabUrl };
- buffer for the URL content and its grabbing flag.

DATA headers_data = {&headers , grabHeaders};
- buffer for the headers and their grabbing flag.

The lines that follow set the curl handle options. For a detailed description of each option, see the libcurl documentation.

Here is a short overview of the important curl options used in this article.

curl_easy_setopt(curl_handle, CURLOPT_URL, strUrl.c_str())
- sets the URL that will be grabbed.

curl_easy_setopt(curl_handle, CURLOPT_WRITEFUNCTION, writefunction)
- sets the address of the callback function that will be used to retrieve the URL body.

curl_easy_setopt(curl_handle, CURLOPT_WRITEDATA, (void *)&data)
- sets the buffer where the callback function will store the body.

curl_easy_setopt(curl_handle, CURLOPT_HEADERFUNCTION, writefunction)
- sets the address of the callback function that will be used to retrieve the headers. This is the same function used to grab the URL body; only the storage buffer changes, through the CURLOPT_WRITEHEADER option.

curl_easy_setopt(curl_handle, CURLOPT_WRITEHEADER, (void *)&headers_data)
- sets the buffer where the callback function will store the headers.
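With these options the headers end up in one string, one header line after another, each terminated by CRLF. If individual fields are needed, that string can be split afterwards. The following is a minimal sketch of such a step; parse_headers is a hypothetical helper, it assumes simple "Name: value" lines, and it keeps only the last value when a name repeats.

```cpp
#include <map>
#include <sstream>
#include <string>

// Splits "Name: value\r\n" lines into a map. Lines without a colon,
// such as the status line and the blank separator line, are skipped.
inline std::map<std::string, std::string> parse_headers(const std::string& raw) {
    std::map<std::string, std::string> fields;
    std::istringstream in(raw);
    std::string line;

    while (std::getline(in, line)) {
        // getline strips '\n'; drop the remaining '\r' of the CRLF pair.
        if (!line.empty() && line.back() == '\r') line.pop_back();

        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos) continue;

        // Trim leading spaces and tabs from the value.
        std::string value = line.substr(colon + 1);
        const std::string::size_type first = value.find_first_not_of(" \t");
        value = (first == std::string::npos) ? std::string() : value.substr(first);

        fields[line.substr(0, colon)] = value;
    }
    return fields;
}
```

HTTP header names are case-insensitive, while this map compares them exactly as received, so a real application may want to normalize the case before lookup.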


Using this code

    int main(void)
    {
        std::string content;
        std::string headers;

        if ( !CUrlGrabber::DownloadURLContent("http://www.intelliproject.net" , content , headers) )
            return 1;

        printf("Headers : %s \n", headers.c_str());

        // "wb" keeps the downloaded bytes unchanged on every platform
        FILE *fp = fopen("out.html", "wb");
        if ( !fp )
        {
            printf("Could not open file: out.html!\n");
            return 1;
        }

        fwrite(content.c_str(), sizeof(char) , content.length(), fp);
        fclose(fp);

        return 0;   // 0 reports success to the calling process
    }

As you can see, the application above saves the URL body to a file (out.html) and prints the headers to the console.
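The saving step can also be written with C++ streams instead of the C file API. The sketch below is one possible variant; save_to_file is a name invented for this example. It opens the file in binary mode so that the downloaded bytes, including any embedded zero bytes, are written unchanged.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes `content` to `path`, replacing any
// existing file. Returns false if the file cannot be opened or written.
inline bool save_to_file(const std::string& path, const std::string& content) {
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out) return false;

    // Use data() and size() so that embedded '\0' bytes are preserved.
    out.write(content.data(), static_cast<std::streamsize>(content.size()));
    out.flush();
    return out.good();
}
```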



The main purpose of cURL is to automate unattended file transfers or sequences of operations. For example, it is a good tool for simulating a user's actions in a web browser.