
Uploading Files Using CGI and Perl

2008-06-02 15:35

Would you like to give your visitors the ability to upload files to your site? Letting them upload content with their Web browsers can be very useful, and fun too! You can let them contribute pictures, sounds and other binary files to your site. And you can use a file upload facility on your own Website to update your site's content easily via your own Web browser.

If you've ever used a Web-based email service such as Yahoo! Mail or Hotmail, you've probably sent email with attachments. To add attachments to your emails, you simply click the "Browse..." button on the Web page to select the file from your hard drive, and then your browser sends the file to the server. This is file upload in action!

But how does it work? In this article, which has been updated from an earlier version I wrote a few years ago, I'm going to talk you through the process of file upload, and show you how to build a simple file upload example using CGI and Perl -- that's right, Perl! Despite the hype over other scripting languages, Perl is still a powerful and popular choice to power a web site. The example we'll go through will allow people to upload photos of themselves to your Web server.

What You'll Need

To build your own file upload script, you'll need the following:

·     Access to a Web server that supports CGI (nearly all do)

·     A copy of Perl running on the Web server

·     The Perl CGI library, CGI.pm, installed on your Web server. This is probably pre-installed, but if it's not, you can install it from CPAN.

How Does It Work?

File upload works by using a special type of form field called "file", and a special type of form encoding called "multipart/form-data". The file form field displays a text box for the filename of the file to upload, and a "Browse..." button:

[Figure: The file form field displaying a text box and a Browse button]

The user clicks the "Browse..." button to bring up the file selector, and chooses the file they wish to upload. Then, when they click the "Submit" button on the form, the file's data is uploaded to the Web server, along with the rest of the form's data:

[Figure: Clicking Submit to send the data to the server]

At the Web server end, the software (in our case, a CGI script) interprets the form data that's sent from the browser, and extracts the file name and contents, along with the other form fields. Usually, the file is then saved to a directory on the server.
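To make the encoding a little less abstract, here is a rough sketch of what a multipart/form-data request body looks like for a form with a "photo" file field and an "email_address" text field. It builds the body by hand in Perl. The boundary string, the sample filename and the stand-in file contents are made up for illustration; a real browser chooses its own boundary and sends the real file bytes.

```perl
use strict;
use warnings;

# Build a multipart/form-data body by hand, roughly as a browser would.
# The helper name, boundary and sample values are illustrative assumptions.
sub build_multipart_body {
    my ( $boundary, @parts ) = @_;
    my $body = '';
    for my $part (@parts) {
        my $disposition = qq{form-data; name="$part->{name}"};
        $disposition .= qq{; filename="$part->{filename}"}
            if defined $part->{filename};
        $body .= "--$boundary\r\n";
        $body .= "Content-Disposition: $disposition\r\n";
        $body .= "Content-Type: $part->{type}\r\n" if defined $part->{type};
        $body .= "\r\n$part->{content}\r\n";
    }
    # The final boundary has two extra hyphens on the end.
    return $body . "--$boundary--\r\n";
}

my $body = build_multipart_body(
    'ExampleBoundary123',
    {
        name     => 'photo',
        filename => 'me.jpg',
        type     => 'image/jpeg',
        content  => 'FAKE-JPEG-BYTES',
    },
    { name => 'email_address', content => 'user@example.com' },
);

print "Content-Type: multipart/form-data; boundary=ExampleBoundary123\n\n";
print $body;
```

Each form field becomes one part, separated by the boundary line. The file part carries the original filename and a content type, which is the information the server-side script later extracts.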

Now, let's create a file upload form that allows your users to upload files to your Web server.

1. The "form" Element

The first part of a file upload form is the "form" element:

<form action="/cgi-bin/upload.cgi" method="post"  
enctype="multipart/form-data">

Note the special multipart/form-data encoding type, which is what we use for file upload. Note also that the form will post the data to our upload script, called upload.cgi, which we'll create in the next section.

2. The File Upload Field

The second part of the file upload form is the upload field itself. In this example, we're creating a form so that our users can upload their photos, so we need an upload field called "photo":

<p>Photo to Upload: <input type="file" name="photo" /></p>

3. Other Form Fields

You can include other, normal form fields in your form as well as the above field. Here we're going to allow users to submit their email address along with their photo:

<p>Your Email Address: <input type="text" name="email_address" /></p>

4. The Submit Button

As with a regular form, we need a submit button so that the user can send the form to the Web server:

<p><input type="submit" name="Submit" value="Submit Form" /></p>

The Finished Form

Our complete file upload form looks like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head>
   <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
   <title>File Upload</title>
 </head>
 <body>
   <form action="/cgi-bin/upload.cgi" method="post"  
enctype="multipart/form-data">
     <p>Photo to Upload: <input type="file" name="photo" /></p>
     <p>Your Email Address: <input type="text" name="email_address" /></p>
     <p><input type="submit" name="Submit" value="Submit Form" /></p>
   </form>
 </body>
</html>

Save this file to your hard drive, and call it something like "file_upload.html".

So far, so good! Now let's look at how to write the server CGI script, upload.cgi.

Creating the File Upload Script

Handling the data that the browser sends when it uploads a file is quite a complex process. Fortunately, the Perl CGI library, CGI.pm, does most of the dirty work for us!

Using two methods of the CGI query object, param and upload, we can retrieve the uploaded file's filename and file handle, respectively. Using the file handle, we can read the contents of the file, and save it to a new file in our file upload area on the server.

1. First Things First

At the top of our script, we need to create the shebang line. We then put the Perl interpreter into strict mode to make our script as safe as possible, and include the Perl CGI and File::Basename modules for use in the script. We'll also use the CGI::Carp module to display errors in the web page, rather than displaying a generic "500 Server Error" message (it's a good idea to comment out this line in a production environment):

#!/usr/bin/perl -wT

use strict;
use CGI;
use CGI::Carp qw ( fatalsToBrowser );
use File::Basename;

Note the use of the -w switch to make Perl warn us of any potential dangers in our code. It's nearly always a good idea to put the -w in! In addition, the -T switch turns on taint checking. This ensures that any untrusted input to the script, such as the uploaded file's filename, is marked as tainted; we then need to explicitly "clean" this data before using it. (If you try to use tainted data, Perl throws an error.) More on this in a moment.

2. Setting Safety Limits

In order to prevent the server being overloaded by huge file uploads, we'll limit the allowable size of an uploaded file to 5MB; this should be big enough to handle most digital photos:

$CGI::POST_MAX = 1024 * 5000;
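As a quick check on the arithmetic, 1024 * 5000 works out to 5,120,000 bytes, which is a little under 5 MiB. The sketch below only does the sum; the variable names are mine and are not part of the upload script.

```perl
use strict;
use warnings;

my $limit_bytes = 1024 * 5000;                      # same expression as above
my $limit_mib   = $limit_bytes / ( 1024 * 1024 );   # convert bytes to MiB

printf "%d bytes is about %.2f MiB\n", $limit_bytes, $limit_mib;
# prints "5120000 bytes is about 4.88 MiB"
```

If you'd rather have a limit of exactly 5 MiB, 1024 * 1024 * 5 (5,242,880 bytes) would do it.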

We'll also create a list of "safe" characters for filenames. Some characters, such as slashes (/), are dangerous in filenames, as they might allow attackers to upload files to any directory they wanted. Generally speaking, letters, digits, underscores, periods, and hyphens are safe bets:

my $safe_filename_characters = "a-zA-Z0-9_.-";

3. The Upload Directory

We need to create a location on our server where we can store the uploaded files. We want these files (the photos) to be visible on our web site, so we should store them in a directory under our document root, for example:

my $upload_dir = "/home/mywebsite/htdocs/upload";

You'll need to create a directory called "upload" on your web site's document root, then set $upload_dir to the absolute path to that directory, as I've done above. Make sure your directory can be read and written to by your script; on a shared UNIX server, this usually means setting the mode to 777 (for example, by issuing the chmod 777 upload command at the command line). Check with your web hosting provider if you're not sure what you need to do.
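If you're not sure whether the permissions are right, a small check like the following can help. This is only a sketch: the helper name check_upload_dir is hypothetical, and the demo runs against a temporary directory rather than your real upload directory.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# Hypothetical helper: returns an error string, or undef if the
# directory exists and the current user can write to it.
sub check_upload_dir {
    my ($dir) = @_;
    return "$dir does not exist or is not a directory" unless -d $dir;
    return "$dir is not writable by this script"       unless -w $dir;
    return;
}

# Demo against a throwaway directory (removed when the script exits).
my $dir     = tempdir( CLEANUP => 1 );
my $problem = check_upload_dir($dir);

print defined $problem
    ? "Problem: $problem\n"
    : "Upload directory looks usable\n";
```

Bear in mind that a CGI script usually runs as the Web server's user, so the result can differ from what you see when you run the check from your own shell account.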

4. Reading the Form Variables

The next step is to create a CGI object (we assign it to $query below); this allows us to access methods in the CGI.pm library. We can then read in the filename of our uploaded file, and the email address that the user entered into the form:

my $query = new CGI;
my $filename = $query->param("photo");
my $email_address = $query->param("email_address");

If there was a problem uploading the file -- for example, the file was bigger than the $CGI::POST_MAX setting -- $filename will be empty. We can test for this and report the problem to the user as follows:

if ( !$filename )
{
 print $query->header ( );
 print "There was a problem uploading your photo (try a smaller file).";
 exit;
}

5. Making the Filename Safe

We can't necessarily trust the filename that's been sent by the browser; an attacker could manipulate this filename to do nasty things such as upload the file to any directory on the Web server, or attempt to run programs on the server.

The first thing we'll do is use the fileparse routine in the File::Basename module to split the filename into its leading path (if any), the filename itself, and the file extension. We can then safely ignore the leading path. Not only does this help thwart attempts to save the file anywhere on the web server, but some browsers send the whole path to the file on the user's hard drive, which is obviously no use to us:

my ( $name, $path, $extension ) = fileparse ( $filename, '\..*' );
$filename = $name . $extension;

The above code splits the full filename, as passed by the browser, into the name portion ($name), the leading path to the file ($path), and the filename's extension ($extension). To locate the extension, we pass in the regular expression '\..*' -- in other words, a literal period (.) followed by zero or more characters. We then join the extension back onto the name to reconstruct the filename without any leading path.

The next stage in our quest to clean up the filename is to remove any characters that aren't in our safe character list ($safe_filename_characters). We'll use Perl's substitution operator (s///) to do this. While we're at it, we'll convert any spaces in the filename to underscores (using the tr/// operator), as underscores are easier to deal with in URLs:

$filename =~ tr/ /_/;
$filename =~ s/[^$safe_filename_characters]//g;
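To try the cleaning steps without a Web server, here is a minimal sketch that wraps them in a subroutine you can run from the command line. The name clean_filename and the sample filenames are my own, for illustration only. The sketch assumes a UNIX-like system, where fileparse treats the forward slash as the path separator.

```perl
use strict;
use warnings;
use File::Basename;

my $safe_filename_characters = "a-zA-Z0-9_.-";

# Hypothetical helper: strip the leading path, then drop unsafe characters.
sub clean_filename {
    my ($filename) = @_;

    # Split into name, leading path and extension; the path is discarded.
    my ( $name, $path, $extension ) = fileparse( $filename, '\..*' );
    $filename = $name . $extension;

    $filename =~ tr/ /_/;                             # spaces to underscores
    $filename =~ s/[^$safe_filename_characters]//g;   # remove everything else
    return $filename;
}

print clean_filename('/home/me/My Holiday Photo!.jpg'), "\n";   # My_Holiday_Photo.jpg
print clean_filename('../../etc/passwd'), "\n";                 # passwd
```

The second call shows why discarding the path matters: an attempt to climb out of the upload directory is reduced to a plain filename.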

Finally, to make doubly sure that our filename is now safe, we'll match it against our $safe_filename_characters regular expression, and extract the characters that match (which should be all of them). We also need to do this to untaint the $filename variable. This variable is tainted because it contains potentially unsafe data passed by the browser. The only way to untaint a tainted variable is to use regular expression matching to extract the safe characters:

if ( $filename =~ /^([$safe_filename_characters]+)$/ )
{
 $filename = $1;
}
else
{
 die "Filename contains invalid characters";
}

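(Note that the above die function should rarely be executed, because we've already removed our dodgy characters using the earlier substitution. It will only fire if nothing is left of the filename once those characters are gone, for example when the original name contained no safe characters at all. Either way, it doesn't hurt to be cautious!)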
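If you'd like to watch untainting happen, the following sketch may help. It uses the tainted function from the core Scalar::Util module to report whether a value is tainted before and after the regular expression match. The helper name untaint_filename is hypothetical, and the script takes its input from the command line rather than from a browser.

```perl
use strict;
use warnings;
use Scalar::Util qw( tainted );

# Hypothetical helper: returns the captured (and therefore untainted)
# filename, or undef if the value contains anything outside the safe list.
sub untaint_filename {
    my ($filename) = @_;
    return $1 if $filename =~ /^([a-zA-Z0-9_.-]+)$/;
    return;
}

# Command-line arguments count as untrusted input when taint mode is on.
my $input = @ARGV ? $ARGV[0] : 'my_photo.jpg';
my $clean = untaint_filename($input);

printf "taint mode: %s\n",    ${^TAINT}         ? 'on'  : 'off';
printf "input tainted: %s\n", tainted($input)   ? 'yes' : 'no';

if ( defined $clean ) {
    printf "clean value '%s' tainted: %s\n", $clean,
        tainted($clean) ? 'yes' : 'no';
}
else {
    print "input rejected\n";
}
```

Run it with taint checking switched on and a filename as the argument, for example perl -T untaint_demo.pl my_photo.jpg (the script name is just an example). The input should be reported as tainted and the captured value as clean. Without -T, nothing is ever marked as tainted.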

6. Getting the File Handle

As I mentioned above, we can use the upload method to grab the file handle of the uploaded file (which actually points to a temporary file created by CGI.pm). We do this like so:

my $upload_filehandle = $query->upload("photo");

7. Saving the File

Now that we have a handle to our uploaded file, we can read its contents and save it out to a new file in our file upload area. We'll use the uploaded file's filename -- now fully sanitised -- as the name of our new file:

open ( UPLOADFILE, ">$upload_dir/$filename" ) or die "$!";
binmode UPLOADFILE;

while ( <$upload_filehandle> )
{
 print UPLOADFILE;
}

close UPLOADFILE;

Notice the die function at the end of the first line above; if there's an error writing the file, this function stops the script running and reports the error message (stored in the special variable $!). Meanwhile, the binmode function tells Perl to write the file in binary mode, rather than in text mode. This prevents the uploaded file from being corrupted on non-UNIX servers (such as Windows machines).
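For comparison, here is a sketch of the same copy written with lexical file handles, the three-argument form of open, and fixed-size reads instead of line-by-line reads (binary files have no meaningful lines). This is one possible variation rather than the script's own code: the helper name save_upload and the 8192-byte buffer size are my assumptions, and an ordinary file stands in for the uploaded file's handle.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# Hypothetical helper: copy everything from an open input handle to
# $target_path in binary mode, returning the number of bytes saved.
sub save_upload {
    my ( $in_fh, $target_path ) = @_;
    open( my $out_fh, '>', $target_path )
        or die "Cannot write $target_path: $!";
    binmode $in_fh;
    binmode $out_fh;

    my $buffer;
    while ( read( $in_fh, $buffer, 8192 ) ) {    # 0 at end of file
        print {$out_fh} $buffer;
    }
    close $out_fh or die "Cannot close $target_path: $!";
    return -s $target_path;
}

# Demo: create a stand-in "uploaded" file containing every byte value.
my $dir    = tempdir( CLEANUP => 1 );
my $source = "$dir/source.bin";
my $data   = join '', map { chr( $_ % 256 ) } 0 .. 19_999;

open( my $src_fh, '>', $source ) or die "Cannot write $source: $!";
binmode $src_fh;
print {$src_fh} $data;
close $src_fh or die "Cannot close $source: $!";

open( my $upload_fh, '<', $source ) or die "Cannot read $source: $!";
my $saved_bytes = save_upload( $upload_fh, "$dir/copy.bin" );
close $upload_fh;

print "Saved $saved_bytes bytes\n";
```

In the real script you would pass the handle returned by the upload method and the path built from $upload_dir and the sanitised $filename.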

8. Thanking the User

We've now uploaded our file! The last step is to display a quick thank-you note to the users, and to show them their uploaded photo and email address:

print $query->header ( );
print <<END_HTML;
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head>
   <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
   <title>Thanks!</title>
   <style type="text/css">
     img {border: none;}
   </style>
 </head>
 <body>
   <p>Thanks for uploading your photo!</p>
   <p>Your email address: $email_address</p>
   <p>Your photo:</p>
   <p><img src="/upload/$filename" alt="Photo" /></p>
 </body>
</html>
END_HTML

The Finished Script

Your finished CGI script should look something like this:

#!/usr/bin/perl -wT

use strict;
use CGI;
use CGI::Carp qw ( fatalsToBrowser );
use File::Basename;

$CGI::POST_MAX = 1024 * 5000;
my $safe_filename_characters = "a-zA-Z0-9_.-";
my $upload_dir = "/home/mywebsite/htdocs/upload";

my $query = new CGI;
my $filename = $query->param("photo");
my $email_address = $query->param("email_address");

if ( !$filename )
{
 print $query->header ( );
 print "There was a problem uploading your photo (try a smaller file).";
 exit;
}

my ( $name, $path, $extension ) = fileparse ( $filename, '\..*' );
$filename = $name . $extension;
$filename =~ tr/ /_/;
$filename =~ s/[^$safe_filename_characters]//g;

if ( $filename =~ /^([$safe_filename_characters]+)$/ )
{
 $filename = $1;
}
else
{
 die "Filename contains invalid characters";
}

my $upload_filehandle = $query->upload("photo");

open ( UPLOADFILE, ">$upload_dir/$filename" ) or die "$!";
binmode UPLOADFILE;

while ( <$upload_filehandle> )
{
 print UPLOADFILE;
}

close UPLOADFILE;

print $query->header ( );
print <<END_HTML;
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head>
   <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
   <title>Thanks!</title>
   <style type="text/css">
     img {border: none;}
   </style>
 </head>
 <body>
   <p>Thanks for uploading your photo!</p>
   <p>Your email address: $email_address</p>
   <p>Your photo:</p>
   <p><img src="/upload/$filename" alt="Photo" /></p>
 </body>
</html>
END_HTML

Save this file on your hard drive, and call it upload.cgi.

Now we've created our server-side script, we can place both the script and the form on our server and test the file upload.

Putting It All Together

1. Place the Files on your Server

Place the HTML form somewhere under your Website's document root, and your CGI script in your Website's cgi-bin directory.

Note: don't forget to make the CGI script executable if you're on a UNIX server, using either chmod a+rx upload.cgi or chmod 755 upload.cgi.

2. Set the Correct Paths and URLs

If necessary, change the upload.cgi URL in the <form> tag to point to the correct URL for the CGI script:

<form action="/cgi-bin/upload.cgi" method="post"  
enctype="multipart/form-data">

Also, don't forget to set the correct path to Perl in your CGI script, and the correct absolute path to the 'upload' directory that you created on your server:

my $upload_dir = "/home/mywebsite/htdocs/upload";

3. Test the Script

Let's try it out! Go to the URL of your file upload form on your server, select a photo to upload, and enter your email address:

[Figure: Testing the script]



Press the "Submit Form" button. If all goes well, the photo will be uploaded to the server, and you should see the "Thanks!" page, which also displays your photo and email address:

[Figure: Viewing the thank you screen]



Congratulations - you've written a file upload handler script!

If you get Internal Server Errors, double-check the permissions, paths and URLs described above, and look for other common CGI script pitfalls. For instance, editing a file on Windows and then uploading it to your Web server in Binary format will cause the script to crash on Unix servers, because the Windows line endings are left in place. Upload scripts in ASCII (text) mode instead.
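The Windows editing problem comes down to line endings: a script saved on Windows ends each line with a carriage return plus a line feed, and the stray carriage return on the shebang line typically stops the script from starting on a Unix server. If you suspect this, a conversion along these lines can confirm it. It's only a sketch; the helper name to_unix_line_endings and the sample text are made up for the demo.

```perl
use strict;
use warnings;

# Hypothetical helper: convert Windows (CRLF) line endings to Unix (LF).
sub to_unix_line_endings {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;
    return $text;
}

my $windows_script = "#!/usr/bin/perl -wT\r\nuse strict;\r\n";
my $unix_script    = to_unix_line_endings($windows_script);

# tr/\r// with an empty replacement just counts the matching characters.
my $remaining = ( $unix_script =~ tr/\r// );
print "Carriage returns left: $remaining\n";    # Carriage returns left: 0
```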

Final Thoughts

A couple of points about this script are worth a mention:

If you were doing this on a real Website with lots of users, it would be a good idea to create a separate upload directory for each user, so that one user's photo won't be overwritten with another user's photo of the same name!

File upload isn't perfect. All browsers handle file uploads slightly differently, and some browsers can have trouble uploading files to certain types of servers and scripts. On the whole, though, most users won't have any problem with the most popular browsers.
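On the first point, separate directories are one answer. Another, sketched below, is to refuse to overwrite an existing file and pick a numbered name instead. Note that this is a different technique from the per-user directories suggested above. The helper name open_unique, the "_1", "_2" naming scheme and the limit of 999 attempts are all my own assumptions.

```perl
use strict;
use warnings;
use Fcntl qw( O_WRONLY O_CREAT O_EXCL );
use File::Temp qw( tempdir );

# Hypothetical helper: open a new file for writing without clobbering an
# existing one. If "photo.jpg" is taken, it tries "photo_1.jpg", and so on.
sub open_unique {
    my ( $dir, $filename ) = @_;

    # Split off the last extension, if there is one.
    my ( $base, $ext ) = $filename =~ /^(.*?)(\.[^.]*)?$/;
    $ext = '' unless defined $ext;

    for my $n ( 0 .. 999 ) {
        my $candidate = $n ? "${base}_$n$ext" : $filename;

        # O_EXCL makes the open fail if the file already exists.
        if ( sysopen( my $fh, "$dir/$candidate", O_WRONLY | O_CREAT | O_EXCL ) ) {
            return ( $fh, $candidate );
        }
        die "Cannot create $dir/$candidate: $!" unless $!{EEXIST};
    }
    die "Too many files named $filename";
}

# Demo: "upload" the same filename three times into a temporary directory.
my $dir = tempdir( CLEANUP => 1 );
my @names;
for ( 1 .. 3 ) {
    my ( $fh, $name ) = open_unique( $dir, 'photo.jpg' );
    close $fh;
    push @names, $name;
}
print join( ', ', @names ), "\n";    # photo.jpg, photo_1.jpg, photo_2.jpg
```

Because the existence check and the file creation happen in a single sysopen call, two uploads arriving at the same moment can't both claim the same name.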

That's it. Have fun with your file uploads!