CGI Basics
~~~~~~~~~~
	-by Will Stockwell (wald0@j00nix.org) 


* Introduction *
	
	Common Gateway Interface (CGI) provides a way for users running web
browsers to communicate with programs on the server through HTML documents.
Typical HTML documents don't change much on their own in the user's
browser (with the exception of Java, JavaScript, etc.). When a CGI
program is requested by a user, the corresponding program is executed on the 
server and sends dynamic output to the user's web browser. 
	The output can be anything from plain text to HTML to image data.
Any type of data that can be sent to a web browser can be generated
dynamically by a CGI program if the programmer has the know-how. 

* HTML Forms *

	Users interact with nearly all CGI programs through HTML forms.
Therefore, a little HTML knowledge is needed to make use of CGI. Here's a
basic HTML form:

<html><head><title>HTML Form</title>
</head>
<body>
<h1>HTML Form</h1>
<p>Don't try it out yet. We need to add some server side CGI...</p>
<form action="cgiprog" method=get>
<p>First Name: <input type=text name="FirstName" size=30 maxlength=40></p>
<p>Last Name: <input type=text name="LastName" size=30 maxlength=40></p>
<p>Please select one: </p>
<input type=radio name=rb value=1>US Citizen<br>
<input type=radio name=rb value=2>International Citizen<br>
<p><input type=reset value="Clear"><input type=submit value="Send">
</form>
</body>
</html>

	That is a very basic form using only a few of the elements that can
be used in forms. Information on coding HTML forms is beyond this article's
scope. It's not difficult to find HTML information, so go have a look. 

* A Basic CGI Decoder *

	Now that we have a way for the user to send information to the
server, we need to make a program to receive this information and do
something with it. There is no special programming language for CGI. You can
use any language whose programs the web server can execute (of course, each
one will be coded differently). Most of the CGI programs I write are in C and
Perl. Here is an example of a very basic Perl script decoder that just prints
out the form data (this is taken from "Beginning Linux Programming" by Neil
Matthew and Richard Stones):

#!/usr/bin/perl
print "Content-type: text/plain\n\n";
read(STDIN, $input_buffer, $ENV{'CONTENT_LENGTH'});
@name_values = split(/&/, $input_buffer);
print "Name    Value\n";
foreach $name_value_pair (@name_values)
{
    ($name, $value) = split(/=/, $name_value_pair);
    $value =~ tr/+/ /;
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    print "$name $value\n";
}

print "\n";

	With the POST method, the information from a form is sent to the
program on standard input, so the script reads it like something from the
keyboard. The CONTENT_LENGTH environment variable says how many bytes to
read. The script then splits the name=value strings up, decodes them ('+'
becomes a space and %XX becomes the character with that hex code) and prints
them out. Nothing is passed in argv or in the query string. 
	Line 2 sends the content type, which is required for every document
a CGI program returns, or an error occurs when a user requests that program.
The content type may be different depending on what is being output. It may
be 'text/plain' (like this program) if you are outputting plain text or
'text/html' for HTML documents or any number of others (graphics, sound
files, whatever). When a program returns a document, a content type followed
by a blank line MUST come before any of the document itself or an error will
occur. 
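
	For comparison, here is a rough sketch of the same decoding step in
C. It is not from the book; url_decode and hex_value are names made up for
this example. It works in place on a buffer that already holds one
URL-encoded value, turning '+' into a space and %XX into the character with
that hex code:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Return the numeric value of one hex digit (caller checks isxdigit). */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower((unsigned char)c) - 'a' + 10;
}

/* Decode a URL-encoded string in place: '+' becomes a space and
 * "%XX" becomes the byte with hex value XX. Returns s.
 * Hypothetical helper for illustration only. */
char *url_decode(char *s)
{
    char *src, *dst;

    assert(s != NULL);
    for (src = dst = s; *src != '\0'; src++) {
        if (*src == '+') {
            *dst++ = ' ';
        } else if (*src == '%' && isxdigit((unsigned char)src[1])
                               && isxdigit((unsigned char)src[2])) {
            *dst++ = (char)(hex_value(src[1]) * 16 + hex_value(src[2]));
            src += 2;   /* skip the two hex digits just consumed */
        } else {
            *dst++ = *src;   /* ordinary character, or a stray '%' */
        }
    }
    *dst = '\0';
    return s;
}
```

	The decoded text is never longer than the encoded text, which is why
it is safe to overwrite the buffer as we go. A '%' that is not followed by
two hex digits is simply copied through unchanged.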
	
	We must also change the initial declaration of our form in the HTML 
document to look like this:

<form action="/cgi-bin/cgi.sh" method=post>

	Two things are different here: the 'action' value and the 'method'
value. The 'action' is simply the path to the CGI program in the cgi-bin
directory (where CGI programs usually live) and the 'method' is changed to
POST so the form data is sent to the program on standard input, which is
where our decoder reads it. With GET, the form data is appended to the URL
as a query string instead, and nothing arrives on standard input.
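
	A program that has to cope with either method can look at the
REQUEST_METHOD environment variable and fetch the raw data from the right
place: standard input for POST, the QUERY_STRING variable otherwise. The
sketch below is one way to do that in C. The function name read_form_data
and the MAX_FORM_BYTES limit are assumptions made for this example, and the
values are passed in as parameters so the function is easy to try out:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Arbitrary cap for this sketch so a bogus CONTENT_LENGTH cannot
 * make the program allocate a huge buffer. */
#define MAX_FORM_BYTES 65536L

/* Return a malloc'd copy of the raw (still encoded) form data, or NULL
 * on failure. method, query and content_length stand in for the
 * REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH environment variables;
 * in is the stream carrying the request body (stdin in a CGI program). */
char *read_form_data(const char *method, const char *query,
                     const char *content_length, FILE *in)
{
    char *buf;

    assert(in != NULL);

    if (method != NULL && strcmp(method, "POST") == 0) {
        long len = content_length ? strtol(content_length, NULL, 10) : 0;
        size_t got;

        if (len < 0 || len > MAX_FORM_BYTES)
            return NULL;
        buf = malloc((size_t)len + 1);
        if (buf == NULL)
            return NULL;
        got = fread(buf, 1, (size_t)len, in);   /* may be short */
        buf[got] = '\0';
        return buf;
    }

    /* GET (or anything else): any data is in the query string. */
    if (query == NULL)
        query = "";
    buf = malloc(strlen(query) + 1);
    if (buf != NULL)
        strcpy(buf, query);
    return buf;
}
```

	In a real CGI program the call would look something like
read_form_data(getenv("REQUEST_METHOD"), getenv("QUERY_STRING"),
getenv("CONTENT_LENGTH"), stdin), and the caller frees the result when it
is done with it.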

* Query Strings *

	The decoder above only reads standard input, so it sees nothing when
the form uses the GET method. Data sent with GET travels in a query string
instead. A query string is an extension to the URL of a CGI program. One
might look like this:

http://www.this-or-that.com/cgi-bin/login?user=joe&passwd=secret

	This might look pointless because the same can be done as easily
with forms and the POST method. There is one difference that makes it
useful in some situations: the URL can be bookmarked, so the user does not
have to enter a username and password at every login. The part of the URL
after the '?' shows up in the QUERY_STRING environment variable, where it
can be parsed like the data read from STDIN (standard input) in the earlier
example. The bad thing about doing this is that the password is stored in
plain text and anyone with access to your bookmark file can read it.
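
	To make the parsing concrete, here is a small C sketch that pulls
one value out of a query string like the one above. The name query_get is
made up for this example. It returns the value exactly as it appears in the
URL, so any '+' or %XX sequences still have to be decoded afterwards:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the raw (still URL-encoded) value of key from a query string such
 * as "user=joe&passwd=secret" into out. Returns 1 if the key was found,
 * 0 if it was not found or its value does not fit in outlen bytes.
 * Hypothetical helper for illustration only. */
int query_get(const char *qs, const char *key, char *out, size_t outlen)
{
    size_t keylen;

    assert(qs != NULL && key != NULL && out != NULL);
    keylen = strlen(key);

    while (*qs != '\0') {
        /* Each pair runs up to the next '&' or the end of the string. */
        size_t pairlen = strcspn(qs, "&");

        if (pairlen > keylen && strncmp(qs, key, keylen) == 0
                             && qs[keylen] == '=') {
            size_t vallen = pairlen - keylen - 1;

            if (vallen >= outlen)
                return 0;
            memcpy(out, qs + keylen + 1, vallen);
            out[vallen] = '\0';
            return 1;
        }
        qs += pairlen;
        if (*qs == '&')
            qs++;   /* step over the separator */
    }
    return 0;
}
```

	In a CGI program the first argument would normally come from
getenv("QUERY_STRING"), after checking that it is not NULL. Requiring the
'=' right after the key keeps a lookup for "pass" from matching "passwd".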

* CGI Returning HTML *

	As I said before, CGI programs can return just about any kind of
dynamic information. HTML is one of these things. Here is the same Perl
decoder as before, but it returns the information in HTML: 

#!/usr/bin/perl
print "Content-type: text/html\n\n";
read(STDIN, $input_buffer, $ENV{'CONTENT_LENGTH'});
@name_values = split(/&/, $input_buffer);
print "<html><head><title>CGI Example</title></head><body>\n";
print "<h1 align=\"center\">Perl Decoder</h1>\n";
print "<h3>Name    Value</h3>\n";
foreach $name_value_pair (@name_values)
{
    ($name, $value) = split(/=/, $name_value_pair);
    $value =~ tr/+/ /;
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    print "<p>$name $value</p>\n";
}

print "</body></html>\n";

	The only things changed were the content-type and the addition of
HTML tags to format the document. Everything else is the same and it
operates the same way. If the content-type were kept at text/plain, you
would see the tags and the document wouldn't be formatted.
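
	One thing the HTML version has to watch for that the plain text
version did not: once the output is treated as HTML, a submitted value
containing '<' or '&' is read by the browser as markup. A common remedy is
to escape those characters before printing. Here is a minimal sketch in C;
print_html_escaped and print_pair_html are names made up for this example:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write s to out, replacing the characters that are special in HTML
 * with their entity names. Hypothetical helper for illustration. */
void print_html_escaped(FILE *out, const char *s)
{
    assert(out != NULL && s != NULL);
    while (*s != '\0') {
        /* Copy the run of ordinary characters up to the next special one. */
        size_t plain = strcspn(s, "&<>\"");

        fwrite(s, 1, plain, out);
        s += plain;
        switch (*s) {
        case '&': fputs("&amp;", out);  s++; break;
        case '<': fputs("&lt;", out);   s++; break;
        case '>': fputs("&gt;", out);   s++; break;
        case '"': fputs("&quot;", out); s++; break;
        default:  break;   /* reached the end of the string */
        }
    }
}

/* Emit one decoded name/value pair as an HTML paragraph. */
void print_pair_html(FILE *out, const char *name, const char *value)
{
    fputs("<p>", out);
    print_html_escaped(out, name);
    fputc(' ', out);
    print_html_escaped(out, value);
    fputs("</p>\n", out);
}
```

	A CGI program would call print_pair_html(stdout, name, value) for
each pair, after it has sent the content type and blank line.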
	Sometimes, incorporating the HTML into the CGI is cumbersome and
unproductive. In a lot of cases, it is easier to redirect the user to an
HTML document rather than generating the document within the program. Here's
an example in C:

#include <stdio.h>

int main()
{
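	printf("Status: 302 Found\n");
	/* The blank line after the last header ends the header block. */
	printf("Location: sendhere.html\n\n");
	return 0;
}

	This is self-explanatory, but to be clear, this sends the status
code 302, which tells the browser to redirect to sendhere.html. Note that
there is no content-type in this output because it returns a status code
instead of document information, but the headers must still end with a
blank line. Some servers expect the Location value to be a full URL or a
path beginning with '/'. All status codes can be used in CGI, but most
really aren't useful. 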
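
	As a sketch of one other status code that does come in handy, the
function below reports an error such as 404 along with a short plain text
message. The name send_status and the choice of a text/plain body are
assumptions made for this example:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write a CGI response with an explicit Status header, a text/plain
 * content type, the blank line that ends the headers, and a short body.
 * Hypothetical helper for illustration only. */
void send_status(FILE *out, int code, const char *reason, const char *body)
{
    assert(out != NULL && reason != NULL && body != NULL);

    fprintf(out, "Status: %d %s\n", code, reason);
    fputs("Content-type: text/plain\n\n", out);
    fputs(body, out);

    /* Make sure the body ends with a newline. */
    if (body[0] != '\0' && body[strlen(body) - 1] != '\n')
        fputc('\n', out);
}
```

	A program might call send_status(stdout, 404, "Not Found", "No such
user.") and then exit. Because a body follows the headers here, the
content type is back, unlike in the redirect.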

* Closing *

	Creating dynamic HTML is probably CGI's most useful characteristic.
CGI with HTML can be more useful than HTML by itself because it allows
programmers to use the capabilities of an outside programming language as 
well as HTML. CGI is not only for commercial websites; anyone can find it 
useful, even on a personal webpage. In addition, a CGI program can emit 
whatever HTML markup the programmer likes.  
