CGI programming with OmniMark
The Common Gateway Interface (CGI) is a protocol that allows web servers to invoke and communicate with other programs. Through CGI, a web server can call a program, send input to that program, and receive that program's output, which it then relays to a web browser.
Writing OmniMark CGI programs is similar to writing other programs in OmniMark, with a few differences in how your program has to handle input and output data.
Before you can begin using OmniMark CGI programs, you may have to make some minor changes to your computer and web server setup. Information about configuring your system should be available in your operating system or web server documentation.
Running an OmniMark CGI program involves three basic steps:
The third step is, of course, the heart of your program. In OmniMark CGI programs, this third step is exactly the same as in any other type of OmniMark program. You can do anything with OmniMark in a CGI program that you can do in any other OmniMark program, including processing markup, using external function libraries, or interacting with databases.
Invoking an OmniMark CGI program
Before an OmniMark CGI program can receive input data from a web server, the web server has to be able to successfully invoke the OmniMark program. To do this, the web server has to know where it can find the OmniMark executable.
Some web servers (for example, Apache) require that all CGI programs begin with a #! (hash-bang) directive that tells the web server where to find the software with which to run your CGI program. In an OmniMark CGI program, this line must include the full path and name of your OmniMark executable, followed by a command-line option:
#!/usr/bin/omnimark/bin/omnimark -sb
The "-sb" command-line option is new to OmniMark 5, and is the functional equivalent of a "-s" combined with "-brief".
The command-line option in the #! line might look a little strange to people who are used to the OmniMark command line, since the -sb option isn't followed by a filename. This is because the #! line implicitly refers to the file in which it occurs. For example, assuming that the program above is saved as helloworld.xom, your operating system will interpret the #! directive in this program as:
#!/usr/bin/omnimark/bin/omnimark -sb helloworld.xom
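The expansion described above can be sketched in a few lines. This is an illustration in Python rather than OmniMark, and the function name expand_hash_bang is hypothetical. It assumes the common Unix behaviour of treating everything after the interpreter path as a single argument and then appending the path of the script itself.

```python
def expand_hash_bang(first_line, script_path):
    """Return the argument list the operating system would build for a script."""
    if not first_line.startswith("#!"):
        raise ValueError("not a #! line")
    # Separate the interpreter path from its (single) optional argument.
    parts = first_line[2:].strip().split(None, 1)
    if not parts:
        raise ValueError("no interpreter given")
    # The script's own path is appended as the final argument.
    return parts + [script_path]


if __name__ == "__main__":
    argv = expand_hash_bang("#!/usr/bin/omnimark/bin/omnimark -sb", "helloworld.xom")
    print(" ".join(argv))
```

The same expansion applies to an arguments file whose first line ends in -f: the file's own name becomes the value that follows -f.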
Note that you can use a #! line in an arguments file as well:
#!/usr/bin/omnimark/bin/omnimark -f
-sb helloworld.xom
-x /usr/bin/omnimark/lib/=L.so
-i /usr/bin/omnimark/xin/
The -f in the arguments file is interpreted the same way as the -sb in the program: the operating system interprets the -f as being followed by the name of the arguments file itself.
Using a #! line in OmniMark CGI programs does not reduce the portability of your code: OmniMark ignores the #! line as if it were a comment, as do web servers that don't require it. Note, however, that if you use a #! line, it must be the first line in the program. Having anything preceding the #! line will produce an error.
Web servers that don't use the #! line must be specifically configured to recognize OmniMark CGI programs and to find the OmniMark executable. This is usually done by creating file associations so that the system uses OmniMark to execute .xom and .xar files. For example:
".xom" = "d:\programs\omnimark -sb %s"
".xar" = "d:\programs\omnimark -f %s"
Receiving input data
CGI programs receive their input data from the web server in two ways: through environment variables and through standard input. The means used to retrieve the main input data depends upon the method used to send that data, which will usually be either GET or POST.
When you specify the GET method, the web server puts the main input data into an environment variable called QUERY_STRING. When you specify the POST method, the web server sends the main input data to the CGI program through standard input.
The method your OmniMark program uses when retrieving GET data will differ from the method used when retrieving POST data. Using the OmniMark CGI library, however, you can easily create an OmniMark CGI program that will successfully retrieve data sent by either of these methods.
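The difference between the two methods can be sketched as follows. This is an illustration in Python, not the OmniMark CGI library's implementation, and read_raw_query is a hypothetical name. It relies on two standard CGI environment variables that the text above does not mention: REQUEST_METHOD, which names the method, and CONTENT_LENGTH, which gives the size of a POST body.

```python
import io


def read_raw_query(environ, stdin):
    """Return the undecoded form data for a GET or POST request."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the data arrives on standard input, and CONTENT_LENGTH
        # says how many bytes to read.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        return stdin.read(length).decode("ascii")
    # GET: the data is in the QUERY_STRING environment variable.
    return environ.get("QUERY_STRING", "")


if __name__ == "__main__":
    # Simulated requests; a real CGI program would pass os.environ
    # and sys.stdin.buffer instead.
    get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Ann"}
    post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "8"}
    print(read_raw_query(get_env, io.BytesIO(b"")))
    print(read_raw_query(post_env, io.BytesIO(b"name=Bob")))
```

Reading exactly CONTENT_LENGTH bytes, rather than reading until end of file, matters because a web server is not required to close standard input after sending the POST data.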
Using the OmniMark CGI library
The OmniMark CGI library contains two functions and a macro:
cgiGetQuery function
cgiGetEnv function
crlf macro
The cgiGetQuery function retrieves the data that the web server sends to your OmniMark CGI program, parses it, and puts the data on a keyed shelf of name/value pairs. The cgiGetEnv function retrieves the values of a variety of CGI-related environment variables and puts the data on a keyed shelf of name/value pairs. The CRLF macro allows you to easily use %13#%10# instead of a %n to insert a new line in your program output. Use %13#%10# instead of %n to ensure the portability of your code.
Because the OmniMark CGI function library uses functions in the OmniMark System Utilities library ("omutil"), you must declare and include the System Utilities library before including the "omcgi.xin" file.
Here's an example of the cgiGetQuery function in action:
declare #process-input has unbuffered
declare #process-output has binary-mode
include "omutil.xin"
include "omcgi.xin"

process
   local stream input-data variable initial-size 0
   cgiGetQuery into input-data
   output "Content-type: text/plain" || crlf || crlf
   repeat over input-data
      output key of input-data || " - " || input-data || crlf
   again
When called by a web server, the above program retrieves the query string from the QUERY_STRING environment variable if the GET method was used to send the data, or from #process-input if the POST method was used. The program parses the query string and puts the name/value pairs on the input-data shelf. The program then outputs a minimal HTTP header (Content-type: text/plain), and repeats over the input-data shelf, outputting the name/value pairs.
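For readers who want to see the parse-and-echo step spelled out, the following Python sketch produces the same kind of response from a raw query string. It is not how cgiGetQuery is implemented. The name echo_pairs is hypothetical, and the decoding rules (plus signs as spaces, percent escapes) are those of Python's standard urllib.parse.parse_qsl, which may differ in detail from the OmniMark library.

```python
from urllib.parse import parse_qsl

CRLF = "\r\n"


def echo_pairs(raw_query):
    """Build a plain-text CGI response listing each name/value pair."""
    # Decode "name=value&name=value" into a list of (name, value) tuples.
    pairs = parse_qsl(raw_query, keep_blank_values=True)
    # Minimal header, followed by the blank line that ends the header.
    header = "Content-type: text/plain" + CRLF + CRLF
    body = "".join(name + " - " + value + CRLF for name, value in pairs)
    return header + body


if __name__ == "__main__":
    print(echo_pairs("name=Ann&city=New+York"), end="")
```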
Note that in the program, #process-input is declared as unbuffered. Under normal circumstances, all OmniMark streams are buffered. When doing CGI programming, however, this buffering can cause endless amounts of trouble when you're trying to get your input data. If #process-input is buffered, your OmniMark program will never be able to get all the data it's waiting for. Therefore, you always have to tell your program to unbuffer #process-input. Do this with a declaration at the beginning of the program:
declare #process-input has unbuffered
The cgiGetQuery example shown above is an extremely simple CGI program. It does, however, do all the essential things that any OmniMark CGI program must do: it retrieves and parses the input data sent by the web server, and it sends a minimal HTTP header to the web server prior to sending the main bulk of the output.
Sending output data
Data written by your OmniMark CGI program is sent to standard output (which is the default #process-output stream) and is then relayed to the web browser. For the web browser to properly format the output, however, your program must output a minimal HTTP header before outputting the data to be displayed.
Setting #process-output to binary-mode ensures that the system running your CGI program will properly interpret the %13#%10# of the CRLF macro. This ensures that your code is portable among systems.
Declare the #process-output stream as binary-mode with the following declaration:
declare #process-output has binary-mode
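One common way a text-mode stream can damage an explicit carriage return and line feed pair is sketched below in Python. The sketch models a platform whose text-mode line ending is CR LF. Whether OmniMark's default mode behaves exactly this way is an assumption, not something stated above, and both function names are hypothetical.

```python
import io


def write_text_mode(text):
    """Write through a text stream that rewrites each LF as CR LF."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
    wrapper.write(text)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # leave the underlying buffer open
    return data


def write_binary_mode(text):
    """Write the bytes exactly as given, with no newline translation."""
    raw = io.BytesIO()
    raw.write(text.encode("ascii"))
    return raw.getvalue()


if __name__ == "__main__":
    line = "Content-type: text/plain\r\n"
    print(write_text_mode(line))    # the LF is expanded a second time
    print(write_binary_mode(line))  # the CR LF pair is left untouched
```

In the text-mode case the line feed is translated even though a carriage return already precedes it, so the header line no longer ends in a clean CR LF pair.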
Formatting output data
Here's a simple OmniMark CGI program:
; declarations and inclusions
declare #process-input has unbuffered
declare #process-output has binary-mode
include "omutil.xin"
include "omcgi.xin"

process
   output "Content-type: text/plain" || crlf || crlf
       || "Hello World!" || crlf
Assuming that this program is saved as "helloworld.xom" and has an accompanying arguments file saved as "helloworld.xar", you can invoke the program by using the path and name of the arguments file in a URL. For example:
http://localhost/cgi-bin/helloworld.xar
If the web server is properly configured, it will receive the request for the helloworld.xar file, which in turn runs the helloworld.xom program. The program will execute, and the output (an HTTP header followed by "Hello World!") will be sent to the browser. The browser, in turn, will display that output as plain text.
Before you send the main content of the page to the browser, you have to send an HTTP header. In most cases, a very minimal HTTP header will suffice, so long as it contains the content-type information for the page you are sending. The two most common types of page content are plaintext and HTML, the minimal HTTP headers for which are:
"Content-type: text/plain" || CRLF || CRLF
"Content-type: text/html" || CRLF || CRLF
Notice the two new lines (|| crlf || crlf) appended to the end of each of the HTTP header lines above. These new lines are required, because the blank line following the HTTP header indicates to the web browser that the HTTP header is complete, and that everything that follows is part of the page content. If you forget these new lines when sending output to the browser, the web browser will attempt to interpret all of the output as part of the header, which will result in an error.
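The role of the blank line can be illustrated by looking at the output from the receiving side. The Python sketch below is a simplified model of what the reader of a CGI program's output has to do, not the code of any particular web server or browser, and split_cgi_output is a hypothetical name.

```python
def split_cgi_output(output):
    """Split raw CGI output (bytes) into a header dictionary and a body."""
    # The first blank line (CR LF CR LF) marks the end of the header.
    head, separator, body = output.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("no blank line: the header never ends")
    headers = {}
    for line in head.split(b"\r\n"):
        name, colon, value = line.partition(b":")
        if not colon:
            raise ValueError("malformed header line: %r" % line)
        headers[name.strip().lower().decode("ascii")] = value.strip().decode("ascii")
    return headers, body


if __name__ == "__main__":
    good = b"Content-type: text/html\r\n\r\n<p>Hello World!</p>"
    print(split_cgi_output(good))
```

Output that has only one line ending after the header, or none, gives the reader no point at which to stop parsing header lines, which is why it is rejected.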
Other than sending a minimal HTTP header followed by two new line characters, there is nothing special about the output of an OmniMark CGI program. Anything your program sends to standard output (#process-output, which is the default output destination) will be sent to the web browser.
Unbuffering #process-output
While you don't have to declare #process-output as unbuffered in your OmniMark CGI programs, it can sometimes be a good idea to do so. If #process-output is unbuffered, users can see responses from your CGI program a little more quickly than if #process-output is buffered. In most cases the change in response time is negligible. If your CGI program executes a large number of database queries or some particularly time-consuming processing, however, unbuffering #process-output can reduce the perceived "wait time" for the user, and your CGI program will seem more responsive. Again, the change in response time is often negligible, because OmniMark CGI programs tend to be extremely fast.
You can unbuffer #process-output with the following declaration:
declare #process-output has unbuffered
Error message handling
Dealing with error messages that OmniMark CGI programs generate is a bit more complicated than debugging other OmniMark programs.
When you execute a regular OmniMark program on the command line, any errors that program generates are sent to standard error and displayed in the console window. Since OmniMark CGI programs are executed by the web server rather than through the command line, the error messages are sent back to the web server and usually end up being written in the web server error log file.
If your OmniMark CGI program has errors, the HTTP header that the web browser is expecting doesn't get sent. Instead, the web server receives one or more OmniMark error messages. Since an OmniMark error message does not qualify as a valid HTTP header, the "header" the web server receives (which is actually the OmniMark error message) gets recorded in the server error log as part of a "malformed header" error. The web browser in this case will usually display an HTTP 500 error, indicating an internal server error. To see the error messages, you'll have to open and read the server error log.
If you don't want to tackle the web server error log or if the OmniMark error messages aren't being written to it, you can create an error log for your OmniMark CGI program using the -log or -alog option in the arguments file:
#!/usr/bin/omnimark/bin/omnimark -f
-sb helloworld.xom
-alog helloworld.log
All error messages that OmniMark generates will be recorded in the file specified after the -log or -alog option. If errors occur in your program, your web browser will display a CGI error message stating that your CGI program returned an incomplete set of HTTP headers. This occurs because the web browser didn't actually receive anything; the program output (the error messages in this case) was sent to the log file instead.
Note that the -log and -alog command-line options should be used to create a log file only for debugging purposes. If you use these options in your CGI program when it is running in a production environment, your program could encounter concurrency problems if two instances of the CGI program are trying to write to the same log file simultaneously. To avoid these problems, stop using the -log or -alog option after you have finished debugging your program.
CGI-related environment variables
When a web server receives a request for a CGI program, it also stores other CGI-related information in environment variables. You can access these environment variables using the UTIL_GetEnv function in the OmniMark System Utilities library ("omutil"). Not all web servers will set all environment variables. You can use the cgiGetEnv function to retrieve all of the following environment variable values into a keyed shelf of name/value pairs:
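The idea of gathering whichever variables the server has set into name/value pairs can be sketched in Python as follows. The variable names are a sample taken from the CGI specification and are not necessarily the set that cgiGetEnv returns. The function name collect_cgi_env is hypothetical.

```python
import os

# A sample of standard CGI variable names (from the CGI specification).
CGI_VARIABLES = (
    "REQUEST_METHOD",
    "QUERY_STRING",
    "CONTENT_TYPE",
    "CONTENT_LENGTH",
    "SCRIPT_NAME",
    "SERVER_NAME",
    "SERVER_PORT",
    "REMOTE_ADDR",
)


def collect_cgi_env(environ=os.environ, names=CGI_VARIABLES):
    """Return the CGI variables that are actually set, keyed by name."""
    # Servers need not set every variable, so unset ones are skipped.
    return {name: environ[name] for name in names if name in environ}


if __name__ == "__main__":
    for name, value in collect_cgi_env().items():
        print(name + " = " + value)
```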