Friday, November 6, 2009

Recipe 14.16. Writing a CGI Script














Credit: Chetan Patil



Problem


You want to expose Ruby code through an existing web server, without having to do any special configuration.




Solution


Most web servers are set up to run CGI scripts, and it's easy to write CGI scripts in Ruby. Here's a simple CGI script that calls the Unix command ps, parses its results, and outputs the list of running processes as an HTML document.[6] Anyone with access to the web server can then look at the processes running on the system.

[6] On Windows, you could do this example by running some other command such as dir, listing the running Windows services as seen in Recipe 23.2, or just printing a static message.



#!/usr/bin/ruby
# ps.cgi

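# String#lines yields one string per line of ps output; calling collect
# directly on a String only works on Ruby 1.8.
processes = %x{ps aux}.lines.collect do |proc|
  '<tr><td>' + proc.chomp.split(/\s+/, 11).join('</td><td>') + '</td></tr>'
end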

puts 'Content-Type: text/html'
# Output other HTTP headers here…
puts "\n"

title = %{Processes running on #{ENV['SERVER_NAME'] || `hostname`.strip}}
puts <<-end
<HTML>
<HEAD><TITLE>#{title}</TITLE></HEAD>
<BODY>
<H1>#{title}</H1>
<TABLE>
#{processes.join("\n")}
</TABLE>
</BODY>
</HTML>
end

exit 0





Discussion


CGI was the first major technology to add dynamic elements to the previously static Web. A CGI resource is requested like any static HTML document, but behind the scenes the web server executes an external program (in this case, a Ruby script) instead of serving a file. The output of the program (text, HTML, or binary data) is sent as part of the HTTP response to the browser.


CGI has a very simple interface, based on environment variables and standard input and output, which should be very familiar to writers of command-line programs. This simplicity is also CGI's weakness: it leaves too many things undefined. But when a Rails application would be overkill, a CGI script might be the right size.


CGI programs typically reside in a special directory of the web server's web space (often the /cgi-bin directory). On Unix systems, CGI files must be made executable by the web server, and the first line of the script must point to the system's Ruby interpreter (usually /usr/bin/ruby or /usr/local/bin/ruby).


A CGI script gets most of its input from environment variables like QUERY_STRING and PATH_INFO, which are set by the web server. The web server also uses environment variables to tell the script where and how it's being run: note how the sample script uses ENV['SERVER_NAME'] to find the machine's hostname for display.
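
As a minimal sketch of reading that input by hand, the following parses QUERY_STRING with Ruby's standard uri library. The helper name query_params and the fallback query string are assumptions made for illustration; they are not part of the recipe.

```ruby
require 'uri'

# Hypothetical helper: turn a raw query string into a Hash that maps
# each parameter name to an Array of its values. URI.decode_www_form
# decodes %XX escapes and turns '+' into a space.
def query_params(query_string)
  params = {}
  URI.decode_www_form(query_string.to_s).each do |name, value|
    (params[name] ||= []) << value
  end
  params
end

# Under a web server these variables are set for each request. The
# fallback values only make the sketch runnable from a shell.
query = ENV['QUERY_STRING'] || 'user=root&sort=cpu&sort=mem'
path  = ENV['PATH_INFO'] || ''

puts "PATH_INFO: #{path.inspect}"
query_params(query).each do |name, values|
  puts "#{name}: #{values.join(', ')}"
end
```

Keeping every value in an Array means a parameter that appears more than once (sort in the sample string) is not silently overwritten.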


There are only a few restrictions on the output of a CGI script. Before the "real" output, you need to send some HTTP headers. The only required header is Content-Type, which tells the browser what MIME type to expect from the document the CGI is going to output. This is also your chance to set other HTTP headers, such as Content-Length, Expires, Location, Pragma, and Status.


The headers are separated from the content by a blank line. If the blank line is missing, the server may incorrectly interpret the entire data stream as an HTTP header, which is a leading cause of errors. Other possible problems include:


  • The first line of the file contains the wrong path to the Ruby executable.

  • The permissions on the CGI script don't allow the web server to access or execute it.

  • You used binary mode FTP to upload the script to your server from another platform, and the server doesn't understand that platform's line endings: use text mode FTP instead.

  • The web server is not configured to run Ruby scripts as CGI, or to run CGI scripts at all.

  • The script contains a syntax error. Try running it manually from the command line, or check it with ruby -c.


If you get the dreaded error "premature end of script headers" from your web server, these issues are the first things to check.
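
One way to make the missing-blank-line mistake harder to commit is to build the whole response in one place. The sketch below assumes a hypothetical helper, cgi_response, that is not part of the recipe: it emits the header lines, then exactly one blank line, then the body.

```ruby
# Hypothetical helper: build a complete CGI response as a String.
# Content-Type defaults to text/html; extra headers are merged in.
def cgi_response(body, headers = {})
  headers = { 'Content-Type' => 'text/html' }.merge(headers)
  header_lines = headers.map { |name, value| "#{name}: #{value}" }
  # The empty line between the last header and the body is what
  # tells the web server that the headers are finished.
  header_lines.join("\n") + "\n\n" + body
end

print cgi_response("<p>No such process list.</p>\n",
                   'Status' => '404 Not Found')
```

Because the separator is added inside the helper, a script that sends all of its output through it cannot forget the blank line.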


Newer versions of Ruby include the CGI support library cgi. Except for extremely simple CGIs, it's better to use this library than to simply write HTML to standard output. The CGI class makes it easy to retrieve HTTP request parameters and to manage cookies. It also provides custom methods for generating HTML, using Ruby code that has the same structure as the eventual output.


Here's the code from ps.cgi, rewritten to use the CGI class. Instead of writing HTML, we make the CGI class do it. CGI also takes care of the content type, since we're using the default (text/html).



#!/usr/bin/ruby
# ps2.cgi

require 'cgi'

# New CGI object
cgi = CGI.new('html3')
# String#lines is needed on Ruby 1.9 and later, where String is no longer Enumerable
processes = `ps aux`.lines.collect { |proc| proc.chomp.split(/\s+/, 11) }

title = %{Processes running on #{ENV['SERVER_NAME'] || %x{hostname}.strip}}

cgi.out do
  cgi.html do
    cgi.head { cgi.title { title } } + cgi.body do
      cgi.table do
        (processes.collect do |fields|
          cgi.tr { fields.collect { |field| cgi.td { field } }.join " " }
        end).join "\n"
      end
    end
  end
end

exit 0



Since CGI allows any user to execute an external CGI program on your web server, security is of paramount importance. Popular CGI attacks include corrupting the program's input by inserting special characters in the QUERY_STRING, stealing confidential user data by modifying the parameters posted to the CGI program, and launching denial-of-service attacks to render the web server inoperable. CGI programs need to be carefully inspected for possible bugs and exploits. A few simple techniques will improve your security. On older versions of Ruby, call taint on external data and set your $SAFE variable to 1 or higher (both mechanisms were deprecated in Ruby 2.7 and removed in later releases). On any version, don't use methods like eval, system, or popen unless you have to.
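
Two habits that help regardless of Ruby version are checking input against a fixed list of acceptable values and escaping anything untrusted before it reaches the HTML. The following is only a sketch: the constant ALLOWED_SORT_KEYS and the helpers safe_sort_key and html_cell are hypothetical names, not part of the recipe.

```ruby
require 'cgi'

# Hypothetical allowlist of values this script is willing to accept.
ALLOWED_SORT_KEYS = %w[pid cpu mem].freeze

# Return the requested key only if it is on the allowlist; otherwise
# fall back to a safe default instead of trusting the raw input.
def safe_sort_key(raw, default = 'pid')
  ALLOWED_SORT_KEYS.include?(raw) ? raw : default
end

# Escape untrusted text before embedding it in the HTML output.
def html_cell(text)
  "<td>#{CGI.escapeHTML(text.to_s)}</td>"
end

puts safe_sort_key('cpu')
puts safe_sort_key('cpu; rm -rf /')
puts html_cell('<script>alert(1)</script>')
```

An allowlist is used here rather than trying to strip dangerous characters, because it fails closed: anything unexpected falls back to the default.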




See Also


  • The CGI documentation (http://hoohoo.ncsa.uiuc.edu/cgi/), especially the list of environment variables (http://hoohoo.ncsa.uiuc.edu/cgi/env.html)

  • Recipe 14.17, "Setting Cookies and Other HTTP Response Headers"

  • Recipe 14.18, "Handling File Uploads via CGI"

  • Chapter 15












