Wednesday, January 6, 2010

Section 10.4. Calling the OS










Ruby supports a number of global functions for interacting with the operating system
to execute programs, fork new processes, handle signals, and so on. Ruby
was initially developed for Unix-like operating systems, and many of these
OS-related functions reflect that heritage. By their very nature, these
functions are less portable than most others, and some may not be
implemented at all on Windows and other non-Unix platforms. The
subsections that follow describe some of the most commonly used of the
OS-dependent functions. Functions, such as syscall, that
are particularly low-level or platform-dependent are not covered
here.



10.4.1. Invoking OS Commands


The Kernel.` method expects a single string
argument representing an OS shell command. It starts a subshell and
passes the specified text to it. The return value is the text printed to
standard output. This method is typically invoked using special syntax;
it is invoked on string literals surrounded by backquotes or on string
literals delimited with %x (see
Section 3.2.1.6). For example:


os = `uname`             # String literal and method invocation in one
os = %x{uname}           # Another quoting syntax
os = Kernel.`("uname")   # Invoke the method explicitly



This method does not simply invoke the specified executable; it
invokes a shell, which means that shell features such as filename
wildcard expansion are available:


files = `echo *.xml`
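
As a small illustration of the backquote method, the sketch below captures a command's output and then checks how the command exited. It assumes a Unix-like system where echo is available; the $? global, which holds a Process::Status for the most recently finished child process, is not discussed in this section and is used here only as a convenience.

```ruby
# Capture a command's standard output and inspect its exit status.
# Assumption: a Unix-like system with an `echo` command.
output = `echo hello`   # Runs in a subshell; returns what the command printed
status = $?             # Process::Status of the most recent child process

puts "output: #{output.chomp}"
puts "command exited successfully" if status.success?
```

Note that the returned string keeps the trailing newline printed by the command, so chomp is usually wanted.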



Another way to start a process and read its output is with the
Kernel.open function. This method is a variant on
File.open and is most often used to open files. (And
if you require 'open-uri' from the standard library,
it can also be used to open HTTP and FTP URLs.) But if the first
character of the specified "filename" is the pipe character
|, then it instead opens a pipe to read from and/or
write to the specified shell command:


pipe = open("|echo *.xml")
files = pipe.readline
pipe.close
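
Newer Ruby versions deprecate the pipe form of open, so it may be worth knowing a substitute. The sketch below uses IO.popen, a core method that is not covered in this section, to do the same job; the commands (echo, tr) are only illustrative and assume a Unix-like system.

```ruby
# IO.popen with a block: the pipe is closed automatically when the block ends.
# Assumption: a Unix-like shell with `echo` and `tr` available.
shouted = IO.popen("echo hello | tr a-z A-Z") { |pipe| pipe.read }
puts shouted

# Passing an array runs the program directly, without a shell,
# so the "*" is handed to echo literally instead of being expanded.
literal = IO.popen(["echo", "*"]) { |pipe| pipe.read }
puts literal
```

The string form goes through the shell, like the backquote method; the array form skips the shell and is the safer choice when arguments come from user input.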



If you want to invoke a command in a shell, but are not interested
in its output, use the Kernel.system method instead.
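When passed a single string, it executes that string in a shell, waits
for the command to complete, and returns true if the command exits with
a zero status or false if it exits with a nonzero status. (In Ruby 1.9
and later, it returns nil if the command could not be run at all.) If you pass multiple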
arguments to system, the first argument is the name
of the program to invoke, and remaining arguments are its command-line
arguments. In this case no shell expansion is performed on those
arguments.
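
The difference between the one-argument and multiple-argument forms of system can be seen in a short sketch. It assumes a Unix-like system that provides the true, false, and echo commands; the variable names are only illustrative.

```ruby
# Assumption: a Unix-like system with `true`, `false` and `echo` commands.
ok     = system("true")     # Command exits with status 0
failed = system("false")    # Command exits with a nonzero status

puts "true  -> #{ok.inspect}"
puts "false -> #{failed.inspect}"

# Single string: passed to a shell, which expands "*" to matching filenames.
system("echo *")

# Multiple arguments: no shell is involved, so echo receives a literal "*".
system("echo", "*")
```

The multiple-argument form is generally the safer choice when any argument is built from untrusted input, because nothing is reinterpreted by a shell.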


A lower-level way to invoke an arbitrary executable is with the
exec function. This function never returns: it simply
replaces the currently running Ruby process with the specified
executable. This might be useful if you are writing a Ruby script that
is simply a wrapper to launch some other program. Usually, however, it
is used in conjunction with the fork function, which
is described in the next section.




10.4.2. Forking and Processes


Section 9.9 described Ruby's API for writing multithreaded programs. Another
approach to achieving concurrency in Ruby is to use multiple Ruby
processes. Do this with the fork function or its
Process.fork synonym. The easiest way to use this
function is with a block:


fork {
  puts "Hello from the child process: #$$"
}
puts "Hello from the parent process: #$$"



When used this way, the original Ruby process continues with the
code that appears after the block and the new Ruby process executes the
code in the block.


When invoked without a block, fork behaves
differently. In the parent process, the call to fork
returns an integer which is the process ID of the newly created child
process. In the child process, the same call to fork
returns nil. So the previous code could also be
written like this:


pid = fork
if pid
  puts "Hello from parent process: #$$"
  puts "Created child process #{pid}"
else
  puts "Hello from child process: #$$"
end
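
A parent usually wants to know when its child has finished. The following sketch, which assumes a platform where fork is implemented, uses Process.wait (a Process module method not otherwise described here) to block until the child terminates and then reads the child's exit code from $?. The status value 7 is arbitrary.

```ruby
# Assumption: a Unix-like platform where fork is available.
pid = fork do
  # Child process: do some work, then report a status code to the parent.
  # exit! is used so that handlers inherited from the parent are not run.
  exit!(7)
end

Process.wait(pid)              # Block until the child terminates
child_status = $?.exitstatus   # Exit code reported by the child
puts "Child #{pid} exited with status #{child_status}"
```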



One very important difference between processes and threads is
that processes do not share memory. When you call
fork, the new Ruby process starts off as an exact
duplicate of the parent process. But any changes it makes to the process
state (by altering or creating objects) are done in its own address
space. The child process cannot alter the data structures of the parent,
nor can the parent alter the structures seen by the child.
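
The lack of shared memory is easy to demonstrate. In this sketch (assuming a platform with fork), the child changes a variable and reports what it saw through its exit code, while the parent's copy stays untouched. The variable names are only illustrative.

```ruby
counter = 0

pid = fork do
  counter += 100                   # Changes only the child's copy
  exit!(counter == 100 ? 0 : 1)    # Tell the parent what the child saw
end

Process.wait(pid)                  # Wait for the child to finish
child_saw_change = $?.success?

puts "child saw its own change: #{child_saw_change}"
puts "parent's counter is still #{counter}"
```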


If you need your parent and child processes to be able to
communicate, use open, and pass
"|-" as the first argument. This opens a pipe to a
newly forked Ruby process. The open call yields to
the associated block in both the parent and the child. In the child, the
block receives nil. In the parent, however, an
IO object is passed to the block. Reading from this
IO object returns data written by the child. And data
written to the IO object becomes available for
reading through the child's standard input. For example:


open("|-", "r+") do |child|
  if child
    # This is the parent process
    child.puts("Hello child")                  # Send to child
    response = child.gets                      # Read from child
    puts "Child said: #{response}"
  else
    # This is the child process
    from_parent = gets                         # Read from parent
    STDERR.puts "Parent said: #{from_parent}"
    puts("Hi Mom!")                            # Send to parent
  end
end
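
An alternative way to get the same parent-child channel, shown here only as a sketch, is to create a pipe explicitly with IO.pipe before calling fork. IO.pipe is a core method that this section does not cover; the example assumes a Unix-like platform and sends data in one direction only (child to parent).

```ruby
reader, writer = IO.pipe           # Both ends are inherited by the child

pid = fork do
  reader.close                     # The child only writes
  writer.puts "Hi Mom! I am process #{Process.pid}"
  writer.close                     # Closing signals end-of-file to the reader
  exit!(0)
end

writer.close                       # The parent only reads
message = reader.read              # Returns once the child closes its end
reader.close
Process.wait(pid)

puts "Child said: #{message}"
```

Each process closes the end of the pipe it does not use; otherwise the parent's read would never see end-of-file.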



The Kernel.exec function is useful in
conjunction with the fork function or the
open method. We saw earlier that you can use the
` and system functions to send an
arbitrary command to the operating system shell. Both of those methods
are synchronous, however; they don't return until the command completes.
If you want to execute an operating system command as a separate
process, first use fork to create a child process,
and then call exec in the child to run the command. A
call to exec never returns; it replaces the current
process with a new process. The arguments to exec are
the same as those to system. If there is only one, it
is treated as a shell command. If there are multiple arguments, then the
first identifies the executable to invoke, and any remaining arguments
become the "ARGV" for that executable:


open("|-", "r") do |child|
  if child
    # This is the parent process
    files = child.readlines    # Read the output of our child
    child.close
  else
    # This is the child process
    exec("/bin/ls", "-l")      # Run another executable
  end
end
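
The plain fork-then-exec pattern described above, without a pipe, might look like the following sketch. It assumes a Unix-like system with an echo command, and it uses Process.wait (from the Process module) so the parent can collect the result; the message text is arbitrary.

```ruby
pid = fork do
  # Child process: replace this Ruby process with another program.
  # Multiple arguments mean the program is run directly, without a shell.
  exec("echo", "run by the child process")
end

puts "Parent keeps running while child #{pid} does its work"

Process.wait(pid)                  # Collect the child when it finishes
command_succeeded = $?.success?
puts "Command succeeded: #{command_succeeded}"
```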



Working with processes is a low-level programming task and the
details are beyond the scope of this book. If you want to know more,
start by using ri to read about the other methods
of the Process module.




10.4.3. Trapping Signals


Most operating systems allow asynchronous signals to be sent to a running process.
This is what happens, for example, when the user types Ctrl-C to abort a
program. Most shell programs send a signal named
"SIGINT" (for interrupt) in response to Ctrl-C. And the default response to this
signal is usually to abort the program. Ruby allows programs to "trap"
signals and define their own signal handlers. This is done with the
Kernel.trap method (or its synonym
Signal.trap). For example, if you don't want to allow
the user to use Ctrl-C to abort:


trap("SIGINT") {
  puts "Ignoring SIGINT"
}



Instead of passing a block to the trap method,
you can equivalently pass a Proc object. If you
simply want to silently ignore a signal, you can also pass the string
"IGNORE" as the second argument. Pass
"DEFAULT" as the second argument to restore the OS
default behavior for a signal.
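
The three alternatives to a block can be sketched as follows. The example assumes a Unix-like system where SIGUSR1 exists, and it uses Process.kill to send the signal to the current process purely for demonstration.

```ruby
# Assumption: a Unix-like system that defines SIGUSR1.
handler = Proc.new { puts "Got SIGUSR1" }
trap("USR1", handler)               # A Proc object instead of a block

trap("USR1", "IGNORE")              # Silently discard the signal
Process.kill("USR1", Process.pid)   # Send it to ourselves; nothing happens
survived = true
puts "Still running after an ignored SIGUSR1"

trap("USR1", "DEFAULT")             # Restore the OS default behavior
```

Signal names may be written with or without the SIG prefix. With the default behavior restored, a SIGUSR1 would terminate the process again.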


In long-running programs such as servers, it can be useful to
define signal handlers to make the server reread its configuration file,
dump its usage statistics to the log, or enter debugging mode, for
example. On Unix-like operating systems, SIGUSR1 and
SIGUSR2 are commonly used for such purposes.
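
One common way to structure such a handler, offered here as a sketch rather than a recommended design, is to keep the handler tiny: it only sets a flag, and the program's main loop does the real work when it notices the flag. The flag name and the simulated signal are assumptions made for the example.

```ruby
reload_requested = false

# The handler does as little as possible: it just records the request.
previous = trap("USR1") { reload_requested = true }

Process.kill("USR1", Process.pid)   # Simulates `kill -USR1 <pid>` from a shell

# Stand-in for a server's main loop: poll the flag for up to one second.
20.times do
  break if reload_requested
  sleep 0.05
end

puts "Rereading configuration..." if reload_requested

trap("USR1", previous || "DEFAULT") # Put the earlier handler back
```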




10.4.4. Terminating Programs


There are a number of related Kernel methods
for terminating a program or performing related actions. The
exit function is the most straightforward. It raises
a SystemExit exception, which, if uncaught, causes
the program to exit. Before the exit occurs, however,
END blocks and any shutdown handlers registered with
Kernel.at_exit are run. To exit immediately, skipping those
blocks and handlers, use exit! instead. Both methods accept an integer
argument that specifies the process exit code that is reported to the
operating system. Process.exit and
Process.exit! are synonyms for these two
Kernel functions.
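
Because exit works by raising an exception, it can be intercepted, whereas exit! cannot. The sketch below shows both behaviors; the second half runs exit! in a forked child (so it assumes a platform with fork) and uses Process.wait to read the child's exit code. The status values are arbitrary.

```ruby
begin
  exit(2)                        # Raises SystemExit rather than exiting at once
rescue SystemExit => e
  rescued_status = e.status      # The exit code that would have been reported
  puts "Intercepted exit with status #{rescued_status}"
end

# exit! bypasses exceptions and handlers, so try it in a child process.
pid = fork do
  at_exit { puts "this handler is skipped" }
  exit!(4)                       # No SystemExit, no at_exit handlers
end
Process.wait(pid)
child_code = $?.exitstatus
puts "Child used exit! and reported status #{child_code}"
```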


The abort function prints the specified error
message to the standard error stream and then calls
exit(1).
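
Since abort ends up raising SystemExit like exit does, its effect can be observed without actually terminating the program. This is only a demonstration sketch; real programs would normally let the exception propagate. The message text is arbitrary.

```ruby
begin
  abort("fatal: cannot continue")   # The message is printed before the raise
rescue SystemExit => e
  abort_status  = e.status          # Exit code that would have been reported
  abort_success = e.success?        # Whether that code counts as success
end

puts "abort would have exited with status #{abort_status}"
```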


fail is simply a synonym for
raise, and it is intended for cases in which the
exception raised is expected to terminate the program. Like
abort, fail causes a message to be
displayed when the program exits. For example:


fail "Unknown option #{switch}"



The warn function is related to
abort and fail: it prints a
warning message to standard error (unless warnings have been explicitly
disabled with -W0). Note, however, that this function
does not raise an exception or cause the program to exit.
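
To see where warn sends its output, the following sketch temporarily replaces $stderr with a StringIO object from the standard library. It assumes warnings have not been disabled (in which case nothing would be captured); the message text is arbitrary.

```ruby
require 'stringio'

original_stderr = $stderr
$stderr = StringIO.new            # Temporarily capture standard error
begin
  warn "config file missing; using defaults"
  captured = $stderr.string       # Everything written while captured
ensure
  $stderr = original_stderr       # Always restore the real stream
end

puts "warn wrote: #{captured.inspect}"
puts "and the program is still running"
```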


sleep is another related function that does not
cause the program to exit. Instead, it simply causes the program (or at
least the current thread of the program) to pause for the specified
number of seconds.
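
The argument to sleep does not have to be a whole number. This sketch pauses for a fraction of a second and measures the pause with Process.clock_gettime, a Process module method used here only for timing; the duration is arbitrary.

```ruby
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
sleep(0.25)                       # Fractional seconds are allowed
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts format("slept for about %.2f seconds", elapsed)
```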









