Wednesday, December 30, 2009

Section 10.1. Invoking the Ruby Interpreter








Chapter 10. The Ruby Environment












This chapter is a catch-all for Ruby programming topics that have not been
discussed elsewhere. Most of the features covered here have to do with the
interface between Ruby and the operating system on which it is running. As
such, some of these features are OS-dependent. Similarly, many of the
features may be implementation dependent: not every Ruby interpreter will
implement them in the same way. Topics covered include:


  • The Ruby interpreter's command-line arguments and environment
    variables.

  • The top-level execution environment: global functions, variables,
    and constants.

  • Shortcuts for text processing scripts: global functions,
    variables, and interpreter options, usually inspired by the Perl
    programming language, that make it possible to write short but powerful
    Ruby programs for processing text files.

  • OS commands: running shell commands and invoking executables in
    the underlying operating system. These are features that allow Ruby to
    be used as a scripting or "glue" language.

  • Security: how to reduce the risk of SQL injection and similar
    attacks with Ruby's tainting mechanism, and how to "sandbox"
    untrusted Ruby code with $SAFE execution levels.




10.1. Invoking the Ruby Interpreter


The standard C-based Ruby implementation is invoked from the command line like
this:


ruby [options] [--] program [arguments]



options is zero or more command-line arguments that affect the operation of
the interpreter. The legal arguments
are described shortly.


program is the name of the file that
holds the Ruby program to be run. If the name of the program begins with a
hyphen, precede it with -- to force it to be treated as
a program name rather than as an option. If you use a single hyphen as the
program name, or omit program and
arguments altogether, the interpreter will read
program text from standard input.


Finally, arguments is any number of
additional tokens on the command line. These tokens become the elements of
the ARGV array.
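
As a rough illustration of how the command line is divided up, the following sketch launches the interpreter that is running the example (located with RbConfig.ruby) three times: once with a program file, once with a single hyphen as the program name, and once with no program at all. The helper run_ruby and the file name show_args are invented for this example.

```ruby
require 'rbconfig'
require 'open3'
require 'tempfile'

# Hypothetical helper: run the current interpreter with the given tokens
# and return whatever it wrote to standard output.
def run_ruby(*args, stdin: "")
  out, _status = Open3.capture2(RbConfig.ruby, *args, stdin_data: stdin)
  out
end

# A throwaway program file that simply prints its ARGV array.
program = Tempfile.new(['show_args', '.rb'])
program.write("p ARGV\n")
program.close

from_file  = run_ruby(program.path, 'one', 'two')      # ruby program one two
from_stdin = run_ruby('-', 'three', stdin: "p ARGV\n") # ruby - three
no_program = run_ruby(stdin: "puts 6 * 7\n")           # ruby, program on stdin

puts from_file, from_stdin, no_program
```

In each case the tokens after the program name (or after the lone hyphen) are the ones that end up in ARGV.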


The subsections that follow describe the options supported by the
standard C-based Ruby implementation. Note that you may set the
RUBYOPT environment variable to include any of the
-W, -w, -v,
-d, -I, -r, and
-K options. These will automatically be applied to
every invocation of the interpreter, as if they were specified on the
command line.
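
The effect of RUBYOPT can be observed with a small experiment. This sketch assumes the standard Open3 and RbConfig libraries are available; it runs the same one-line script in a child interpreter with RUBYOPT set to -w and with RUBYOPT removed from the environment.

```ruby
require 'rbconfig'
require 'open3'

script = 'p $VERBOSE'

# A Hash as the first argument adjusts the child's environment;
# a nil value removes the variable.
with_opt, _    = Open3.capture2({ 'RUBYOPT' => '-w' }, RbConfig.ruby, '-e', script)
without_opt, _ = Open3.capture2({ 'RUBYOPT' => nil }, RbConfig.ruby, '-e', script)

puts "RUBYOPT=-w : #{with_opt}"
puts "no RUBYOPT : #{without_opt}"
```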



10.1.1. Common Options


The following options are probably the most commonly used. Most
Ruby implementations can be expected to support these options or to
provide a work-alike alternative:





-w



This option enables warnings about deprecated or problematic code and
sets $VERBOSE to
true. Many Ruby programmers use this option
routinely to ensure that their code is clean.






-e script



This option runs the Ruby code in script. If
more than one -e option is specified, their
associated scripts are treated as separate lines of code. Also, if
one or more -e options are specified, the interpreter
does not load or run any program specified on the
command line.


To enable succinct one-liner scripts, Ruby code specified
with the -e option may use the
Regexp matching shortcut explained later in
this chapter.
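
A minimal sketch of both behaviors, run through a child interpreter (the name not_a_program.rb is made up and no such file needs to exist):

```ruby
require 'rbconfig'
require 'open3'

# Two -e options behave like two lines of one program, so the local
# variable assigned by the first is visible to the second.
two_lines, _ = Open3.capture2(RbConfig.ruby, '-e', 'x = 2', '-e', 'puts x * 21')

# With -e present, the next token is not loaded as a program file;
# it is simply passed through in ARGV.
leftover, _ = Open3.capture2(RbConfig.ruby, '-e', 'p ARGV', 'not_a_program.rb')

puts two_lines, leftover
```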






-I path



This option adds the directories in path to the beginning
of the global $LOAD_PATH array. This specifies
directories to be searched by the load and
require methods (but does not affect the
loading of the program specified on the
command line).


Multiple -I options may appear in the
command line and each may list one or more directories. If
multiple directories are specified with a single
-I option, they should be separated from each
other with : on Unix and Unix-like systems and
with ; on Windows systems.






-r library



This option loads the specified library before running
the specified program. This option works as if the first line of
the program were:


require 'library'



The space between the -r and the name of
the library is optional and often omitted.
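
The -I and -r options are often used together. The sketch below is only an illustration: it writes a tiny library (the name greeting.rb and the constant GREETING are hypothetical) into a temporary directory that is not on the default load path, then asks a child interpreter to find it with -I and load it with -r.

```ruby
require 'rbconfig'
require 'open3'
require 'tmpdir'

out = Dir.mktmpdir do |dir|
  # The library lives in a directory Ruby would not normally search.
  File.write(File.join(dir, 'greeting.rb'), "GREETING = 'hello from greeting'\n")

  # Equivalent to: ruby -I dir -rgreeting -e 'puts GREETING'
  stdout, _status = Open3.capture2(RbConfig.ruby, '-I', dir, '-rgreeting',
                                   '-e', 'puts GREETING')
  stdout
end

puts out
```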






-rubygems



This frequently used command-line argument is not a true
option but simply a clever application of the
-r option. It loads the module named
ubygems (with no r) from the standard library.
Conveniently, the ubygems module simply loads
the real rubygems module. Ruby 1.9 can load
installed gems without this module, so this option is only
necessary in Ruby 1.8.






--disable-gems



This Ruby 1.9 option prevents the addition of gem
installation directories to the default load path. If you have
many gems installed, and you are running a program that does not
use those gems (or a program that explicitly manages its own
dependencies with the gem method), you may find
that your program startup time is reduced with this option.
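
One way to see that the option took effect is to check whether the Gem module was defined at startup. This is a sketch that assumes the standard C-based interpreter of version 1.9 or later; other implementations may behave differently.

```ruby
require 'rbconfig'
require 'open3'

script = 'p defined?(Gem)'

default_run, _  = Open3.capture2(RbConfig.ruby, '-e', script)
disabled_run, _ = Open3.capture2(RbConfig.ruby, '--disable-gems', '-e', script)

puts "default:        #{default_run}"
puts "--disable-gems: #{disabled_run}"
```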






-d




--debug



These options set the global variables $DEBUG and
$VERBOSE to true. Your
program, or library code used by your program, may print debugging
output or take other action when these variables are set.
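
The following sketch confirms the two variables from inside a child interpreter. It uses Open3.capture3 so that any diagnostic messages the interpreter itself prints to standard error in debug mode do not clutter the result.

```ruby
require 'rbconfig'
require 'open3'

stdout, _stderr, _status =
  Open3.capture3(RbConfig.ruby, '-d', '-e', 'p [$DEBUG, $VERBOSE]')

puts stdout
```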






-h



This option displays a list of interpreter options and
exits.






10.1.2. Warnings and Information Options


The following options control the type or the amount of
information the Ruby interpreter
displays:





-W




-W2




--verbose



These are all synonyms for -w: they
enable verbose warnings and set $VERBOSE to
true.






-W0



This option suppresses all warnings.
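
The warning options can be compared by asking a child interpreter for the value of $VERBOSE under each one. In this sketch the hash verbose_for is just a convenient place to collect the results; in the standard implementation, -W0 leaves $VERBOSE set to nil rather than false.

```ruby
require 'rbconfig'
require 'open3'

verbose_for = {}
['-W0', '-w', '-W', '-W2'].each do |option|
  stdout, _stderr, _status =
    Open3.capture3(RbConfig.ruby, option, '-e', 'p $VERBOSE')
  verbose_for[option] = stdout.chomp
end

verbose_for.each { |option, value| puts "#{option}: $VERBOSE is #{value}" }
```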






-v



This option prints the Ruby version number. If no program is specified, it
exits rather than reading a program from standard input. If a
program is specified, it runs the program as if --verbose
(or -w) had been specified.






--version




--copyright




--help



These options print the Ruby version number, copyright information, or
command-line help and exit. --help is a synonym
for -h. --version differs
from -v in that it never runs a specified
program.






10.1.3. Encoding Options


The following options are used to specify the default
external encoding of the Ruby process and the default source encoding
for files that do not specify their own encoding with a coding comment.
If none of these options is specified, then the default external
encoding is derived from the locale and the default source encoding is
ASCII (see Section 2.4 for more on source
encoding and default external encoding):





-K code



In Ruby 1.8, this option specifies the source encoding of
the script and sets the global variable $KCODE.
In Ruby 1.9, it sets the default external encoding of the Ruby
process and specifies a default source encoding.


Specify a code of
a, A, n,
or N for ASCII; u or
U for Unicode; e or
E for EUC-JP; and s or
S for SJIS. (EUC-JP and SJIS are common
Japanese encodings.)






-E encoding




--encoding=encoding



These options are like -K but allow
the encoding to be specified by name rather than by a one-letter
abbreviation.
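
As a small check of the effect on the default external encoding, this sketch (assuming Ruby 1.9 or later, where the Encoding class exists) passes an encoding name with each form of the option and prints what the child interpreter reports.

```ruby
require 'rbconfig'
require 'open3'

script = 'puts Encoding.default_external.name'

short_form, _ = Open3.capture2(RbConfig.ruby, '-E', 'ISO-8859-1', '-e', script)
long_form, _  = Open3.capture2(RbConfig.ruby, '--encoding=UTF-8', '-e', script)

puts short_form, long_form
```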






10.1.4. Text Processing Options


The following options alter Ruby's default text processing behavior, or are
helpful when writing one-line scripts with the -e
option:





-0xxx



This option is the digit 0, not the
letter O. xxx should be between zero
and three octal digits. When specified, these digits are the ASCII
code of the input record separator character and set the
$/ variable. This defines "a line" for
gets and similar methods. -0
by itself sets $/ to character code
0. -00 is special; it puts
Ruby into "paragraph mode" in which lines are separated by two
adjacent newline characters.
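
To make the record separator concrete, the sketch below feeds the same kind of input to a child interpreter twice: once with -072 (octal 072 is the colon character) and once with -00 for paragraph mode. The -n option, described later in this section, supplies the loop that reads each record into $_.

```ruby
require 'rbconfig'
require 'open3'

# Records end at each colon instead of at each newline.
colon_records, _ =
  Open3.capture2(RbConfig.ruby, '-072', '-ne', 'p $_', stdin_data: 'a:b:c')

# Paragraph mode: a blank line ends the record.
paragraphs, _ =
  Open3.capture2(RbConfig.ruby, '-00', '-ne', 'p $_', stdin_data: "a\nb\n\nc\n")

puts colon_records
puts paragraphs
```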






-a



This option automatically splits each line of input into fields and stores
the fields in $F. This option works only with the
-n or -p looping options. It
adds the code $F =
$_.split
at the start of each iteration. See also
-F.






-F fieldsep



This option sets the input field separator $; to
fieldsep. This affects the behavior of
split when called with no arguments. See
-a.


fieldsep may be a single character or an
arbitrary regular expression, without the delimiting slashes.
Depending on your shell, you may need to quote or double the
backslashes in any regular expression specified on the command
line.
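
Here is a brief sketch of -a on its own and combined with -F. The input lines are invented sample data, and standard error is captured separately because some Ruby versions may warn when the field separator is changed.

```ruby
require 'rbconfig'
require 'open3'

# Default splitting on whitespace: print the last field of each line.
last_word, _err, _status =
  Open3.capture3(RbConfig.ruby, '-ane', 'puts $F.last',
                 stdin_data: "one two three\n")

# Split on colons instead: print the second field.
second_field, _err, _status =
  Open3.capture3(RbConfig.ruby, '-F:', '-ane', 'puts $F[1]',
                 stdin_data: "root:x:0\n")

puts last_word, second_field
```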





-i [ext]



This option edits the files specified on the command line in place. Lines
are read from the files specified on the command line, and output
goes back to those same files. If ext
is specified, a backup copy of the files is made, adding
ext to the filename.
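
In-place editing is easiest to understand by trying it on a scratch file. This sketch creates a temporary file (the name notes.txt is arbitrary), upcases it in place with a backup extension of .bak, and then reads back both the edited file and the backup.

```ruby
require 'rbconfig'
require 'tmpdir'

edited, backup = Dir.mktmpdir do |dir|
  path = File.join(dir, 'notes.txt')
  File.write(path, "alpha\nbeta\n")

  # Equivalent to: ruby -i.bak -pe '$_.upcase!' notes.txt
  system(RbConfig.ruby, '-i.bak', '-pe', '$_.upcase!', path)

  [File.read(path), File.read(path + '.bak')]
end

puts edited   # the file as rewritten in place
puts backup   # the original contents, preserved in notes.txt.bak
```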






-l



This option makes the output record separator $\
the same as the input record separator $/ (see
-0), so that the line ending is automatically
added to text output with print. This option is
intended for use with -p or
-n. When used with one of those options, it
automatically calls chop to remove the input
record separator from each line of input.
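
Both halves of this behavior show up in a one-liner that prints the length of each input line. The sketch below runs it with and without -l; the sample input is made up.

```ruby
require 'rbconfig'
require 'open3'

input = "abc\nde\n"

# With -l the newline is removed from $_ and print adds a line ending.
with_l, _ =
  Open3.capture2(RbConfig.ruby, '-lne', 'print $_.length', stdin_data: input)

# Without -l the newline is counted and print adds nothing.
without_l, _ =
  Open3.capture2(RbConfig.ruby, '-ne', 'print $_.length', stdin_data: input)

p with_l
p without_l
```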






-n



This option runs the program as if it were enclosed in the following
loop:


while gets               # Read a line of input into $_
  $F = split if $-a      # Split $_ into fields if -a was specified
  chop! if $-l           # Chop line ending off $_ if -l was specified
  # Program text here
end



This option works in Ruby 1.9 even though the global
functions chop! and split
are no longer available in that version of the language.


This option is often used with -e. See
also -p.
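
A typical use of -n with -e is a grep-like filter. In this sketch the log text is invented sample data; only lines containing the word "error" are printed.

```ruby
require 'rbconfig'
require 'open3'

log = "ok: started\nerror: disk full\nok: done\n"

# Equivalent to: ruby -ne 'puts $_ if $_.include?("error")' < logfile
matches, _ = Open3.capture2(RbConfig.ruby, '-ne',
                            'puts $_ if $_.include?("error")',
                            stdin_data: log)

puts matches
```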






-p



This option runs the program as if it were written in the following loop:


while gets               # Read a line of input into $_
  $F = split if $-a      # Split $_ into fields if -a was specified
  chop! if $-l           # Chop line ending off $_ if -l was specified
  # Program text here
  print                  # Output $_ (adding $/ if -l was specified)
end



This option works in Ruby 1.9 even though the global
functions chop! and split
are no longer available in that version of the language.


This option is often used with -e. See
also -n.
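
To show that -p really does amount to a read, modify, and print loop, the sketch below performs the same edit two ways: once with -pe in a child interpreter, and once by hand with an explicit loop over a StringIO. The sample input and the tr! edit are arbitrary choices for illustration.

```ruby
require 'rbconfig'
require 'open3'
require 'stringio'

input = "a-b\nc-d\n"

# Equivalent to: ruby -pe '$_.tr!("-", "+")' < input
via_option, _ =
  Open3.capture2(RbConfig.ruby, '-pe', '$_.tr!("-", "+")', stdin_data: input)

# The same work written out by hand.
by_hand = StringIO.new
StringIO.new(input).each_line do |line|
  line.tr!('-', '+')    # the "program text"
  by_hand.print line    # the implicit print that -p adds
end

puts via_option
puts by_hand.string == via_option
```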






10.1.5. Miscellaneous Options


The following options don't fit into any of the previous
categories:





-c



This option parses the program and reports any syntax errors, but does not
run it.
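
A quick sketch of a syntax check from a script: the first snippet is valid and is never executed, while the second is deliberately broken. With the standard implementation, a successful check prints "Syntax OK" and a failed one exits with a nonzero status.

```ruby
require 'rbconfig'
require 'open3'

ok_out, _err, ok_status =
  Open3.capture3(RbConfig.ruby, '-c', '-e', 'puts "never printed"')

_out, bad_err, bad_status =
  Open3.capture3(RbConfig.ruby, '-c', '-e', 'def broken(')

puts ok_out
puts "broken snippet passed? #{bad_status.success?}"
```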






-C dir




-X dir



These options change the current directory to dir
before running the program.
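
The change of directory can be confirmed by having the child interpreter print Dir.pwd. This sketch compares real paths, since a temporary directory may be reached through a symbolic link on some systems.

```ruby
require 'rbconfig'
require 'open3'
require 'tmpdir'

changed = Dir.mktmpdir do |dir|
  # Equivalent to: ruby -C dir -e 'puts Dir.pwd'
  stdout, _status = Open3.capture2(RbConfig.ruby, '-C', dir, '-e', 'puts Dir.pwd')
  File.realpath(stdout.chomp) == File.realpath(dir)
end

puts "child ran in the requested directory: #{changed}"
```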






-s



When this option is specified, the interpreter
preprocesses any arguments that appear after the program name and
begin with a hyphen. For arguments of the form
-x=y, it sets $x to
y. For arguments of the form
-x, it sets $x to
true. The preprocessed arguments are removed
from ARGV.
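
The following sketch shows the preprocessing at work. The program file (given the made-up name switches) prints two global variables and ARGV; the switch names -verbose and -name and the argument data.txt are likewise invented for the example.

```ruby
require 'rbconfig'
require 'open3'
require 'tempfile'

program = Tempfile.new(['switches', '.rb'])
program.write("p [$verbose, $name, ARGV]\n")
program.close

# Equivalent to: ruby -s switches.rb -verbose -name=ruby data.txt
out, _status = Open3.capture2(RbConfig.ruby, '-s', program.path,
                              '-verbose', '-name=ruby', 'data.txt')

puts out
```

Only data.txt survives in ARGV; the two hyphenated arguments are consumed and turned into $verbose and $name.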






-S



This option looks for the specified program file relative to the path
specified in the RUBYPATH environment
variable. If it is not found there, it looks for it relative to
the PATH environment variable. And if it is
still not found, it looks for it normally.






-T [n]



This option sets $SAFE to
n, or to 1 if
n is omitted. See Section 10.5 for more.





-x [dir]



This option extracts Ruby source from the program file by discarding any lines
before the first line that starts with #! and contains the string ruby
(for example, #!ruby). For
compatibility with the capital -X option, this
option also allows a directory to be specified.
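
As an illustration, the sketch below writes a file whose first line is ordinary prose, followed by a #!ruby line and one line of Ruby code, and then runs it with -x. The file name and its contents are made up.

```ruby
require 'rbconfig'
require 'open3'
require 'tempfile'

message = Tempfile.new(['message', '.txt'])
message.write("Dear reader, this preamble is not Ruby.\n" \
              "#!ruby\n" \
              "puts 'extracted'\n")
message.close

# Equivalent to: ruby -x message.txt
out, _status = Open3.capture2(RbConfig.ruby, '-x', message.path)

puts out
```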











