Friday, December 4, 2009

Chapter 18. Perl API







Chapter 18. Perl API

The easiest method of connecting to MySQL with the programming
language Perl is to use the Perl DBI module, which is part of the core Perl
installation. You can download both Perl and the DBI module from CPAN
(http://www.cpan.org). I wrote this chapter with the
assumption that the reader has Perl installed along with
DBI.pm and that the reader has a basic knowledge of
Perl. Its focus, therefore, is on how to connect to MySQL, run SQL
statements, and effectively retrieve data from MySQL using Perl and DBI.
This chapter begins with a tutorial on using Perl with MySQL. That's
followed by a list of Perl DBI methods and functions used with MySQL, with
the syntax and descriptions of each and examples for most. The examples here use the scenario of a
bookstore's inventory.


18.1. Using Perl DBI with MySQL

This section presents basic tasks that you can perform with Perl
DBI. It's meant as a simple tutorial for getting started with the Perl DBI
and MySQL.

18.1.1. Connecting to MySQL

To interface with MySQL, first you must call the DBI module
and then connect to MySQL. To make a connection to the
bookstore database using the Perl DBI, only the
following lines are needed in a Perl program:

#!/usr/bin/perl -w
use strict;

use DBI;

my $dbh = DBI->connect ("DBI:mysql:bookstore:localhost","russell",
"my_pwd1234")
or die "Could not connect to database: "
. DBI->errstr;


The first two lines start Perl and set a useful condition for
reducing programming errors (use strict). The third
line calls the DBI module. The next statement (spread over more than one
line here) sets up a database handle that specifies the database engine
(mysql), the name of the database
(bookstore), the hostname
(localhost), the username, and the password.
Incidentally, the name of the database handle doesn't have to be called
$dbh—anything will do. Next, the
or operator provides alternate instructions to be
performed if the connection fails. That is, the program will terminate
(die) and then display the message in quotes along
with whatever error message is generated by the driver using the
errstr method from the DBI—the dot
(.) merges them together.

18.1.2. Executing an SQL Statement

Making a connection to MySQL does little good unless an SQL
statement is executed. Any SQL statement that can be entered from the
mysql client can be executed through the API.
Continuing the previous example and using a fictitious database of a
bookstore, let's look at how an SQL statement that retrieves a list of
books and their authors from a table containing that information might
look:

my $sql_stmnt = "SELECT title, author FROM books";

my $sth = $dbh->prepare($sql_stmnt);

$sth->execute();


The first line sets up a variable ($sql_stmnt)
to store the SQL statement. The next line puts together the database
handle created earlier and the SQL statement to form the SQL statement
handle ($sth). Finally, the third line executes the
statement handle in the notational method of the DBI module.

18.1.3. Capturing Data

Having connected to MySQL and invoked an SQL statement, what
remains is to capture the data results and to display them. MySQL
returns the requested data to Perl in columns and rows, as it would with
the mysql client, but without table formatting. In
Perl, MySQL returns rows one at a time and they are usually processed by
a loop in Perl. Each row is returned as an array, one element per column
in the row. For each array, each element can be parsed into variables
for printing and manipulation before receiving or processing the next
row. You can do this with a while statement like
so:

while (my($title, $author) = $sth->fetchrow_array()) { 
print "$title ($author) \n";
}


At the core of this piece of code is the
fetchrow_array⁠(⁠ ⁠ ⁠) method belonging to the DBI
module. As its name suggests, it fetches each row or array of columns,
one array at a time. The while statement executes its
block of code repeatedly so long as there are arrays to process. The
value of each element of each array is stored in the two variables
$title and
$author—and overwritten with each loop. Then the
variables are printed to the screen with a newline character after each
pair.

18.1.4. Disconnecting from MySQL

Once there is no longer a need to maintain a connection to the
MySQL database, it should be terminated. If the connection stays idle
for too long, MySQL will eventually break the connection on its own. To
minimize the drain on system resources, however, it's a good practice to
have programs end their sessions like so:

$sth->finish();
$dbh->disconnect();

exit();


This first line closes the SQL statement handle. As long as the
connection to MySQL is not broken, as it will be in the second line,
more SQL statement handles could be issued, prepared, and executed
without having to reconnect to MySQL. The last line of code here ends
the Perl program.

18.1.5. Temporarily Storing Results

Perhaps a method of retrieving data from MySQL that's cleaner than
the one just explained involves capturing all of the data in memory for
later use in a program, thus allowing the connection to MySQL to end
before processing and displaying the data. Putting MySQL on hold while
processing each row as shown earlier can slow down a program, especially
when dealing with large amounts of data. It's sometimes better to create
a complex data structure (an array of arrays) and then leave the data
structure in memory, just passing around a reference number to its
location in memory. To do this,
instead of using fetchrow_array⁠(⁠ ⁠ ⁠), you'd use
the fetchall_arrayref⁠(⁠ ⁠ ⁠) method. As the method's
name indicates, it fetches all of the data at once, puts it into an
array (an array of rows of data), and returns the array's starting
location in memory. Here is a Perl program that uses
fetchall_arrayref⁠(⁠ ⁠ ⁠):

#!/usr/bin/perl -w
use strict;
use DBI;

# Connect to MySQL and execute SQL statement
my $dbh = DBI->connect("DBI:mysql:bookstore:localhost",
"username","password")
|| die "Could not connect to database: "
. DBI->errstr;

my $sql_stmnt = "SELECT title, author
FROM books";
my $sth = $dbh->prepare($sql_stmnt);
$sth->execute();

# Retrieve reference number to results
my $books = $sth->fetchall_arrayref();

$sth->finish();
$dbh->disconnect();

# Loop through array containing rows (arrays)

foreach my $book (@$books){
# Parse each row and display
my ($title, $author) = @$book;
print "$title by $author\n";
}

exit();



Instead of embedding the fetch method within a flow control
statement, the results of the SQL statement using
fetchall_arrayref⁠(⁠ ⁠ ⁠) are stored in memory. A
reference number to the location of those results is stored in the
$books variable and the connection to MySQL is then
closed. A foreach statement is employed to extract
each reference to each array (i.e., each row, each
$book) of the complex array. Each record's array is
parsed into separate variables ($title and
$author). The values of the variables are displayed
using print. Incidentally, to learn more about
references, see Randal Schwartz's book, Intermediate
Perl
(O'Reilly).

This kind of batch processing of an SQL statement has the added
advantage of allowing multiple SQL statements to be performed without
them tripping over each other, while still performing complex queries.
For instance, suppose that we want to get a list of books written by
Henry James, ordered by title, then by publisher, and then by year. This
is easy enough in MySQL. Suppose that we also want the inventory count
of each title, bookstore by bookstore, with some address information to
be displayed between the listing for each store. This becomes a little
complicated. One way to do this is to use a SELECT
statement that retrieves a list of store locations and their relevant
information (i.e., their addresses and telephone numbers) and to save a
reference to the data in memory. Next, we could issue another SQL
statement to retrieve the book inventory data, and then close the MySQL
connection. With a flow control statement, we could then print a store
header followed by the store's relevant inventory information for each
book before moving on to the next store. It would basically look like
this:

...  # Start program and connect to MySQL

# Retrieve list of stores
my $sql_stmnt = "SELECT store_id, store_name,
address, city, state, telephone
FROM stores";
my $sth = $dbh->prepare($sql_stmnt);
$sth->execute();
my $stores = $sth->fetchall_arrayref();
$sth->finish();

# Retrieve list of books
my $sql_stmnt = "SELECT title, publisher,
pub_year, store_id, quantity
FROM books
JOIN inventory USING(book_id)
WHERE author = 'Henry James'
ORDER BY title, publisher, pub_year";
my $sth = $dbh->prepare($sql_stmnt);
$sth->execute();
my $books = $sth->fetchall_arrayref();
$sth->finish();
$dbh->disconnect();

foreach my $store (@$stores){

my ($store_id, $store_name, $address,
$city, $state, $telephone) = @$store;

print "$store_name\n
$address\n$city, $state\n
$telephone\n\n";

foreach my $book (@$books){

my ($title, $publisher,
$pub_year, $store, $qty) = @$book;

if($store ne $store_id) { next; }

print "$title ($publisher $pub_year) $qty\n";
}
}

exit();



To save space, I left out the opening lines for the program
because they are the same as in the previous program. In the first SQL
statement here, we're selecting the store information. With the
fetchall_arrayref⁠(⁠ ⁠ ⁠) method, we're storing the
reference for the data in $stores. If we were to
print out this variable, we would see only a long number and not the
actual data. Although an SQL statement may retrieve many rows of data,
all of the data will be stored in memory. Therefore, we can issue
finish⁠(⁠ ⁠ ⁠), and as long as
we don't disconnect from MySQL, we can issue another SQL statement. The
next SQL statement selects the book inventory information. In the
SELECT statement we're hardcoding in the author's
name. We really should replace that with a variable (e.g.,
$author) and allow the user to set the variable
earlier in the program. Once the book inventory information has been
collected, the connection to MySQL is terminated and we can begin
displaying the data with the use of flow control statements.

The first foreach statement loops through the
data of each store and prints the address information. Within each loop
is another foreach loop for processing all of the
titles for the particular store. Notice the if
statement for the book inventory loop. The first record or array for the
first store is read and the basic store information is displayed. Then
the first array for the inventory is retrieved from its complex array
and the elements parsed into variables. If store
(which is the store_id) doesn't match the one that
it's on, Perl moves on to the next record. The result is that a store
header is displayed and all of the inventory information requested is
displayed for the store before Perl goes on to the next store's
data.

You can accomplish this task in many ways—some simpler and some
tighter—but this gives you a general idea of how to perform it, without
keeping the connection to MySQL open while processing data. For more
details on using the Perl DBI with MySQL, see Alligator Descartes and
Tim Bunce's book, Programming the Perl DBI
(O'Reilly).








No comments: