Monday, October 26, 2009

10.44 Epilog







Recall the scenario with which this chapter began:



Suppose you have a file named
somedata.csv that
contains 12 columns of data in comma-separated values (CSV) format.
From this file you want to extract only columns 2, 11, 5, and 9, and
use them to create database records in a MySQL table that contains
name, birth,
height, and weight columns. You
need to make sure that the height and weight are positive integers,
and convert the birth dates from MM/DD/YY
format to CCYY-MM-DD format. How can you
do this?



So ... how would you do that, based on the
techniques discussed in this chapter?



Much of the work can be done using the utility programs developed
here. You can convert the file to tab-delimited format with
cvt_file.pl, extract the columns in the desired
order with yank_col.pl, and rewrite the date
column to ISO format with cvt_date.pl:



% cvt_file.pl --iformat=csv somedata.csv \
| yank_col.pl --columns=2,11,5,9 \
| cvt_date.pl --columns=2 --iformat=us --add-century > tmp
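
The date-rewriting step is the least obvious part of this pipeline, so here is a minimal Perl sketch of an MM/DD/YY to CCYY-MM-DD conversion. It is not the cvt_date.pl implementation. The function name us_date_to_iso is hypothetical, and the century rule (years 70-99 become 19YY, years 00-69 become 20YY) is an assumption that may differ from what --add-century actually does.

```perl
#!/usr/bin/perl
# us_date_to_iso.pl - sketch of MM/DD/YY to CCYY-MM-DD conversion
# (hypothetical helper; not taken from cvt_date.pl)

use strict;
use warnings;

# Convert a US-format date string to ISO format.
# Returns undef if the string is not in MM/DD/YY form or if the
# month or day is out of range.
# ASSUMPTION: two-digit years 70-99 map to 19YY, 00-69 map to 20YY.
sub us_date_to_iso
{
  my $date = shift;
  return undef unless defined ($date)
    && $date =~ /^(\d{1,2})\/(\d{1,2})\/(\d{2})$/;
  my ($month, $day, $year) = ($1, $2, $3);
  # Coarse range check only; does not verify days per month
  return undef if $month < 1 || $month > 12 || $day < 1 || $day > 31;
  $year += ($year >= 70 ? 1900 : 2000);
  return sprintf ("%04d-%02d-%02d", $year, $month, $day);
}

print us_date_to_iso ("07/04/76"), "\n";
print us_date_to_iso ("2/9/05"), "\n";
```

The range check is deliberately coarse, so a value such as 02/31/99 would still be converted. Stricter validation would need a days-per-month test.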


The resulting file, tmp, will have four columns
representing the name, birth,
height, and weight values, in
that order. It needs only to have its height and weight columns
checked to make sure they contain positive integers. Using the
is_positive_integer( ) library function from the
Cookbook_Utils.pm module file, that task can be
achieved using a short special-purpose script that
isn't much more than an input loop:



#! /usr/bin/perl -w
# validate_htwt.pl - height/weight validation example

# Assumes tab-delimited, linefeed-terminated input lines.

# Input columns and the actions to perform on them are as follows:
# 1: name; echo as given
# 2: birth; echo as given
# 3: height; validate as positive integer
# 4: weight; validate as positive integer

use strict;
use lib qw(/usr/local/apache/lib/perl);
use Cookbook_Utils;

while (<>)
{
  chomp;
  my ($name, $birth, $height, $weight) = split (/\t/, $_, 4);
  warn "line $.:height $height is not a positive integer\n"
    if !is_positive_integer ($height);
  warn "line $.:weight $weight is not a positive integer\n"
    if !is_positive_integer ($weight);
}

exit (0);
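
The script relies on is_positive_integer( ) without showing its body. As a rough illustration of the kind of test involved, here is a stand-in under the hypothetical name looks_like_positive_integer. It is an assumption about how such a check could be written, and the real function in Cookbook_Utils.pm may behave differently (for example, in whether it accepts a leading plus sign).

```perl
#!/usr/bin/perl
# posint_sketch.pl - sketch of a positive-integer test
# (hypothetical stand-in; not the Cookbook_Utils.pm code)

use strict;
use warnings;

# Return 1 if the argument is a string of digits, optionally preceded
# by a plus sign, whose numeric value is greater than zero; else 0.
# ASSUMPTION: a leading "+" is acceptable and "0" is rejected.
sub looks_like_positive_integer
{
  my $str = shift;
  return 0 unless defined ($str);
  return ($str =~ /^\+?\d+\z/ && $str > 0) ? 1 : 0;
}

foreach my $val ("68", "+190", "0", "-5", "5.5", "", "abc")
{
  printf "%-6s %s\n", "'$val'",
         looks_like_positive_integer ($val) ? "ok" : "rejected";
}
```

The pattern match runs before the numeric comparison, so non-numeric strings never reach the `> 0` test and cannot trigger "isn't numeric" warnings.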


The validate_htwt.pl script
doesn't produce any output (except for warning
messages), because it doesn't need to reformat any
of the input values. Assuming that tmp passes
validation with no errors, it can be loaded into MySQL with a simple
LOAD DATA statement:



mysql> LOAD DATA LOCAL INFILE 'tmp' INTO TABLE tbl_name;







