21.2. Reading and Writing FilesThe UTL_FILE UTL_FILE lets you read and write files accessible from the server on which your database is running. So you could theoretically use UTL_FILE to write right over your tablespace datafiles, control files, and so on. That is of course a very bad idea. Server security requires the ability to place restrictions on where you can read and write your files. UTL_FILE implements this security by limiting access to files in one of two ways:
The next two sections explain how to use these two approaches; we will then examine the specific capabilities of the UTL_FILE package 21.2.1. The UTL_FILE_DIR ParameterWhen you call FOPEN to open a file, you must specify both the location and the name of the file in separate arguments. This file location is then checked against the list of accessible directories. Here's the format of the parameter for file access in the database initialization file:
Include a parameter for UTL_FILE_DIR for each directory you want to make accessible for UTL_FILE
To bypass server security and allow read/write access to all directories, you can use this special syntax:
You should not use this option on production systems. In a development system, this entry certainly makes it easier for developers to get up and running on UTL_FILE and test their code. However, you should allow access to only a few specific directories when you move the application to production. 21.2.1.1 Setting up directoriesHere are some observations on working with and setting up accessible directories with UTL_FILE:
21.2.1.2 Specifying file locationsThe location of the file is an operating system-specific string that specifies the directory or area in which to open the file. The location you provide must have been listed as an accessible directory in the INIT.ORA file for the database instance. The INIT.ORA location is a valid directory or area specification, as shown in these examples:
Notice that in Windows, the backslash character (\) is used as a delimiter. In Unix, the forward slash (/) is the delimiter. When you pass the location in the call to UTL_FILE.FOPEN, you provide the location specification as it appears in the INIT.ORA file (unless you just provided * for all directories in the initialization file). And remember that in case-sensitive operating systems, the case of the location specification in the initialization file must match that used in the call to UTL_ FILE.FOPEN. Here are some examples:
Your location must be an explicit, complete path to the file. You cannot use operating system-specific parameters such as environment variables in Unix to specify file locations. 21.2.2. Work with Oracle DirectoriesPrior to Oracle9i Database Release 2, whenever you opened a file, you needed to specify the location of the file, as in:
Such a hardcoding of values is always to be avoided, however. What if the location of the accounts data changes? How many programs will I have to go fix to make sure everyone is looking in the right place? How many times will I have to make such changes? A much better approach is to declare a variable or constant and assign it the value of the location. If you do this in a package, the constant can be referenced by any program in a schema with the EXECUTE privilege on that package. Here is an example, followed by a recoding of the earlier FOPEN call:
That's great. But even better is to use a schema-level object that you can define in the database: a directory. This particular type of object is also used when working with BFILEs, so you can in effect "consolidate" file location references in both DBMS_LOB and UTL_FILE by shifting to directories. To create a directory, you will need the CREATE ANY DIRECTORY
Here are some things to keep in mind about directories and UTL_FILE:
One advantage of building utilities like those found in fileIO is that you can easily add sophisticated handling of the case of the directory to avoid "formatting errors," such as forgetting to specify the directory name in uppercase. 21.2.3. Open FilesBefore you can read or write a file, you must open it. The UTL_FILE.FOPEN function opens the specified file and returns a file handle you can then use to manipulate the file. Here's the header for the function:
Parameters are summarized in the following table:
You can open the file in one of three modes:
Keep the following points in mind as you attempt to open files:
The following example shows how to declare a file handle and then open a configuration file for that handle in read-only mode:
Notice that I did not provide a maximum line size when I opened this file. That parameter is, in fact, optional. If you do not provide it, the maximum length of a line you can read from or write to the file is approximately 1,024. Given this limitation, you should always include the max_linesize argument as shown below:
21.2.4. Is the File Already Open?The IS_OPEN function returns TRUE if the specified handle points to a file that is already open. Otherwise, it returns false. The header for the function is,
where file is the file to be checked. Within the context of UTL_FILE, it is important to know what this means. The IS_ OPEN function does not perform any operating system checks on the status of the file. In actuality, it merely checks to see if the id field of the file handle record is not NULL. If you don't play around with these records and their contents, this id field is set to a non-NULL value only when you call FOPEN. It is set back to NULL when you call FCLOSE. 21.2.5. Close FilesUse the UTL_FILE.FCLOSE and UTL_FILE.FCLOSE_ALL procedures to close a specific file and all open files in your session, respectively. Use FCLOSE to close an open file. The header for this procedure is:
where file is the file handle. Notice that the argument to UTL_FILE.FCLOSE is an IN OUT parameter because the procedure sets the id field of the record to NULL after the file is closed. If there is buffered data that has not yet been written to the file when you try to close it, UTL_FILE will raise the WRITE_ERROR exception. FCLOSE_ALL closes all the opened files. The header for this procedure follows:
This procedure will come in handy when you have opened a variety of files and want to make sure that none of them are left open when your program terminates. In programs in which files have been opened, you may wish to call FCLOSE_ALL in the exception handlers of those programs. If there is an abnormal termination of the program, files will then still be closed.
When you close your files with the FCLOSE_ALL procedure, none of your file handles will be marked as closed (the id field, in other words, will still be non-NULL). The result is that any calls to IS_OPEN for those file handles will still return TRUE. You will not, however, be able to perform any read or write operations on those files (unless you reopen them). 21.2.6. Read from FilesThe UTL_FILE.GET_LINE procedure
Parameters are summarized in the following table:
The variable specified for the buffer parameter must be large enough to hold all the data up to the next carriage return or end-of-file condition in the file. If not, PL/SQL will raise the VALUE_ERROR exception. The line terminator character is not included in the string passed into the buffer.
Here is an example that uses GET_LINE:
Because GET_LINE reads data only into a string variable, you will have to perform your own conversions to local variables of the appropriate datatype if your file holds numbers or dates. 21.2.6.1 GET_LINE exceptionsWhen GET_LINE attempts to read past the end of the file, the NO_DATA_FOUND exception is raised. This is the same exception that is raised when you:
If you are performing more than one of these operations in the same PL/SQL block, you may need to add extra logic to distinguish between the different sources of this error. See the who_did_that.sql file on the book's web site for a demonstration of this technique. 21.2.6.2 Handy encapsulation for GET_LINEThe GET_LINE procedure is simple and straightforward. It gets the next line from the file. If the pointer to the file is already located at the last line of the file, UTL_ FILE.GET_LINE does not return any kind of flag but instead raises the NO_DATA_FOUND exception. This design leads to poorly structured code; you might consider using an encapsulation on top of GET_LINE to improve that design. A demonstration will make the point clear. Here is a program that reads each line from a file and then processes that line:
Notice that the simple loop does not contain any explicit EXIT statement. The loop terminates implicitly and with an exception, as soon as UTL_FILE reads past the end of the file. In a small block like this one, the logic is clear. But imagine if your program is hundreds of lines long and much more complex. Suppose further that reading the contents of the file is just one step in the overall algorithm. If an exception terminates my block, I will then need to put the rest of my business logic in the exception section (bad idea) or put an anonymous BEGIN-END block wrapper around my read-file logic. I am not comfortable with this approach. I don't like to code infinite loops without an EXIT statement; the termination condition is not structured into the loop itself. Furthermore, the end-of-file condition is not really an exception; every file, after all, must end at some point. Why must we be forced into the exception section simply because we want to read a file in its entirety? I believe that a better approach to handling the end-of-file condition is to build a layer of code around GET_LINE that immediately checks for end-of-file and returns a Boolean value (TRUE or FALSE). Theget_nextline procedure shown below demonstrates this approach:
The get_nextline procedure accepts an already assigned file handle and returns two pieces of information: the line of text (if there is one) and a Boolean flag (set to TRUE if the end-of-file is reached, FALSE otherwise). Using get_nextline, I can now read through a file with a loop that has an EXIT statement:
With get_nextline, I no longer treat end-of-file as an exception. I read a line from the file until I am done, and then I close the file and exit. This is, I believe, a more straightforward and easily understood program. 21.2.7. Write to FilesIn contrast to the simplicity of reading from a file, UTL_FILE offers a number of different procedures you can use to write to a file:
You can use these procedures only if you have opened your file with modes Wor A; if you opened the file for read-only, the runtime engine raises the UTL_ FILE.INVALID_OPERATION exception.
Let's take a closer look at UTL_FILE.PUT_LINE. This procedure writes data to a file and then immediately appends a newline character after the text. Here's the header for PUT_LINE:
Parameters are summarized in the following table:
Before you can call UTL_FILE.PUT_LINE, you must have already opened the file. Here is an example that uses PUT_LINE to dump the names of all our employees to a file:
A call to PUT_LINE is equivalent to a call to PUT followed by a call to NEW_LINE. It is also equivalent to a call to PUTF with a format string of "%s\n" (see the description of PUTF in the next section). 21.2.7.1 Writing formatted text to fileLike PUT, PUTF puts data into a file, but it uses a message format (hence, the "F" in "PUTF") to interpret the different elements to be placed in the file. You can pass between one and five different items of data to PUTF. Here's the specification:
Parameters are summarized in the following table:
The format string allows you to substitute the argN values directly into the text written to the file. In addition to "boilerplate" or literal text, the format string may contain the following patterns:
The %s formatters are replaced by the argument strings in the order provided. If you do not pass in enough values to replace all of the formatters, then the %s is simply removed from the string before writing it to the file. The following example illustrates how to use the format string. Suppose you want the contents of the file to look like this:
This single call to PUTF will accomplish the task:
If you need to write out more than five items of data, you can simply call PUTF twice consecutively to finish the job. 21.2.8. Copy FilesUTL_FILE.FCOPY lets you easily copy the contents of one source file to another destination file. The following snippet, for example, uses UTL_FILE.FCOPY to perform a backup by copying a single file from the development directory to the archive directory:
You can also use FCOPY to copy just a portion of a file. The program offers two additional parameters that allow you to specify the starting and ending line numbers you want to copy from the file. Suppose that I have a text file containing the names of the winners of a monthly PL/SQL quiz that started in January 2000. I would like to transfer all the names in 2001 to another file. I can do that by taking advantage of the fifth and sixth arguments of the FCOPY procedure as shown below:
A useful encapsulation to UTL_FILE.FCOPY allows me to specify start and end strings instead of line numbers. We will leave the implementation of such a utility as an exercise for the reader (see the infile.sf file on the book's web site for an implementation of an "INSTR for files" that might give you some ideas on implementation). 21.2.9. Delete FilesBeginning with Oracle9i Database Release 2, you can remove files using UTL_FILE.FREMOVE. The header for this program is:
That's simple enough. You provide the location and name of the file, and UTL_FILE attempts to delete it. What if UTL_FILE encounters a problem? You might see one of the following exceptions then raised:
In other words, UTL_FILE will raise an exception if you try to remove a file that doesn't exist or if you do not have the privileges needed to remove the file. Many file-removal programs in other languages (for example, File.delete in Java) return a status code to inform you of the outcome of the removal attempt. If you prefer this approach, you can use (or copy) the fileIO.fremove program found in the fileio.pkg file on the book's web site. 21.2.10. Rename and Move FilesI can combine copy and remove operations into a single step by calling the UTL_FILE.RENAME procedure. This handy utility allows me to either rename a file in the same directory or to rename a file to another name and location (in effect, moving that file). Here is the header for FRENAME:
This program may raise one of the following exceptions:
You will find an interesting application of FRENAME in the fileIO packagethe chgext procedure. This program changes the extension of the specified file. 21.2.11. Retrieve File AttributesSometimes we need to get information about the file in question: How big is this file? Does a file even exist? What is the block size of my file? Such questions are not mysteries that can only be solved with the help of an operating system command (or, in the case of the file length, the DBMS_LOB package), as they were in early Oracle releases. UTL_FILE.FGETATTR provides us with that information in a single native procedure call. Here is the header for FGETATTR:
Thus, to use this program, we must declare three different variables to hold the Boolean flag (does the file exist?), the length of the file, and the block size. Here is a sample usage:
This interface is a bit awkward. Suppose we just want to find out the length of this file? We still have to declare all those variables, obtain the length, and then work with that value. Perhaps the best way to take advantage of FGETATTR is to build some of your own functions on top of this built-in that answer a single question, such as:
or:
As a result, you do not have to declare unneeded variables, and you can write simpler, cleaner code. |
Monday, October 26, 2009
Section 21.2. Reading and Writing Files
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment