Monday, January 11, 2010

Load Utility



[ Team LiB ]





Load Utility


The LOAD UTILITY is used to load data into a DB2 table. Data residing on the server may be in the form of a file, tape, or named pipe. Data residing on a remotely connected client may be in the form of a fully qualified file or named pipe. New in v8, data can also be loaded from a user-defined cursor. The load utility does not support loading data at the hierarchy level. As indicated in Figure 7.7, the Load Wizard can assist you in quickly loading data into tables.


Figure 7.7. Load Wizard example.


As with all of the DB2 wizards, the Load Wizard can help you to load data quickly and seamlessly, resulting in improved productivity.


Scope


This command may be issued against multiple database partitions in a single request.


Authorization


One of the following:


  • sysadm

  • dbadm

  • load authority on the database and

    • INSERT privilege on the table when the load utility is invoked in INSERT mode, TERMINATE mode (to terminate a previous load replace operation), or RESTART mode (to restart a previous load replace operation)

    • INSERT and DELETE privilege on the table when the load utility is invoked in REPLACE mode, TERMINATE mode (to terminate a previous load replace operation), or RESTART mode (to restart a previous load replace operation)

    • INSERT privilege on the exception table, if such a table is used as part of the load operation.


NOTE



Since all load process (and all DB2 server processes, in general) are owned by the instance owner, and all of these processes use the identification of the instance owner to access needed files, the instance owner must have read access to input data files. These input data files must be readable by the instance owner, regardless of who invokes the command.



Required connection


Database:
If implicit connect is enabled, a connection to the default database is established. Instance: An explicit attachment is not required. If a connection to the database has been established, an implicit attachment to the local instance is attempted.


NOTE



In v8, the LOAD UTILITY now operates at the table level instead of the tablespace level. This enhancement enables DB2 to deliver higher degrees of availability than ever before and is a welcomed enhancement in the field.



Command Parameters


ALLOW NO ACCESS

Load will lock the target table for exclusive access during the load. The table state will be set to LOAD IN PROGRESS during the load. ALLOW NO ACCESS is the default behavior. It is the only valid option for LOAD REPLACE.


When there are constraints on the table, the table state will be set to CHECK PENDING as well as LOAD IN PROGRESS. The SET INTEGRITY command must be used to take the table out of CHECK PENDING.


ALLOW READ ACCESS

Load will lock the target table in a share mode. The table state will be set to both LOAD IN PROGRESS and READ ACCESS. Readers may access the nondelta portion of the data while the table is being loaded. In other words, data that existed before the start of the load will be accessible by readers to the table; data that is being loaded is not available until the load is complete. LOAD TERMINATE or LOAD RESTART of an ALLOW READ ACCESS load may use this option; LOAD TERMINATE or LOAD RESTART of an ALLOW NO ACCESS load may not use this option. Furthermore, this option is not valid if the indexes on the target table are marked as requiring a rebuild.


When there are constraints on the table, the table state will be set to CHECK PENDING as well as LOAD IN PROGRESS and READ ACCESS. At the end of the load the table state LOAD IN PROGRESS will be removed but the table states CHECK PENDING and READ ACCESS will remain. The SET INTEGRITY command must be used to take the table out of CHECK PENDING. While the table is in CHECK PENDING and READ ACCESS, the nondelta portion of the data is still accessible to readers, the new (delta) portion of the data will remain inaccessible until the SET INTEGRITY command has completed. A user may perform multiple loads on the same table without issuing a SET INTEGRITY command. Only the original (checked) data will remain visible, however, until the SET INTEGRITY command is issued.


ALLOW READ ACCESS also supports the following modifiers:


USE tablespace-name
If the indexes are being rebuilt, a shadow copy of the index is built in tablespace tablespace-name and copied over to the original tablespace at the end of the load during an INDEX COPY PHASE. Only system-temporary tablespaces can be used with this option. If not specified, then the shadow index will be created in the same tablespace as the index object. If the shadow copy is created in the same tablespace as the index object, the copy of the shadow index object over the old index object is instantaneous. If the shadow copy is in a different tablespace from the index object, a physical copy is performed. This could involve considerable I/O and time. The copy happens while the table is offline at the end of a load during the INDEX COPY PHASE.

Without this option the shadow index is built in the same tablespace as the original. Since both the original index and shadow index by default reside in the same tablespace simultaneously, there may be insufficient space to hold both indexes within one tablespace. Using this option ensures that you retain enough tablespace for the indexes.

This option is ignored if the user does not specify INDEXING MODE REBUILD or INDEXING MODE AUTOSELECT. This option will also be ignored if INDEXING MODE AUTOSELECT is chosen and load chooses to incrementally update the index.


CHECK PENDING CASCADE

If LOAD puts the table into a check pending state, the CHECK PENDING CASCADE option allows the user to specify whether or not the check pending state of the loaded table is immediately cascaded to all descendents (including descendent foreign key tables, and immediate staging tables).


IMMEDIATE

Indicates that the check pending state (read or no access mode) for foreign key constraints is immediately extended to all descendent foreign key tables. If the table has descendent immediate materialized query tables or descendent immediate staging tables, the check pending state is extended immediately to the materialized query tables and the staging tables. Note that for a LOAD INSERT operation, the check pending state is not extended to descendent foreign key tables even if the IMMEDIATE option is specified.


When the loaded table is later checked for constraint violations (using the IMMEDIATE CHECKED option of the SET INTEGRITY statement), descendent foreign key tables that were placed in check, pending a read state, will be put into check pending no access state.


DEFERRED

Indicates that only the loaded table will be placed in the check pending state (read or no access mode). The states of the descendent foreign key tables, descendent immediate materialized query tables, and descendent immediate staging tables will remain unchanged.


Descendent foreign key tables may later be implicitly placed in the check pending no access state when their parent tables are checked for constraint violations (using the IMMEDIATE CHECKED option of the SET INTEGRITY statement). Descendent immediate materialized query tables and descendent immediate staging tables will be implicitly placed in the check pending no access state when one of its underlying tables is checked for integrity violations. A warning (SQLSTATE 01586) will be issued to indicate that dependent tables have been placed in the check pending state. See the Notes section of the SET INTEGRITY statement in the SQL Reference for when these descendents tables will be put into the check pending state.


If the CHECK PENDING CASCADE option is not specified:


  • Only the loaded table will be placed in the check pending state. The state of descendent foreign key tables, descendent immediate materialized query tables, and descendent immediate staging tables will remain unchanged, and may later be implicitly put into the check pending state when the loaded table is checked for constraint violations.

  • If LOAD does not put the target table into check pending state, the CHECK PENDING CASCADE option is ignored.


CLIENT

Specifies that the data to be loaded resides on a remotely connected client. This option is ignored if the load operation is not being invoked from a remote client. This option is not supported in conjunction with the CURSOR filetype.


Notes

  1. The DUMPFILE and LOBSINFILE modifier refer to files on the server even when the CLIENT keyword is specified.

  2. Code page conversion is not performed during a remote load operation. If the code page of the data is different from that of the server, the data code page should be specified using the CODEPAGE modifier.


In the following example, a data file (/u/user/data.del) residing on a remotely connected client is to be loaded into MYTABLE on the server database:



db2 load client from /u/user/data.del
modified by codepage = 850 insert into mytable

COPY NO

Specifies that the tablespace in which the table resides will be placed in backup pending state if forward recovery is enabled (i.e., logretain or userexit is on). Copy no will also put the tablespace state into the Load in Progress tablespace state. This is a transient state that will disappear when the load completes or aborts. The data in any table in the tablespace cannot be updated or deleted until a tablespace backup or a full database backup is made. However, it is possible to access the data in any table by using the SELECT statement.


COPY YES

Specifies that a copy of the loaded data will be saved. This option is invalid if forward recovery is disabled (both logretain and userexit are off). The option is not supported for tables with DATALINK columns.


USE TSM
Specifies that the copy will be stored using Tivoli Storage Manager (TSM).

OPEN num-sess SESSIONS
The number of I/O sessions to be used with TSM or the vendor product. The default value is 1.

TO device/directory
Specifies the device or directory on which the copy image will be created.

LOAD lib-name
The name of the shared library (DLL on Windows operating systems) containing the vendor backup and restore I/O functions to be used. It may contain the full path. If the full path is not given, it will default to the path where the user exit programs reside.


CPU_PARALLELISM n

Specifies the number of processes or threads that the load utility will spawn for parsing, converting, and formatting records when building table objects. This parameter is designed to exploit intrapartition parallelism. It is particularly useful when loading presorted data, because record order in the source data is preserved. If the value of this parameter is zero, or has not been specified, the load utility uses an intelligent default value (usually based on the number of CPUs available) at run time.


Notes

  1. If this parameter is used with tables containing either LOB or LONG VARCHAR fields, its value becomes one, regardless of the number of system CPUs or the value specified by the user.

  2. Specifying a small value for the SAVECOUNT parameter causes the loader to perform many more I/O operations to flush both data and table metadata. When CPU_PARALLELISM is greater than one, the flushing operations are asynchronous, permitting the loader to exploit the CPU. When CPU_PARALLELISM is set to 1, the loader waits on I/O during consistency points. A load operation with CPU_PARALLELISM set to 2, and SAVECOUNT set to 10,000, completes faster than the same operation with CPU_PARALLELISM set to 1, even though there is only one CPU.


DATA BUFFER buffer-size

Specifies the number of 4 KB pages (regardless of the degree of parallelism) to use as buffered space for transferring data within the utility. If the value specified is less than the algorithmic minimum, the minimum required resource is used, and no warning is returned.


This memory is allocated directly from the utility heap, whose size can be modified through the util_heap_sz database configuration parameter.


If a value is not specified, an intelligent default is calculated by the utility at run time. The default is based on a percentage of the free space available in the utility heap at the instantiation time of the loader, as well as some characteristics of the table.


DISK_PARALLELISM n

Specifies the number of processes or threads that the load utility will spawn for writing data to the tablespace containers. If a value is not specified, the utility selects an intelligent default based on the number of tablespace containers and the characteristics of the table.


FOR EXCEPTION table-name

Specifies the exception table into which rows in error will be copied. Any row that is in violation of a unique index or a primary key index is copied. DATALINK exceptions are also captured in the exception table. If an unqualified table name is specified, the table will be qualified with the CURRENT SCHEMA.


Information that is written to the exception table is not written to the dump file. In a partitioned database environment, an exception table is defined for those partitions on which the loading table is defined. The dump file, on the other hand, contains rows that cannot be loaded because they are invalid or have syntax errors.


FROM filename/pipename/device/cursorname

Specifies the file, pipe, device, or cursor referring to an SQL statement that contains the data being loaded. If the input source is a file, pipe, or device, it must reside on the database partition where the database resides, unless the CLIENT option is specified. If several names are specified, they will be processed in sequence. If the last item specified is a tape device, the user is prompted for another tape. Valid response options are


c

Continue.
Continue using the device that generated the warning message (e.g., when a new tape has been mounted).

d

Device terminate.
Stop using the device that generated the warning message (e.g., when there are no more tapes).

t

Terminate.
Terminate all devices.


Notes

  1. It is recommended that the fully qualified file name be used. If the server is remote, the fully qualified file name must be used. If the database resides on the same database partition as the caller, relative paths may be used.

  2. Loading data from multiple IXF files is supported if the files are physically separate, but logically one file. It is not supported if the files are both logically and physically separate. (Multiple physical files would be considered logically one if they were all created with one invocation of the EXPORT command.)

  3. If loading data that resides on a client machine, the data must be in the form of either a fully qualified file or a named pipe.


INDEXING MODE

Specifies whether the load utility is to rebuild indexes or to extend them incrementally. Valid values are:


AUTOSELECT
The load utility will automatically decide between REBUILD or INCREMENTAL mode.

REBUILD
All indexes will be rebuilt. The utility must have sufficient resources to sort all index key parts for both old and appended table data.

INCREMENTAL
Indexes will be extended with new data. This approach consumes index free space. It only requires enough sort space to append index keys for the inserted records. This method is only supported in cases where the index object is valid and accessible at the start of a load operation (it is, for example, not valid immediately following a load operation in which the DEFERRED mode was specified). If this mode is specified, but not supported due to the state of the index, a warning is returned, and the load operation continues in REBUILD mode. Similarly, if a load restart operation is begun in the load build phase, INCREMENTAL mode is not supported.

Incremental indexing is not supported when all of the following conditions are true:

  • The LOAD COPY option is specified (logretain or userexit is enabled).

  • The table resides in a DMS tablespace.

  • The index object resides in a tablespace that is shared by other table objects belonging to the table being loaded.

To bypass this restriction, it is recommended that indexes be placed in a separate tablespace.

DEFERRED
The load utility will not attempt index creation if this mode is specified. Indexes will be marked as needing a refresh. The first access to such indexes that is unrelated to a load operation may force a rebuild, or indexes may be rebuilt when the database is restarted. This approach requires enough sort space for all key parts for the largest index. The total time subsequently taken for index construction is longer than that required in REBUILD mode. Therefore, when performing multiple load operations with deferred indexing, it is advisable (from a performance viewpoint) to let the last load operation in the sequence perform an index rebuild rather than allow indexes to be rebuilt at first nonload access.

Deferred indexing is only supported for tables with nonunique indexes, so that duplicate keys inserted during the load phase are not persistent after the load operation.

NOTE



Deferred indexing is not supported for tables that have DATALINK columns.


INSERT
One of four modes under which the load utility can execute. Adds the loaded data to the table without changing the existing table data.

insert-column
Specifies the table column into which the data is to be inserted.

The load utility cannot parse columns whose names contain one or more spaces. For example,


db2 load from delfile1 of del modified by noeofchar noheader
method P (1, 2, 3, 4, 5, 6, 7, 8, 9)
insert into table1 (BLOB1, S2, I3, Int 4, I5, I6, DT7, I8, TM9)

INTO table-name
Specifies the database table into which the data is to be loaded. This table cannot be a system table or a declared temporary table. An alias or the fully qualified or unqualified table name can be specified. A qualified table name is in the form schema.tablename. If an unqualified table name is specified, the table will be qualified with the CURRENT SCHEMA.


LOCK WITH FORCE

The utility acquired various locks, including table locks, in the process of loading. Rather than wait, and possibly timeout, when acquiring a lock, this option allows the load to force off other applications that hold conflicting locks. Forced applications will roll back and release the locks the load utility needs. The load utility can then proceed. This option requires the same authority as the FORCE APPLICATIONS command (SYSADM or SYSCTRL).


NOTE



The new LOCK WTH FORCE option enables a load utility to force applications holding locks that are preventing a load from completing.



ALLOW NO ACCESS loads may force applications holding conflicting locks at the start of the load operation. At the start of the load the utility may force applications that are attempting to either query or modify the table.


ALLOW READ ACCESS loads may force applications holding conflicting locks at the start or end of the load operation. At the start of the load the load utility may force applications that are attempting to modify the table. At the end of the load the load utility may force applications that are attempting to either query or modify the table.


MESSAGES message-file

Specifies the destination for warning and error messages that occur during the load operation. If a message file is not specified, messages are written to standard output. If the complete path to the file is not specified, the load utility uses the current directory and the default drive as the destination. If the name of a file that already exists is specified, the utility appends the information.


The message file is usually populated with messages at the end of the load operation and, as such, is not suitable for monitoring the progress of the operation.


METHOD

L

Specifies the start and end column numbers from which to load data. A column number is a byte offset from the beginning of a row of data. It is numbered starting from 1.


NOTE


This method can only be used with ASC files, and is the only valid method for that file type.


N

Specifies the names of the columns in the data file to be loaded. The case of these column names must match the case of the corresponding names in the system catalogs. Each table column that is not nullable should have a corresponding entry in the METHOD N list. For example, given data fields F1, F2, F3, F4, F5, and F6, and table columns C1 INT, C2 INT NOT NULL, C3 INT NOT NULL, and C4 INT, method N (F2, F1, F4, F3) is a valid request, while method N (F2, F1) is not valid.


NOTE


This method can only be used with file types IXF or CURSOR.


P

Specifies the field numbers (numbered from 1) of the input data fields to be loaded. Each table column that is not nullable should have a corresponding entry in the METHOD P list. For example, given data fields F1, F2, F3, F4, F5, and F6, and table columns C1 INT, C2 INT NOT NULL, C3 INT NOT NULL, and C4 INT, method P (2, 1, 4, 3) is a valid request, while method P (2, 1) is not valid.


NOTE


This method can only be used with file types IXF, DEL, or CURSOR, and is the only valid method for the DEL file type.



MODIFIED BY filetype-mod

Specifies additional options.


NONRECOVERABLE

Specifies that the load transaction is to be marked as nonrecoverable and that it will not be possible to recover it by subsequent roll-forward action. The roll-forward utility will skip the transaction and will mark the table into which data was being loaded as "invalid." The utility will also ignore any subsequent transactions against that table. After the roll-forward operation is completed, such a table can only be dropped or restored from a backup (full or tablespace) taken after a commit point following the completion of the nonrecoverable load operation.


With this option, tablespaces are not put in backup pending state following the load operation, and a copy of the loaded data does not have to be made during the load operation.


This option should not be used when DATALINK columns with the FILE LINK CONTROL attribute are present in, or being added to, the table.


NULL INDICATORS null-indicate-list

This option can only be used when the METHOD L parameter is specified; that is, the input file is an ASC file. The null indicator list is a comma-separated list of positive integers specifying the column number of each null indicator field. The column number is the byte offset of the null indicator field from the beginning of a row of data. There must be one entry in the null indicator list for each data field defined in the METHOD L parameter. A column number of zero indicates that the corresponding data field always contains data.


A value of Y in the NULL indicator column specifies that the column data is NULL. Any character other than Y in the NULL indicator column specifies that the column data is not NULL, and that column data specified by the METHOD L option will be loaded.


The NULL indicator character can be changed using the MODIFIED BY option.


OF filetype

Specifies the format of the data:


  • ASC (nondelimited ASCII format)

  • DEL (delimited ASCII format)

  • IXF (integrated exchange format, PC version), exported from the same or from another DB2 table

  • CURSOR (a cursor declared against a SELECT or VALUES statement)


REPLACE

One of four modes under which the load utility can execute. Deletes all existing data from the table and inserts the loaded data. The table definition and index definitions are not changed. If this option is used when moving data between hierarchies, only the data for an entire hierarchy, not individual subtables, can be replaced.


This option is not supported for tables with DATALINK columns.


RESTART

One of four modes under which the load utility can execute. Restarts a previously interrupted load operation. The load operation will automatically continue from the last consistency point in the load, build, or delete phase.


RESTARTCOUNT

Reserved.


ROWCOUNT n

Specifies the number of n physical records in the file to be loaded. Allows a user to load only the first n rows in a file.


SAVECOUNT n

Specifies that the load utility is to establish consistency points after every n rows. This value is converted to a page count, and rounded up to intervals of the extent size. Since a message is issued at each consistency point, this option should be selected if the load operation will be monitored using LOAD QUERY. If the value of n is not sufficiently high, the synchronization of the activities performed at each consistency point will impact performance.


The default value is zero, meaning that no consistency points will be established, unless necessary.


This option is not supported in conjunction with the CURSOR filetype.


SORT BUFFER buffer-size

This option specifies a value that overrides the SORTHEAP database configuration parameter during a load operation. It is relevant only when loading tables with indexes and only when the INDEXING MODE parameter is not specified as DEFERRED. The value that is specified cannot exceed the value of SORTHEAP. This parameter is useful for throttling the sort memory that is used when loading tables with many indexes without changing the value of SORTHEAP, which would also affect general query processing.


STATISTICS NO

Specifies that no statistics are to be collected, and that the statistics in the catalogs are not to be altered. This is the default.


STATISTICS YES

Specifies that statistics are to be collected for the table and for any existing indexes. This option is supported only if the load operation is in REPLACE mode.


WITH DISTRIBUTION
Specifies that distribution statistics are to be collected.

AND INDEXES ALL
Specifies that both table and index statistics are to be collected.

FOR INDEXES ALL
Specifies that only index statistics are to be collected.

DETAILED
Specifies that extended index statistics are to be collected.


TEMPFILES PATH temp-pathname

Specifies the name of the path to be used when creating temporary files during a load operation, and should be fully qualified according to the server database partition.


Temporary files take up file system space. Sometimes this space requirement is quite substantial. Following is an estimate of how much file system space should be allocated for all temporary files:


  • 4 bytes for each duplicate or rejected row containing DATALINK values

  • 136 bytes for each message that the load utility generates

  • 15 KB overhead if the data file contains long field data or LOBs. This quantity can grow significantly if the INSERT option is specified, and there is a large amount of long field or LOB data already in the table.


TERMINATE

One of four modes under which the load utility can execute. Terminates a previously interrupted load operation to the point in time at which it started, even if consistency points were passed. The states of any tablespaces involved in the operation return to normal, and all table objects are made consistent (index objects may be marked as invalid, in which case index rebuild will automatically take place at next access). If the load operation being terminated is a load REPLACE, the table will be truncated to an empty table after the load TERMINATE operation. If the load operation being terminated is a load INSERT, the table will retain all of its original records after the load TERMINATE operation.


The load terminate option will not remove a backup pending state from tablespaces.


NOTE



This option is not supported for tables with DATALINK columns.



USING directory

This option has been reserved for future use.


WARNINGCOUNT n

Stops the load operation after n warnings. Set this parameter if no warnings are expected, but verification that the correct file and table are being used is desired. If n is zero, or this option is not specified, the load operation will continue regardless of the number of warnings issued. If the load operation is stopped because the threshold of warnings was encountered, another load operation can be started in RESTART mode. The load operation will automatically continue from the last consistency point. Alternatively, another load operation can be initiated in REPLACE mode, starting at the beginning of the input file.


WITHOUT PROMPTING

Specifies that the list of data files contains all the files that are to be loaded, and that the devices or directories listed are sufficient for the entire load operation. If a continuation input file is not found, or the copy targets are filled before the load operation finishes, the load operation will fail, and the table will remain in load pending state.


If this option is not specified, and the tape device encounters as end of tape for the copy image, or the last item listed is a tape device, the user is prompted for a new tape on that device.





    [ Team LiB ]



    No comments: