Loads data from files, tapes, or named pipes into a DB2 table. The load utility does not support loading data at the hierarchy level.
Scope
This command only affects the node on which it is executed.
In a multi-node environment, this command can be used only with ASC or DEL files. IXF files can be loaded only if the table exists on a single node nodegroup.
Authorization
One of the following:
Note: | Since all load processes (and all DB2 server processes, in general), are owned by the instance owner, and all of these processes use the identification of the instance owner to access needed files, the instance owner must have read access to input data files. These input data files must be readable by the instance owner, regardless of who invokes the command. |
Required Connection
Database. If implicit connect is enabled, a connection to the default database is established.
Instance. An explicit attachment is not required. If a connection to the database has been established, an implicit attachment to the local instance is attempted.
Command Syntax
.-,-------------. V | >>-LOAD FROM------+-filename-+--+--OF--filetype-----------------> +-pipename-+ '-device---' >-----+-----------------------------+---------------------------> | .-,-----------. | | V | | '-LOBS FROM-----lob-path---+--' >-----+-----------------------------------+---------------------> | .-----------------. | | V | | '-MODIFIED BY-----filetype-mod---+--' >-----+-----------------------------------------------------------------------------------------------+> | .-,---------------------------. | | V | | '-METHOD--+-L--(-----column-start--column-end---+---)--+-----------------------------------+-+--' | | .-,----. | | | | V | | | | '-NULL INDICATORS--(-----n---+---)--' | | .-,--------------. | | V | | +-N--(-----column-name---+---)-----------------------------------------------------+ | .-,------------------. | | V | | '-P--(-----column-position---+---)-------------------------------------------------' >-----+---------------+---+----------------------+--------------> '-SAVECOUNT--n--' '-RESTARTCOUNT--+-B-+--' +-D-+ '-n-' >-----+--------------+---+------------------+-------------------> '-ROWCOUNT--n--' '-WARNINGCOUNT--n--' >-----+-------------------------+-------------------------------> '-MESSAGES--message-file--' >-----+---------------------------+---+-INSERT----+-------------> '-REMOTE FILE--remote-file--' +-REPLACE---+ +-RESTART---+ '-TERMINATE-' >----INTO--table-name----+------------------------------+-------> | .-,----------------. | | V | | '-(-----insert-column---+---)--' >-----+--------------------+---+----------------------------+---> '-| datalink-spec |--' '-FOR EXCEPTION--table-name--' >-----+-------------------------------------------------------------------------------------------+> | .-YES-. | '-STATISTICS--+-+-----+--+-+---------------------------------------------------------+-+-+--' | | '-WITH DISTRIBUTION--+---------------------------------+--' | | | | '-AND--+----------+--INDEXES ALL--' | | | | '-DETAILED-' | | | '-+---------------------------------------+-------------------' | | '--+-AND-+---+----------+--INDEXES ALL--' | | '-FOR-' '-DETAILED-' | '-NO-----------------------------------------------------------------------' >-----+------------------------------------------------------------------+> | .-NO-----------------------------------------------------. | +-COPY--+-YES--+-USE ADSM--+---------------------------+-------+-+-+ | | '-OPEN--num-sess--SESSIONS--' | | | | .-,-------------------. | | | | V | | | | +-TO-----device/directory---+-------------------+ | | '-LOAD--lib-name--+---------------------------+-' | | '-OPEN--num-sess--SESSIONS--' | '-NONRECOVERABLE---------------------------------------------------' >-----+--------------------------+---+---------------+----------> | .-,------------. | '-HOLD QUIESCE--' | V | | '-USING-----directory---+--' >-----+--------------------+---+---------------------------+----> '-WITHOUT PROMPTING--' '-DATA BUFFER--buffer-size--' >-----+---------------------------+---+---------------------+---> '-SORT BUFFER--buffer-size--' '-CPU_PARALLELISM--n--' >-----+----------------------+--------------------------------->< '-DISK_PARALLELISM--n--' datalink-spec .-,--------------------------------------------------------------------------------------------------. V | |------(--+-----------------+--+----------------------------------+---+--------------------------+---)---+-> '-DL_LINKTYPE URL-' +-DL_URL_REPLACE_PREFIX--"prefix"--+ '-DL_URL_SUFFIX--"suffix"--' '-DL_URL_DEFAULT_PREFIX--"prefix"--' >---------------------------------------------------------------| |
Command Parameters
Notes:
For more information about file formats, see the Appendix C. IMPORT/EXPORT/LOAD Utility File Formats.
Note: | This method can only be used with ASC files, and is the only valid option for that file type. |
Note: | This method can only be used with IXF files. |
Note: | This method can only be used with IXF or DEL files, and is the only valid option for the DEL file type. |
A value of Y in the NULL indicator column specifies that the column data is NULL. Any character other than Y in the NULL indicator column specifies that the column data is not NULL, and that column data specified by the METHOD L option will be loaded.
The NULL indicator character can be changed using the MODIFIED BY option (see ***).
The default value is 0, meaning that no consistency points will be established, unless necessary.
This option can be specified with any of the INSERT, REPLACE, or RESTART modes. B or D must not be specified for the INSERT or the REPLACE mode.
Note: | It may be necessary to quiesce the table spaces prior to invoking a load
restart. To do this, issue:
db2 quiesce tablespaces for table tablename exclusive |
If the load is stopped because the threshold of warnings was encountered, another load can be started in RESTART mode by specifying the RESTARTCOUNT option. Alternatively, another load can be initiated in REPLACE mode, starting at the beginning of the input file.
If the complete path to the file is not specified, LOAD uses the current directory and the default drive as the destination.
If the name of a file that already exists is specified, LOAD appends the information.
These temporary files include a .log (load crash recovery log) file, a .rid file (containing a list of row identifiers for records that will be deleted during the delete phase), a .dlr file (containing a list of row identifiers for records that will be deleted due to DATALINK violations), and a binary .msg file (containing load messages). None of these files are viewed directly by the user. The size of each depends on a number of variables:
For more information about remote files, see ***.
It is important to keep track of the last commit point. This information is stored in the message file and is passed to LOAD. Use LOAD QUERY to get this information if the database connection was lost during the load.
Notes:
The load utility cannot parse columns whose names contain one or more spaces. For example,
db2 load from delfile1 of del modified by noeofchar noheader method P (1, 2, 3, 4, 5, 6, 7, 8, 9) insert into table1 (BLOB1, S2, I3, Int 4, I5, I6, DT7, I8, TM9)
will fail because of the Int 4 column. The solution is to enclose such column names with double quotation marks:
db2 load from delfile1 of del modified by noeofchar noheader method P (1, 2, 3, 4, 5, 6, 7, 8, 9) insert into table1 (BLOB1, S2, I3, "Int 4", I5, I6, DT7, I8, TM9)
Information that is written to the exception table is not written to the dump file (for a description of the dumpfile modifier, see Table 8). The exception table contains rows that are in violation of a unique index or a primary key index. In a partitioned database environment, an exception table must be defined for those nodes on which the loading table is defined. The dump file, on the other hand, contains rows that cannot be loaded because they are invalid or have syntax errors.
With this option, table spaces are not put in backup pending state following the load operation, and a copy of the loaded data does not have to be made during the load.
Note: | The size of this directory should be at least 130% of the index size. |
db2 quiesce tablespaces for table tablename reset
Note: | Ensure that no phantom quiesces are created (see QUIESCE TABLESPACES FOR TABLE). |
If this option is not specified, and the tape device encounters an end of tape for the copy image, or the last item listed is a tape device, the user is prompted for a new tape on that device.
This memory is allocated directly from the utility heap, whose size can be modified through the util_heap_sz database configuration parameter.
If a value is not specified, an intelligent default is calculated by the utility at run time. The default is based on a percentage of the free space available in the utility heap at the instantiation time of the loader, as well as some characteristics of the table.
Note: | Sort buffer size has a very large impact on sort performance. Therefore, for very large tables (for example, tables in excess of 100M), this buffer should be set as large as possible. |
If a value is not specified, the utility uses the larger of:
If a value greater than zero, but less than the required minimum is specified, the minimum value for that load is used.
Notes:
Examples of prefix are:
"http://server" "file://server" "file:" "http://server:80"
If no prefix is found in a column's data, and a default prefix is specified with DL_URL_DEFAULT_PREFIX, the default prefix is prefixed to the column value (if not NULL).
For example, if DL_URL_DEFAULT_PREFIX specifies the default prefix "http://toronto":
For example, if DL_URL_REPLACE_PREFIX specifies the prefix "http://toronto":
Examples
Example 1
TABLE1 has 5 columns:
ASCFILE1 has 6 elements:
Data Records:
1...5....10...15...20...25...30...35...40 Test data 1 XXN 123abcdN Test data 2 and 3 QQY wxyzN Test data 4,5 and 6 WWN6789 Y
The following command loads the table from the file:
db2 load from ascfile1 of asc modified by striptblanks reclen=40 method L (1 20, 21 22, 24 27, 28 31) null indicators (0,0,23,32) insert into table1 (col1, col5, col2, col3)
Notes:
Example 2 (Loading LOBs from Files)
TABLE1 has 3 columns:
ASCFILE1 has 3 elements:
The following files reside in either /u/user1 or /u/user1/bin:
Data Records in ASCFILE1:
1...5....10...15...20...25...30. REC1 ASCFILE2 ASCFILE3 REC2 ASCFILE4 ASCFILE5 REC3 ASCFILE6 ASCFILE7
The following command loads the table from the file:
db2 load from ascfile1 of asc lobs from /u/user1, /u/user1/bin modified by lobsinfile reclen=22 method L (1 4, 6 13, 15 22) insert into table1
Notes:
Example 3 (Using Dump Files)
Table FRIENDS is defined as:
table friends "( c1 INT NOT NULL, c2 INT, c3 CHAR(8) )"
If an attempt is made to load the following data records into this table,
23, 24, bobby , 45, john 4,, mary
the second row is rejected because the first INT is NULL, and the column definition specifies NOT NULL. Columns which contain initial characters that are not consistent with the DEL format will generate an error, and the record will be rejected. Such records can be written to a dump file (see Table 8).
DEL data appearing in a column outside of character delimiters is ignored, but does generate a warning. For example:
22,34,"bob" 24,55,"sam" sdf
The utility will load "sam" in the third column of the table, and the characters "sdf" will be flagged in a warning. The record is not rejected. Another example:
22 3, 34,"bob"
The utility will load 22,34,"bob", and generate a warning that some data in column one following the 22 was ignored. The record is not rejected.
Example 4 (Loading DATALINK Data)
The following command loads the table MOVIETABLE from the input file delfile1, which has data in the DEL format:
db2 load from delfile1 of del modified by dldel| insert into movietable (actorname, description, url_making_of, url_movie) (dl_url_default_prefix "http://narang"), (dl_url_replace_prefix "http://bomdel" dl_url_suffix ".mpeg") for exception excptab
Notes:
actorname VARCHAR(n) description VARCHAR(m) url_making_of DATALINK (with LINKTYPE URL) url_movie DATALINK (with LINKTYPE URL)
Usage Notes
Data is loaded in the sequence that appears in the input file. If a particular sequence is desired, the data should be sorted before a load is attempted.
The load utility builds indexes based on existing definitions. The exception tables are used to handle duplicates on unique keys. The utility does not enforce referential integrity, perform constraints checking, or update summary tables that are dependent on the tables being loaded. Tables being loaded that include referential or check constraints are placed in check pending state. Summary tables dependent on tables being loaded are also placed in check pending state. Issue the SET CONSTRAINTS statement to take the tables out of check pending state. Load operations cannot be carried out on replicated summary tables.
If clustering is required, the data should be sorted on the clustering index prior to loading.
If multiple concurrent load operations are to be run, unique USING and REMOTE FILE values must be specified, otherwise temporary files created by the different load operations may conflict.
DB2 File Manager Considerations
For each DATALINK column, there can be one column specification within parentheses. Each column specification consists of one or more of DL_LINKTYPE, prefix and a DL_URL_SUFFIX specification. The prefix information can be either DL_URL_REPLACE_PREFIX, or the DL_URL_DEFAULT_PREFIX specification.
There can be as many DATALINK column specifications as the number of DATALINK columns defined in the table. The order of specifications follows the order of DATALINK columns as found within the insert-column list (if specified by INSERT INTO (insert-column, ...)), or within the table definition (if insert-column is not specified).
For example, if a table has columns C1, C2, C3, C4, and C5, and among them only columns C2 and C5 are of type DATALINK, and the insert-column list is (C1, C5, C3, C2), there should be two DATALINK column specifications. The first column specification will be for C5, and the second column specification will be for C2. If an insert-column list is not specified, the first column specification will be for C2, and the second column specification will be for C5.
If there are multiple DATALINK columns, and some columns do not need any particular specification, the column specification should have at least the parentheses to unambiguously identify the order of specifications. If there are no specifications for any of the columns, the entire list of empty parentheses can be dropped. Thus, in cases where the defaults are satisfactory, there need not be any DATALINK specification.
It is possible that while running the load utility, the connection between DB2 and the file server may fail. This would cause the load operation to fail. The user response to this problem is as follows:
Representation of DATALINK Information in an Input File
The LINKTYPE (currently only URL is supported) is not specified as part of DATALINK information. The LINKTYPE is specified in the LOAD or the IMPORT command, and for input files of type PC/IXF, in the appropriate column descriptor records as described in the Appendix C. IMPORT/EXPORT/LOAD Utility File Formats.
The syntax of DATALINK information for a URL LINKTYPE is as follows:
>>-+---------+----+--------------------------+----------------->< '-urlname-' '-dl_delimiter--"comment"--'
Note that both urlname and comment are optional. If neither is provided, the null value is assigned.
Notes:
If no comment is specified, the comment defaults to a string of length zero.
Following are valid examples of data of type DATALINK (assume that the LOAD or IMPORT specification for the column is DL_URL_DEFAULT_PREFIX "http://qso"):
This is stored with the following parts:
This is stored with the following parts:
This is stored with the following parts:
This is stored with the following parts:
This is stored with the following parts:
Note: | The load utility does not issue a warning if an attempt is made to use unsupported file types with the MODIFIED BY option. If this is attempted, the load fails, and an error code is returned. |
Table 8. Valid File Type Modifications (LOAD)
Modification | Description | ||
---|---|---|---|
All File Formats | |||
anyorder | This modifier is used in conjunction with the cpu_parallelism parameter. Specifies that the preservation of source data order is not required, yielding significant additional performance benefit on SMP systems. If the value of cpu_parallelism is 1, this option is ignored. This option is not supported if SAVECOUNT > 0, since crash recovery after a consistency point requires that data be loaded in sequence. | ||
dldelx | x is a single character DATALINK delimiter. The default
value is a semicolon (;). The specified character is used in place
of a semicolon as the inter-field separator for a DATALINK value. It is
needed because a DATALINK value may have more than one sub-value.
abc
| ||
fastparse | Reduced data checking is done on user-supplied column values, and
performance is enhanced. Tables loaded under this option are guaranteed
to be architecturally correct, and the utility is guaranteed to perform
sufficient data checking to prevent a segmentation violation or trap.
Data that is in correct form will be loaded correctly.
This option does not affect referential integrity checking or constraints checking; it merely reduces syntax checking of the supplied data. For example, if the value 123qwr4 were encountered as a field entry for an integer column in an ASC file, the load utility would ordinarily flag a syntax error, since the value does not represent a valid number. With fastparse, a syntax error is not detected, and an arbitrary number is loaded into the integer field. Care must be taken to use this modifier with clean data only. Performance improvements using this option with ASCII data can be quite substantial, but fastparse does not significantly enhance performance with PC/IXF data, since IXF is a binary format, and fastparse affects parsing and conversion from ASCII to internal forms. | ||
indexfreespace=x | x is an integer between 0 and 99 inclusive. The value
is interpreted as the percentage of each index page that is to be left as free
space when loading the index. The first entry in a page is added
without restriction; subsequent entries are added if the percent free space
threshold can be maintained. The default value is the one used at
CREATE INDEX time.
This value takes precedence over the PCTFREE value specified in the CREATE INDEX statement, and affects index leaf pages only. | ||
lobsinfile | lob-path specifies the path to the files containing LOB values. The ASC, DEL, or IXF load input files contain the names of the files having LOB data in the LOB column. | ||
noheader | Skips the header verification code.
db2split - Data Declustering Tool writes a header to each file contributing data to a table in a multi-node nodegroup. The header includes the node number, the partitioning map, and the partitioning key specification. The load utility requires this information to verify that the data is being loaded at the right node. When loading files into a table that exists on a single-node nodegroup, the headers do not exist, and this option causes the load utility to skip the header verification code. | ||
norowwarnings | Suppresses all warnings about rejected rows. | ||
pagefreespace=x | x is an integer between 0 and 100 inclusive. The value
is interpreted as the percentage of each data page that is to be left as free
space.
If the specified value is invalid because of the minimum row size, (for example, a row that is at least 3 000 bytes long, and an x value of 50), the row will be placed on a new page. If a value of 100 is specified, each row will reside on a new page.
| ||
totalfreespace=x | x is an integer between 0 and 100 inclusive. The value is interpreted as the percentage of the total pages in the table that is to be appended to the end of the table as free space. For example, if x is 20, and the table has 100 data pages, 20 additional empty pages will be appended. The total number of data pages for the table will be 120. | ||
usedefaults | If a source column for a target table column has been specified, but it
contains no data for one or more row instances, default values are
loaded. Examples of missing data are:
| ||
ASCII File Formats (ASC/DEL) | |||
codepage=x | x is an ASCII character string. The value is
interpreted as the code page of the data in the input data set.
Converts character data (and numeric data specified in characters) from this
code page to the database code page during the load operation.
The following rules apply:
| ||
dumpfile = x | x is the fully qualified (according to the server node) name
of an exception file to which rejected rows are written. A maximum of
32K of data is written per record. Following is an example that shows
how to specify a dump file:
db2 load from data of del modified by dumpfile = /u/user/filename insert into table_name Notes:
| ||
implieddecimal | The location of an implied decimal point is determined by the column definition; it is no longer assumed to be at the end of the value. For example, the value 12345 is loaded into a DECIMAL(8,2) column as 123.45, not 12345.00. | ||
noeofchar | The optional end-of-file character x'1A' is not recognized as the end of file. Processing continues as if it were a normal character. | ||
ASC (Non-delimited ASCII) File Format | |||
binarynumerics | Numeric (but not DECIMAL) data must be in binary form, not the character
representation. This avoids costly conversions.
This option is supported only with positional ASC, using fixed length records specified by the reclen option. The noeofchar option is assumed. The following rules apply:
| ||
nullindchar=x | x is a single character. Changes the character denoting
a null value to x. The default value of x is
Y.b
This modifier is case sensitive for EBCDIC data files, except when the character is an English letter. For example, if the null indicator character is specified to be the letter N, then n is also recognized as a null indicator. | ||
packeddecimal | Loads packed-decimal data directly, since the binarynumerics
modifier does not include the DECIMAL field type.
This option is supported only with positional ASC, using fixed length records specified by the reclen option. The noeofchar option is assumed. Supported values for the sign nibble are: + = 0xC 0xA 0xE 0xF - = 0xD 0xB
| ||
reclen=x | x is an integer with a maximum value of 32 767. x characters are read for each row, and a new-line character is not used to indicate the end of the row. | ||
striptblanks | Truncates any trailing blank spaces when loading data into a
variable-length field. If this option is not specified, blank spaces
are kept.
This option cannot be specified together with striptnulls. These are mutually exclusive options.
| ||
striptnulls | Truncates any trailing NULLs (0x00 characters) when loading data into a
variable-length field. If this option is not specified, NULLs are
kept.
This option cannot be specified together with striptblanks. These are mutually exclusive options.
| ||
DEL (Delimited ASCII) File Format | |||
chardelx | x is a single character string delimiter. The default
value is a double quotation mark ("). The specified character is
used in place of double quotation marks to enclose a character
string.ab
The single quotation mark (') can also be specified as a character string delimiter as follows: modified by chardel'' | ||
coldelx | x is a single character column delimiter. The default value is a comma (,). The specified character is used in place of a comma to signal the end of a column.ab | ||
datesiso | Date format. Causes all date data values to be loaded in ISO format. | ||
decplusblank | Plus sign character. Causes positive decimal values to be prefixed with a blank space instead of a plus sign (+). The default action is to prefix positive decimal values with a plus sign. | ||
decptx | x is a single character substitute for the period as a decimal point character. The default value is a period (.). The specified character is used in place of a period as a decimal point character.ab | ||
nodoubledel | Suppresses recognition of double character delimiters. | ||
IXF File Format | |||
forcein | Directs the utility to accept data despite code page mismatches, and to
suppress translation between code pages.
Fixed length target fields are checked to verify that they are large enough for the data. If nochecklengths is specified, no checking is done, and an attempt is made to load each row. | ||
nochecklengths | Used with the forcein modifier. If nochecklengths is specified, fixed length target fields are not checked to verify that they are large enough for the data, and an attempt is made to load each row. | ||
|
Remote file is a base file name to which DB2 appends different extensions to create files used by other commands (for example, .msg for LOAD QUERY).
The remote file resides on the server machine and is accessed by the DB2 instance exclusively. Therefore, it is imperative that any file name qualification given to this parameter reflects the directory structure of the server, not the client, and that the DB2 instance owner has read and write permission on this file. In addition, the user must ensure that two loads are not issued that have the same fully-qualified remote file name.
There are several ways that the remote file name can be selected and qualified when the user has just given a partially qualified name, or no name at all:
Note: | In an MPP system, the remote file must reside on a local disk, not on an NFS mount. If the file is on an NFS mount, there will be a significant performance decrement during the load operation. |
See Also