COPY copies data between tables and files.
COPY FROM copies data from a file to a table. COPY TO copies data from a table to a file.
1 2 3 4 5 6 7 8 9 10 | COPY table_name [ ( column_name [, ...] ) ] FROM { 'filename' | STDIN } [ [ USING ] DELIMITERS 'delimiters' ] [ WITHOUT ESCAPING ] [ LOG ERRORS ] [ LOG ERRORS data ] [ REJECT LIMIT 'limit' ] [ [ WITH ] ( option [, ...] ) ] | copy_option | FIXED FORMATTER ( { column_name( offset, length ) } [, ...] ) [ ( option [, ...] ) | copy_option [ ...] ] ]; |
In the SQL syntax, FIXED, FORMATTER ( { column_name( offset, length ) } [, ...] ), and [ ( option [, ...] ) | copy_option [ ...] ] can be in any sequence.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | COPY table_name [ ( column_name [, ...] ) ] TO { 'filename' | STDOUT } [ [ USING ] DELIMITERS 'delimiters' ] [ WITHOUT ESCAPING ] [ [ WITH ] ( option [, ...] ) ] | copy_option | FIXED FORMATTER ( { column_name( offset, length ) } [, ...] ) [ ( option [, ...] ) | copy_option [ ...] ] ]; COPY query TO { 'filename' | STDOUT } [ WITHOUT ESCAPING ] [ [ WITH ] ( option [, ...] ) ] | copy_option | FIXED FORMATTER ( { column_name( offset, length ) } [, ...] ) [ ( option [, ...] ) | copy_option [ ...] ] ]; |
(query) is incompatible with [USING] DELIMITER. If the data of COPY TO comes from a query result, COPY TO cannot specify [USING] DELIMITERS.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | FORMAT 'format_name' | OIDS [ boolean ] | DELIMITER 'delimiter_character' | NULL 'null_string' | HEADER [ boolean ] | FILEHEADER 'header_file_string' | FREEZE [ boolean ] | QUOTE 'quote_character' | ESCAPE 'escape_character' | EOL 'newline_character' | NOESCAPING [ boolean ] | FORCE_QUOTE { ( column_name [, ...] ) | * } | FORCE_NOT_NULL ( column_name [, ...] ) | ENCODING 'encoding_name' | IGNORE_EXTRA_DATA [ boolean ] | FILL_MISSING_FIELDS [ boolean ] | COMPATIBLE_ILLEGAL_CHARS [ boolean ] | DATE_FORMAT 'date_format_string' | TIME_FORMAT 'time_format_string' | TIMESTAMP_FORMAT 'timestamp_format_string' | SMALLDATETIME_FORMAT 'smalldatetime_format_string' |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | OIDS | NULL 'null_string' | HEADER | FILEHEADER 'header_file_string' | FREEZE | FORCE_NOT_NULL column_name [, ...] | FORCE_QUOTE { column_name [, ...] | * } | BINARY | CSV | QUOTE [ AS ] 'quote_character' | ESCAPE [ AS ] 'escape_character' | EOL 'newline_character' | ENCODING 'encoding_name' | IGNORE_EXTRA_DATA | FILL_MISSING_FIELDS | COMPATIBLE_ILLEGAL_CHARS | DATE_FORMAT 'date_format_string' | TIME_FORMAT 'time_format_string' | TIMESTAMP_FORMAT 'timestamp_format_string' | SMALLDATETIME_FORMAT 'smalldatetime_format_string' |
Indicates that the results are to be copied.
Value range: a SELECT or VALUES command in parentheses
Specifies the name (optionally schema-qualified) of an existing table.
Value range: an existing table name
Indicates an optional list of columns to be copied.
Value range: If no column list is specified, all columns of the table will be copied.
Indicates that the input comes from the client application.
Indicates that output goes to the client application.
Fixes column length. When the column length is fixed, DELIMITER, NULL, and CSV cannot be specified. When FIXED is specified, BINARY, CSV, and TEXT cannot be specified by option or copy_option.
The definition of fixed length:
The string that separates columns within each row (line) of the file, and it cannot be larger than 10 bytes.
Value range: The delimiter cannot include any of the following characters: \.abcdefghijklmnopqrstuvwxyz0123456789
Value range: The default value is a tab character in text format and a comma in CSV format.
In TEXT, do not escape a backslash (\) and the characters that follow it.
Value range: text only.
If this parameter is specified, the error tolerance mechanism for data type errors in the COPY FROM statement is enabled. Row errors are recorded in the public.pgxc_copy_error_log table in the database for future reference.
Value range: A value set while data is imported using COPY FROM.
The restrictions of this error tolerance parameter are as follows:
The differences between LOG ERRORS DATA and LOG ERRORS are as follows:
If error content is too complex, it may fail to be written to the error tolerance table by using LOG ERRORS DATA, causing the task failure.
Used with the LOG ERROR parameter to set the upper limit of the tolerated errors in the COPY FROM statement. If the number of errors exceeds the limit, later errors will be reported based on the original mechanism.
Value range: a positive integer (1 to INTMAX) or unlimited
Default value: If LOG ERRORS is not specified, an error will be reported. If LOG ERRORS is specified, the default value is 0.
Different from the GDS error tolerance mechanism, in the error tolerance mechanism described in the description of LOG ERRORS, the count of REJECT LIMIT is calculated based on the number of data parsing errors on the CN where the COPY FROM statement is run, not based on the number of errors on each DN.
Defining the location of each column in the data file in fixed length mode. Defining the place of each column in the data file based on column (offset, length) format.
Value range:
The total length of all columns must be less than 1 GB.
Replace columns that are not in the file with NULL.
Specifies all types of parameters of a compatible foreign table.
Specifies the format of the source data file in the foreign table.
Value range: CSV, TEXT, FIXED, and BINARY.
Default value: TEXT
An error is raised if OIDs are specified for a table that does not have OIDs, or in the case of copying a query.
Value range: true, on, false, and off
Default value: false
Value range: multi-character delimiter within 10 bytes.
Default value:
Specifies the string that represents a null value.
Value range:
Default value:
Specifies whether a file contains a header with the names of each column in the file. header is available only for CSV and FIXED files.
When data is imported, if header is on, the first row of the data file will be identified as title row and ignored. If header is off, the first row is identified as data.
When data is exported, if header is on, fileheader must be specified. If header is off, the exported file does not include a title row.
Value range: true, on, false, and off
Default value: false
Specifies the quote character for a CSV file.
Default value: double quotation mark ("")
This option is allowed only when using CSV format. This must be a single one-byte character.
Default value: the same as the value of QUOTE
Specifies the newline character style of the imported or exported data file.
Value range: multi-character newline characters within 10 bytes. Common newline characters include \r (0x0D), \n (0x0A), and \r\n (0x0D0A). Special newline characters include $ and #.
Forces quoting to be used for all non-null values in each specified column. This option is allowed only in COPY TO, and only when using the CSV format. NULL values are not quoted.
Value range: an existing column
Does not match the specified columns' values against the null string. This option is allowed only in COPY FROM, and only when using the CSV format.
Value range: an existing column
Specifies that the file is encoded in the encoding_name. If this option is omitted, the current encoding format is used by default.
When the number of data source files exceeds the number of foreign table columns, whether ignoring excessive columns at the end of the row. This parameter is available only during data importing.
Value range: true/on, false/off.
1 | extra data after last expected column |
Default value: false
If the newline character at the end of the row is lost, setting the parameter to true will ignore data in the next row.
Enables or disables fault tolerance on invalid characters during importing. This parameter is available only for COPY FROM.
Value range: true, on, false, and off
Default value: false or off
The rule of error tolerance when you import invalid characters is as follows:
(1) \0 is converted to a space.
(2) Other invalid characters are converted to question marks.
(3) If compatible_illegal_chars is set to true or on, invalid characters are tolerated. If NULL, DELIMITER, QUOTE, and ESCAPE are set to a spaces or question marks. Errors like "illegal chars conversion may confuse COPY escape 0x20" will be displayed to prompt user to modify parameter values that cause confusion, preventing import errors.
Specifies whether to generate an error message when the last column in a row in the source file is lost during data loading.
Value range: true, on, false, and off
Default value: false or off
Imports data of the DATE type. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data importing using the COPY FROM option.
Value range: any valid DATE value. For details, see Date and Time Processing Functions and Operators.
If ORACLE is specified as the compatible database, the DATE format is TIMESTAMP. For details, see timestamp_format below.
Imports data of the TIME type. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data importing using the COPY FROM option.
Value range: Valid TIME. Time zones cannot be used. For details, see Date and Time Processing Functions and Operators.
Imports data of the TIMESTAMP type. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data importing using the COPY FROM option.
Value range: any valid TIMESTAMP value. Time zones are not supported. For details, see Date and Time Processing Functions and Operators.
Imports data of the SMALLDATETIME type. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data importing using the COPY FROM option.
Value range: any valid SMALLDATETIME value. For details, see Date and Time Processing Functions and Operators.
Specifies all types of native parameters of COPY.
When using COPY FROM, any data item that matches this string will be stored as a NULL value, so you should make sure that you use the same string as you used with COPY TO.
Value range:
Default value:
Specifies whether a file contains a header with the names of each column in the file. header is available only for CSV and FIXED files.
When data is imported, if header is on, the first row of the data file will be identified as title row and ignored. If header is off, the first row is identified as data.
When data is exported, if header is on, fileheader must be specified. If header is off, the exported file does not include a title row.
Specifies a file that defines the content in the header for exported data. The file contains data description of each column.
Sets the COPY loaded data row as frozen, like these data have executed VACUUM FREEZE.
This is a performance option of initial data loading. The data will be frozen only when the following three requirements are met:
When COPY is completed, all the other sessions will see the data immediately. This violates the normal rules of MVCC visibility and users should be aware of the potential problems this might cause.
Does not match the specified columns' values against the null string. This option is allowed only in COPY FROM, and only when using the CSV format.
Value range: an existing column
Forces quoting to be used for all non-NULL values in each specified column. This option is allowed only in COPY TO, and only when using the CSV format. NULL values are not quoted.
Value range: an existing column
The binary format option causes all data to be stored/read as binary format rather than as text. In binary mode, you cannot declare DELIMITER, NULL, or CSV. After specifying BINARY, CSV, FIXED and TEXT cannot be specified through option or copy_option.
Enables the CSV mode. After CSV is specified, BINARY, FIXED and TEXT cannot be specified through option or copy_option.
Specifies the quote character for a CSV file.
Default value: double quotation mark ("")
This option is allowed only when using CSV format. This must be a single one-byte character.
The default value is a double quotation mark ("). If it is the same as the value of quote, it will be replaced with \0.
Specifies the newline character style of the imported or exported data file.
Value range: multi-character newline characters within 10 bytes. Common newline characters include \r (0x0D), \n (0x0A), and \r\n (0x0D0A). Special newline characters include $ and #.
Specifies that the file is encoded in the encoding_name.
Value range: a valid encoding format
Default value: current encoding format of the database
When the number of data source files exceeds the number of foreign table columns, excess columns at the end of the row are ignored. This parameter is available only during data importing.
1 | extra data after last expected column |
Specifies error tolerance for invalid characters during importing. Invalid characters are converted before importing. No error message is displayed. The import is not interrupted. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data importing using the COPY FROM option.
If you do not use this parameter, an error occurs when there is an invalid character, and the import stops.
The rule of error tolerance when you import invalid characters is as follows:
(1) \0 is converted to a space.
(2) Other invalid characters are converted to question marks.
(3) Setting compatible_illegal_chars to true/on enables toleration of invalid characters. If NULL, DELIMITER, QUOTE, and ESCAPE are set to spaces or question marks, errors like "illegal chars conversion may confuse COPY escape 0x20" will be displayed to prompt the user to modify parameters that may cause confusion, preventing importing errors.
Specifies whether to generate an error message when the last column in a row in the source file is lost during data loading.
Value range: true, on, false, and off
Default value: false or off
Do not specify this option. Currently, it does not enable error tolerance, but will make the parser ignore the said errors during data parsing on the CN. Such errors will not be recorded in the COPY error table (enabled using LOG ERRORS REJECT LIMIT) but will be reported later by DNs.
Imports data of the DATE type. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data importing using the COPY FROM option.
Value range: any valid DATE value. For details, see Date and Time Processing Functions and Operators.
If ORACLE is specified as the compatible database, the DATE format is TIMESTAMP. For details, see timestamp_format below.
Imports data of the TIME type. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data importing using the COPY FROM option.
Value range: Valid TIME. Time zones cannot be used. For details, see Date and Time Processing Functions and Operators.
Specifies the TIMESTAMP format for data import. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data importing using the COPY FROM option.
Value range: any valid TIMESTAMP value. Time zones are not supported. For details, see Date and Time Processing Functions and Operators.
Imports data of the SMALLDATETIME type. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data importing using the COPY FROM option.
Value range: any valid SMALLDATETIME value. For details, see Date and Time Processing Functions and Operators.
1 | COPY tpcds.ship_mode TO '/home/omm/ds_ship_mode.dat'; |
1 | COPY tpcds.ship_mode TO stdout; |
1 2 3 4 5 6 7 8 9 10 11 | CREATE TABLE tpcds.ship_mode_t1 ( SM_SHIP_MODE_SK INTEGER NOT NULL, SM_SHIP_MODE_ID CHAR(16) NOT NULL, SM_TYPE CHAR(30) , SM_CODE CHAR(10) , SM_CARRIER CHAR(20) , SM_CONTRACT CHAR(20) ) WITH (ORIENTATION = COLUMN,COMPRESSION=MIDDLE) DISTRIBUTE BY HASH(SM_SHIP_MODE_SK ); |
1 | COPY tpcds.ship_mode_t1 FROM stdin; |
Copy data from the /home/omm/ds_ship_mode.dat file to the tpcds.ship_mode_t1 table.
1 | COPY tpcds.ship_mode_t1 FROM '/home/omm/ds_ship_mode.dat'; |
Copy data from the /home/omm/ds_ship_mode.dat file to the tpcds.ship_mode_t1 table, with the import format set to TEXT (format 'text'), the delimiter set to \t' (delimiter E'\t'), excessive columns ignored (ignore_extra_data 'true'), and characters not escaped (noescaping 'true').
1 | COPY tpcds.ship_mode_t1 FROM '/home/omm/ds_ship_mode.dat' WITH(format 'text', delimiter E'\t', ignore_extra_data 'true', noescaping 'true'); |
Copy data from the /home/omm/ds_ship_mode.dat file to the tpcds.ship_mode_t1 table, with the import format set to FIXED, fixed-length format specified (FORMATTER(SM_SHIP_MODE_SK(0, 2), SM_SHIP_MODE_ID(2,16), SM_TYPE(18,30), SM_CODE(50,10), SM_CARRIER(61,20), SM_CONTRACT(82,20))), excessive columns ignored (ignore_extra_data), and headers included (header).
1 | COPY tpcds.ship_mode_t1 FROM '/home/omm/ds_ship_mode.dat' FIXED FORMATTER(SM_SHIP_MODE_SK(0, 2), SM_SHIP_MODE_ID(2,16), SM_TYPE(18,30), SM_CODE(50,10), SM_CARRIER(61,20), SM_CONTRACT(82,20)) header ignore_extra_data; |
Delete the tpcds.ship_mode_t1 table.
1 | DROP TABLE tpcds.ship_mode_t1; |