Parallel OBS Data Export

Overview

GaussDB(DWS) lets you export data in parallel through OBS foreign tables, which specify the export mode and the format of the exported data. Multiple DNs export data from GaussDB(DWS) to the OBS server in parallel, which improves overall export performance.
  • The CN only plans data export tasks and delivers them to DNs for execution, which frees the CN to process external requests.
  • Every DN is involved in data export, and the computing capabilities and bandwidths of all the DNs are fully leveraged to export data.
  • You can export data concurrently using multiple OBS services, but each export task must specify a different bucket and object path, and neither can be empty.
  • The OBS server connects to GaussDB(DWS) cluster nodes. The export rate is affected by the network bandwidth.
  • The TEXT and CSV data file formats are supported. The size of data in a single row must be less than 1 GB.
  • The ORC data format is supported only in version 8.1.0 or later.
  • To ensure that data is imported and exported correctly, use the same compatibility mode when importing data from OBS and when exporting data to OBS.

    For example, data exported in MySQL compatibility mode can be imported only in MySQL compatibility mode, and vice versa.
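
One way to confirm that the source and target databases use the same compatibility mode before exporting is to query the system catalog. The following is a minimal sketch; it assumes that the PG_DATABASE catalog exposes a datcompatibility column in your cluster version, which should be verified against the reference for that version.

```sql
-- Sketch only: assumes PG_DATABASE has a datcompatibility column.
-- Run in both the exporting and the importing database and compare the results.
SELECT datname, datcompatibility
FROM pg_database
WHERE datname = current_database();
```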

Related Concepts

Principles

The following describes how data is exported from a cluster to OBS, using a hash-distributed table or a replication table as the source.

Naming Rules of Exported Files

Rules for naming the files exported from GaussDB(DWS) to OBS are as follows:

Data Export Process

Figure 2 Concurrent data export
Table 1 Process description

Procedure | Description | Subtask
Plan data export. | Create an OBS bucket and a folder in the bucket as the directory for storing exported data files. For details, see Planning Data Export. | -
Create an OBS foreign table. | Create a foreign table that specifies information about the data files to be exported to OBS, such as their destination location, format, encoding, and data delimiter. For details, see Creating an OBS Foreign Table. | -
Export data. | After the foreign table is created, run an INSERT statement to export data to the data files. For details, see Exporting Data. | -
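
The last two steps in the process can be sketched in SQL as follows. This is an illustrative outline, not a definitive procedure: the table names (product_info, product_info_output_ext), the column list, the bucket path obs://mybucket/output_data/, and the key placeholders are all hypothetical, and the server name gsmpp_server and the option names are assumed to match your cluster version. See Creating an OBS Foreign Table for the authoritative syntax.

```sql
-- Step 2 (sketch): create a write-only OBS foreign table.
-- All names, the bucket path, and the key values below are placeholders.
CREATE FOREIGN TABLE product_info_output_ext
(
    product_id    integer,
    product_name  varchar(200),
    product_price integer
)
SERVER gsmpp_server                      -- assumed server name
OPTIONS (
    location 'obs://mybucket/output_data/',   -- destination folder in the OBS bucket
    format 'text',                             -- TEXT or CSV
    encoding 'utf8',
    delimiter ',',
    access_key 'access_key_value_to_be_replaced',
    secret_access_key 'secret_access_key_value_to_be_replaced'
)
WRITE ONLY;

-- Step 3 (sketch): export by inserting into the foreign table.
-- The CN plans the task and the DNs write the data files to OBS in parallel.
INSERT INTO product_info_output_ext
SELECT product_id, product_name, product_price
FROM product_info;
```

The column definitions of the foreign table should match the columns selected from the source table, and the destination folder must already exist from the planning step.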