Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: Lu, Huayi <luhuayi@huawei.com> Co-committed-by: Lu, Huayi <luhuayi@huawei.com>
<a name="EN-US_TOPIC_0000001233883383"></a>
<h1 class="topictitle1">Exporting Data in Parallel Using GDS</h1>
<div id="body8662426"><p id="EN-US_TOPIC_0000001233883383__p13720167353">In high-concurrency scenarios, you can use GDS to export data from a database to a common file system.</p>
<p id="EN-US_TOPIC_0000001233883383__p877219818236">In the current GDS version, data can be exported from a database to a pipe file.</p>
<ul id="EN-US_TOPIC_0000001233883383__ul1773711507206"><li id="EN-US_TOPIC_0000001233883383__li1743110228237">When the local disk space of the GDS user is insufficient:<ul id="EN-US_TOPIC_0000001233883383__ul2425112912316"><li id="EN-US_TOPIC_0000001233883383__li973735052019">Data exported through GDS can be compressed as it passes through the pipe, so that it occupies less disk space.</li><li id="EN-US_TOPIC_0000001233883383__li87371950102010">The exported data can be transferred through the pipe to an HDFS server for storage.</li></ul>
</li><li id="EN-US_TOPIC_0000001233883383__li1260010253232">If you need to cleanse the data being exported:<ul id="EN-US_TOPIC_0000001233883383__ul1530984192310"><li id="EN-US_TOPIC_0000001233883383__li81881239152312">You can write programs as needed to read streaming data from pipes in real time.</li></ul>
<div class="note" id="EN-US_TOPIC_0000001233883383__note659914514569"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="EN-US_TOPIC_0000001233883383__ul2058214284424"><li id="EN-US_TOPIC_0000001233883383__li65827285425">The current version does not support data export through GDS in SSL mode. Do not use GDS in SSL mode.</li><li id="EN-US_TOPIC_0000001233883383__li1458214286429">All pipe files mentioned in this section refer to named pipes on Linux.</li></ul>
</div></div>
</li><li id="EN-US_TOPIC_0000001233883383__li137434572335">To ensure that data is imported or exported correctly using GDS, import and export the data in the same compatibility mode.<p id="EN-US_TOPIC_0000001233883383__p202801808340"><a name="EN-US_TOPIC_0000001233883383__li137434572335"></a><a name="li137434572335"></a>For example, data exported in MySQL compatibility mode can be imported only in MySQL compatibility mode, and vice versa.</p>
</li></ul>
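<p>The pipe scenarios above can be sketched as a small reader program. This is a minimal illustration, not part of GDS: the <code>cleanse</code> rule, the file names, and the <code>fake_gds_writer</code> thread that stands in for GDS are all hypothetical. The reader consumes rows from a Linux named pipe as they arrive, cleanses each row, and writes the result gzip-compressed, so uncompressed data never lands on the local disk.</p>

```python
import gzip
import os
import tempfile
import threading
from typing import Tuple


def cleanse(line: str) -> str:
    """Hypothetical cleansing rule: trim whitespace around each comma-separated field."""
    fields = line.rstrip("\n").split(",")
    return ",".join(field.strip() for field in fields) + "\n"


def compress_from_pipe(pipe_path: str, out_path: str) -> int:
    """Read rows from a named pipe, cleanse them, and gzip them to out_path.

    Opening the pipe blocks until a writer connects; iteration ends when the
    writer closes its end. Returns the number of rows written.
    """
    rows = 0
    with open(pipe_path, "r", encoding="utf-8") as pipe, \
            gzip.open(out_path, "wt", encoding="utf-8") as out:
        for line in pipe:
            out.write(cleanse(line))
            rows += 1
    return rows


def demo() -> Tuple[int, str]:
    """Run the reader against a stand-in writer thread that plays the role of GDS."""
    workdir = tempfile.mkdtemp()
    pipe_path = os.path.join(workdir, "export.pipe")
    out_path = os.path.join(workdir, "export.csv.gz")
    os.mkfifo(pipe_path)  # create the named pipe (Linux/Unix only)

    def fake_gds_writer() -> None:
        # Stand-in for GDS writing exported rows into the pipe.
        with open(pipe_path, "w", encoding="utf-8") as pipe:
            pipe.write(" 1 , reason one \n")
            pipe.write("2,reason two\n")

    writer = threading.Thread(target=fake_gds_writer)
    writer.start()
    rows = compress_from_pipe(pipe_path, out_path)
    writer.join()

    # Decompress the result to show what was stored.
    with gzip.open(out_path, "rt", encoding="utf-8") as f:
        return rows, f.read()
```

<p>In a real export, only <code>compress_from_pipe</code> would run, pointed at the pipe file that GDS writes to. The same loop could forward rows to another store instead of a local gzip file.</p>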
<div class="section" id="EN-US_TOPIC_0000001233883383__s0eb156e0e50e43469368557fa6237a8d"><h4 class="sectiontitle">Overview</h4><div class="p" id="EN-US_TOPIC_0000001233883383__a9f340e1c6efb40b3bb88d3e368638a52"><strong id="EN-US_TOPIC_0000001233883383__en-us_topic_0125634753_b17281928133110">Using foreign tables</strong>: A GDS foreign table specifies the format of the exported files and the export mode. Multiple DNs export data in parallel from the database to data files, which improves overall export performance. The data files cannot be exported directly to HDFS.<ul id="EN-US_TOPIC_0000001233883383__u545a793e4f744274b0fc30428dbf7617"><li id="EN-US_TOPIC_0000001233883383__lc9270a11f41a4ec285354ac573ac4626">The CN only plans data export tasks and delivers them to the DNs, which frees the CN to process other tasks.</li><li id="EN-US_TOPIC_0000001233883383__l5a62e2f9b7524f10a7fd1e416af4a10a">The computing capabilities and bandwidth of all DNs are fully used to export data.<div class="fignone" id="EN-US_TOPIC_0000001233883383__faa6a429cffc443c8b23b687f44db8243"><span class="figcap"><b>Figure 1 </b>Exporting data using foreign tables</span><br><span><img class="vsd" id="EN-US_TOPIC_0000001233883383__image943924313478" src="figure/en-us_image_0000001188163824.png"></span></div>
</li></ul>
</div>
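<p>The division of work between the CN and the DNs can be pictured with a toy model. This is a conceptual sketch only, not how GaussDB(DWS) is implemented: the function names are hypothetical, threads stand in for DNs, and the round-robin split is a stand-in for the data distribution that already exists across DNs in a real cluster. The coordinator only plans the work, and each worker writes its own data file in parallel.</p>

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def plan_export(rows: Sequence[str], dn_count: int) -> List[List[str]]:
    """Coordinator step: assign one slice of rows to each DN (round-robin here)."""
    return [list(rows[i::dn_count]) for i in range(dn_count)]


def export_slice(dn_id: int, rows: List[str], out_dir: str) -> str:
    """DN step: write this DN's slice to its own data file and return the path."""
    path = os.path.join(out_dir, f"export_dn{dn_id}.csv")
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(row + "\n" for row in rows)
    return path


def parallel_export(rows: Sequence[str], dn_count: int, out_dir: str) -> List[str]:
    """Plan once, then let all workers export their slices concurrently."""
    slices = plan_export(rows, dn_count)
    with ThreadPoolExecutor(max_workers=dn_count) as pool:
        futures = [
            pool.submit(export_slice, dn_id, dn_rows, out_dir)
            for dn_id, dn_rows in enumerate(slices)
        ]
        # Collect results in DN order so the returned paths are deterministic.
        return [future.result() for future in futures]
```

<p>Because the coordinator does no writing itself, adding workers spreads the I/O across them, which is the effect the bullets above describe.</p>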
</div>
<div class="section" id="EN-US_TOPIC_0000001233883383__s0ce9198c80914a4aabb0769fb388569d"><h4 class="sectiontitle">Related Concepts</h4><ul id="EN-US_TOPIC_0000001233883383__u064f796c0c314c749a0ffc88fdd2e834"><li id="EN-US_TOPIC_0000001233883383__laf33991b3eaa44d3a3c64a8d4c23fb35"><strong id="EN-US_TOPIC_0000001233883383__b13185812538">Data file</strong>: A TEXT, CSV, or FIXED file that stores data exported from the <span id="EN-US_TOPIC_0000001233883383__text1297462206">GaussDB(DWS)</span> database.</li><li id="EN-US_TOPIC_0000001233883383__l48d3e6e5af30447184e5a03d60a436bf"><strong id="EN-US_TOPIC_0000001233883383__en-us_topic_0125634753_b3187537613">Foreign table</strong>: A table that stores information about a data file, such as its format, location, and encoding.</li><li id="EN-US_TOPIC_0000001233883383__l32fd16b20b6e4e2787608c38be9e57d5"><strong id="EN-US_TOPIC_0000001233883383__en-us_topic_0125634753_b7676401110">GDS</strong>: A data service tool. To export data, deploy GDS on the server where the data files are to be stored.</li><li id="EN-US_TOPIC_0000001233883383__li13547105312391"><strong id="EN-US_TOPIC_0000001233883383__b6664125515126">Table</strong>: A table in the database, either row-store or column-store. Data in the data files is exported from these tables.</li><li id="EN-US_TOPIC_0000001233883383__lc795a776fe5a464b9a63076e957ce435"><strong id="EN-US_TOPIC_0000001233883383__en-us_topic_0125634753_en-us_topic_0117407694_b34029217143653">Remote mode</strong>: Service data in a cluster is exported to hosts outside the cluster.</li></ul>
</div>
<div class="section" id="EN-US_TOPIC_0000001233883383__s0baa8c8654354e13981fc84cd50fdbae"><h4 class="sectiontitle">Export Mode</h4><p id="EN-US_TOPIC_0000001233883383__a5ba03e9ad43a4023a34a023c4c781000">Data can be exported from <span id="EN-US_TOPIC_0000001233883383__text361069532">GaussDB(DWS)</span> in <strong id="EN-US_TOPIC_0000001233883383__en-us_topic_0125634753_b49572498246">Remote</strong> mode.</p>
<ul id="EN-US_TOPIC_0000001233883383__u0f15c5731a2d4f1eb6fcfbd70160fd4f"><li id="EN-US_TOPIC_0000001233883383__lb52dbcf230d04b17af39beb60d4177ba"><strong id="EN-US_TOPIC_0000001233883383__en-us_topic_0125634753_b42511726151815">Remote mode</strong>: Service data in a cluster is exported to hosts outside the cluster.<ul id="EN-US_TOPIC_0000001233883383__u59b1684f7aad4854a76c98dc6d18f79e"><li id="EN-US_TOPIC_0000001233883383__lcd21a0a73fc64493b813834e639af6ba">In this mode, multiple GDS instances are used to export data concurrently. One GDS instance can export data for only one cluster at a time.</li><li id="EN-US_TOPIC_0000001233883383__l0d4fd777d5fe40a48bda3008e980b523">The export rate of a GDS instance that resides on the same intranet as the cluster nodes is limited by the network bandwidth. A 10GE network is recommended.</li><li id="EN-US_TOPIC_0000001233883383__l20e67cc3bb94498b9921dae3df937a7b">Data files in TEXT, FIXED, or CSV format are supported. A single row of data must be smaller than 1 GB.</li></ul>
</li></ul>
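<p>The format and row-size limits listed above lend themselves to a quick pre-check before an export is planned. The helper below is a hypothetical sketch, not a GDS API. It assumes that "1 GB" means 2<sup>30</sup> bytes, which the text does not state, and that the caller already knows the size of the largest row.</p>

```python
from typing import List

SUPPORTED_FORMATS = {"TEXT", "FIXED", "CSV"}
MAX_ROW_BYTES = 1024 ** 3  # assumption: "1 GB" is read as 2**30 bytes


def check_export_request(file_format: str, largest_row_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the request passes both checks."""
    problems = []
    if file_format.upper() not in SUPPORTED_FORMATS:
        problems.append(
            f"unsupported format {file_format!r}; expected one of "
            f"{sorted(SUPPORTED_FORMATS)}"
        )
    # The limit is strict: a row must be smaller than 1 GB, not equal to it.
    if largest_row_bytes >= MAX_ROW_BYTES:
        problems.append(
            f"largest row is {largest_row_bytes} bytes; must be less than "
            f"{MAX_ROW_BYTES}"
        )
    return problems
```

<p>For example, <code>check_export_request("csv", 1024)</code> returns an empty list, while an unsupported format or an oversized row each add one message.</p>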
</div>
<div class="section" id="EN-US_TOPIC_0000001233883383__s76e0a90aeda246f9a76b5f905e43d6be"><h4 class="sectiontitle">Data Export Process</h4><div class="fignone" id="EN-US_TOPIC_0000001233883383__f8b56f4cb984c4943a73d1b64d3d9a7ea"><span class="figcap"><b>Figure 2 </b>Concurrent data export</span><br><span><img class="vsd" id="EN-US_TOPIC_0000001233883383__image9230152019563" src="figure/en-us_image_0000001233563377.png"></span></div>
<p id="EN-US_TOPIC_0000001233883383__ae6eb468ed28f4ae5bd2e705b96787b5c"></p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="EN-US_TOPIC_0000001233883383__t189e9578f6c04f028da5c323a57ba01a" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Process description</caption><thead align="left"><tr id="EN-US_TOPIC_0000001233883383__r2bebfd27c7dd46938f2fd4b17f26ce26"><th align="left" class="cellrowborder" valign="top" width="13%" id="mcps1.3.7.4.2.4.1.1"><p id="EN-US_TOPIC_0000001233883383__a666e69c75bf54bf2bf6bdf67c6ad963a">Process</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="48%" id="mcps1.3.7.4.2.4.1.2"><p id="EN-US_TOPIC_0000001233883383__a08a306c56eb5418697ab5e1e04b3ae0e">Description</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="39%" id="mcps1.3.7.4.2.4.1.3"><p id="EN-US_TOPIC_0000001233883383__ab95a0c717b764d2eb7ec234cf299f3cd">Subtask</p>
</th>
</tr>
</thead>
<tbody><tr id="EN-US_TOPIC_0000001233883383__r6598937e9c71421ea6343885ff04c182"><td class="cellrowborder" valign="top" width="13%" headers="mcps1.3.7.4.2.4.1.1 "><p id="EN-US_TOPIC_0000001233883383__a22dc68538b3645e5b72833720c0e8eca">Plan data export.</p>
</td>
<td class="cellrowborder" valign="top" width="48%" headers="mcps1.3.7.4.2.4.1.2 "><p id="EN-US_TOPIC_0000001233883383__en-us_topic_0117407695_p18314141324">Prepare data to be exported and plan the export path for the mode to be selected.</p>
<p id="EN-US_TOPIC_0000001233883383__a9f6c2add692a4683b3eb4f01768d79a4">For details, see <a href="dws_04_0263.html">Planning Data Export</a>.</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.7.4.2.4.1.3 "><p id="EN-US_TOPIC_0000001233883383__en-us_topic_0117407695_p316511473812">-</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000001233883383__r6b3296e569e74c168bbe614d82b23453"><td class="cellrowborder" valign="top" width="13%" headers="mcps1.3.7.4.2.4.1.1 "><p id="EN-US_TOPIC_0000001233883383__ab3d67011fc904f4d98d43225d3fdb9bf">Start GDS.</p>
</td>
<td class="cellrowborder" valign="top" width="48%" headers="mcps1.3.7.4.2.4.1.2 "><p id="EN-US_TOPIC_0000001233883383__ab8330df3e0034200935f9291e573ab4f">If the <strong id="EN-US_TOPIC_0000001233883383__b0908023432">Remote</strong> mode is selected, install, configure, and start GDS on data servers.</p>
<p id="EN-US_TOPIC_0000001233883383__a97e0396bb07d422a8c801b199b5fb1ae">For details, see <a href="dws_04_0264.html">Installing, Configuring, and Starting GDS</a>.</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.7.4.2.4.1.3 "><p id="EN-US_TOPIC_0000001233883383__a2aec3613796f42ada43568f518e1d13c">-</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000001233883383__r9624381d6a834d62a1975f6e6b27eccf"><td class="cellrowborder" valign="top" width="13%" headers="mcps1.3.7.4.2.4.1.1 "><p id="EN-US_TOPIC_0000001233883383__a0bc18807b6694efebeae7623ed79c7f1">Create a foreign table.</p>
</td>
<td class="cellrowborder" valign="top" width="48%" headers="mcps1.3.7.4.2.4.1.2 "><p id="EN-US_TOPIC_0000001233883383__a8482b971ac6f4664ad366ccfe4d0d779">Create a foreign table that describes the data file to GDS. The foreign table stores information such as the location, format, encoding, and field delimiter of the data file.</p>
<p id="EN-US_TOPIC_0000001233883383__en-us_topic_0117407695_p895252637">For details, see <a href="dws_04_0265.html">Creating a GDS Foreign Table</a>.</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.7.4.2.4.1.3 "><p id="EN-US_TOPIC_0000001233883383__a2678e2fbe68a4c90a3fbec3fd1958a1e">-</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000001233883383__r7060b34296ee4b35bb5d092b7bcd3c6f"><td class="cellrowborder" valign="top" width="13%" headers="mcps1.3.7.4.2.4.1.1 "><p id="EN-US_TOPIC_0000001233883383__aa564dea4040746669f6dffff788d29b6">Export data.</p>
</td>
<td class="cellrowborder" valign="top" width="48%" headers="mcps1.3.7.4.2.4.1.2 "><p id="EN-US_TOPIC_0000001233883383__ab67af7c12e824212888ceca15c5ba61a">After the foreign table is created, run the <strong id="EN-US_TOPIC_0000001233883383__b540410418435">INSERT</strong> statement to efficiently export data to data files.</p>
<p id="EN-US_TOPIC_0000001233883383__en-us_topic_0117407695_p137029378231">For details, see <a href="dws_04_0266.html">Exporting Data</a>.</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.7.4.2.4.1.3 "><p id="EN-US_TOPIC_0000001233883383__ab498ebec522842b8afb5d9f54de0694f">-</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000001233883383__r232184fde1844a00ae91b43fcc81c7c5"><td class="cellrowborder" valign="top" width="13%" headers="mcps1.3.7.4.2.4.1.1 "><p id="EN-US_TOPIC_0000001233883383__aae89b4b37d2042ac869419f746c79fa3">Stop GDS.</p>
</td>
<td class="cellrowborder" valign="top" width="48%" headers="mcps1.3.7.4.2.4.1.2 "><p id="EN-US_TOPIC_0000001233883383__a6e3268fc1a0944dabbc233e6dc6216e3">Stop GDS after data is exported.</p>
<p id="EN-US_TOPIC_0000001233883383__a289642c8ae4b419fba48a19383fb7a40">For details, see <a href="dws_04_0267.html">Stopping GDS</a>.</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.7.4.2.4.1.3 "><p id="EN-US_TOPIC_0000001233883383__a82adccb5d3ec431c8b53b7872cbd12e1">-</p>
</td>
</tr>
</tbody>
</table>
</div>
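<p>The "Create a foreign table" and "Export data" steps in the table can be sketched as two SQL statements built by small helper functions. This is a hedged illustration: the helper names, table names, and address are hypothetical, and the statement shape (<code>SERVER gsmpp_server</code>, a <code>gsfs://</code> location, <code>WRITE ONLY</code>) follows the pattern used for GDS export foreign tables. Check the exact option names against <a href="dws_04_0265.html">Creating a GDS Foreign Table</a> before using it.</p>

```python
from typing import Sequence, Tuple


def create_foreign_table_sql(
    name: str,
    columns: Sequence[Tuple[str, str]],
    gds_location: str,
    file_format: str = "CSV",
    encoding: str = "utf8",
    delimiter: str = ",",
) -> str:
    """Build a CREATE FOREIGN TABLE statement for a write-only GDS export table.

    Assumption: inputs are trusted identifiers and literals. No SQL escaping
    is done here, so do not pass user-supplied strings.
    """
    column_lines = ",\n".join(f"    {col} {col_type}" for col, col_type in columns)
    return (
        f"CREATE FOREIGN TABLE {name}\n(\n{column_lines}\n)\n"
        "SERVER gsmpp_server\n"
        f"OPTIONS (LOCATION '{gds_location}', FORMAT '{file_format}', "
        f"ENCODING '{encoding}', DELIMITER '{delimiter}')\n"
        "WRITE ONLY;"
    )


def export_sql(foreign_table: str, source_table: str) -> str:
    """Build the INSERT statement that exports source_table through the foreign table."""
    return f"INSERT INTO {foreign_table} SELECT * FROM {source_table};"


if __name__ == "__main__":
    # Hypothetical table names and GDS address, for illustration only.
    print(create_foreign_table_sql(
        "foreign_reasons",
        [("r_reason_sk", "integer"), ("r_reason_desc", "char(100)")],
        "gsfs://192.168.0.90:5000/",
    ))
    print(export_sql("foreign_reasons", "reasons"))
```

<p>Keeping the two statements separate mirrors the process table: the foreign table is created once, and the <strong>INSERT</strong> statement can then be run whenever data needs to be exported.</p>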
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dws_04_0261.html">Using GDS to Export Data to a Remote Server</a></div>
</div>
</div>