forked from docs/doc-exports
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Lu, Huayi <luhuayi@huawei.com> Co-committed-by: Lu, Huayi <luhuayi@huawei.com>
168 lines
39 KiB
HTML
<a name="EN-US_TOPIC_0000001098974516"></a><a name="EN-US_TOPIC_0000001098974516"></a>
|
|
|
|
<h1 class="topictitle1">Examples of Exporting Data Using GDS</h1>
|
|
<div id="body8662426"><div class="section" id="EN-US_TOPIC_0000001098974516__sdcad1f9356214a9f8912535e46f27537"><h4 class="sectiontitle">Exporting Data in Remote Mode</h4><p id="EN-US_TOPIC_0000001098974516__a4bf3b2440fbf4f52b6c6639e14224786">The data server and the cluster reside on the same intranet, the IP address of the data server is <strong id="EN-US_TOPIC_0000001098974516__b103731546123919">192.168.0.90</strong>, and data source files are in CSV format. In this scenario, data is exported in parallel in <strong id="EN-US_TOPIC_0000001098974516__b9116185712418">Remote</strong> mode.</p>
|
|
<p id="EN-US_TOPIC_0000001098974516__a46c3af2df2be45d4ba98deb083e90924">To export data in parallel in <strong id="EN-US_TOPIC_0000001098974516__b19241185122518">Remote</strong> mode, perform the following operations:</p>
|
|
<ol id="EN-US_TOPIC_0000001098974516__o9a0388cdf1e542d287a5540ae7baf502"><li id="EN-US_TOPIC_0000001098974516__ld4537905065b4df3aa5e83bf33ac481f">Log in to the GDS data server as user <strong id="EN-US_TOPIC_0000001098974516__b9305631141213">root</strong> and create the <strong id="EN-US_TOPIC_0000001098974516__b830517319124">/output_data</strong> directory for storing data files.<pre class="screen" id="EN-US_TOPIC_0000001098974516__s78555e7342fb451fb59b0bd36209795b"><strong id="EN-US_TOPIC_0000001098974516__a23be3c8547284743a1b143ca7efd378a">mkdir -p</strong> <em id="EN-US_TOPIC_0000001098974516__aa153073350c74a2f9c5bd6230a271581">/output_data</em></pre>
|
|
</li><li id="EN-US_TOPIC_0000001098974516__l1b57a31aefde416980683b316b21816d">(Optional) Create a user and the user group it belongs to. The user is used to start GDS. If the user and user group exist, skip this step.<pre class="screen" id="EN-US_TOPIC_0000001098974516__sbe89ef3a703a45a1a69f1dad29006ed6"><strong id="EN-US_TOPIC_0000001098974516__en-us_topic_0117407660_b2851994648">groupadd</strong> gdsgrp
|
|
<strong id="EN-US_TOPIC_0000001098974516__a854bc4c25d4b4ce6800680e1c0f2d72f">useradd -g</strong> gdsgrp gds_user</pre>
|
|
</li><li id="EN-US_TOPIC_0000001098974516__l24130c54425d46a0a6649eb9330ac0e8">Change the owner of the <strong id="EN-US_TOPIC_0000001098974516__en-us_topic_0085032744_b372029376172051">/output_data</strong> directory on the data server to <strong id="EN-US_TOPIC_0000001098974516__en-us_topic_0085032744_b84235270610312">gds_user</strong>.<pre class="screen" id="EN-US_TOPIC_0000001098974516__s84dc6d0d43b647278c078f1f8af92efa"><strong id="EN-US_TOPIC_0000001098974516__a1e14a011dd4849ab95a70db024cacc9c">chown -R</strong> <em id="EN-US_TOPIC_0000001098974516__a02c0385d6ba74e98b54b80e6b4db3d25">gds_user:gdsgrp /output_data </em></pre>
|
|
</li><li id="EN-US_TOPIC_0000001098974516__lcb5f581a8d1c49859759f52e647f1e68">Log in to the data server as user <strong id="EN-US_TOPIC_0000001098974516__en-us_topic_0085032744_b372029376172120">gds_user</strong> and start GDS.<div class="p" id="EN-US_TOPIC_0000001098974516__a25f16b50be834206b5d988f0bf25b4c8">The GDS installation path is <strong id="EN-US_TOPIC_0000001098974516__b4642519162812">/opt/bin/dws/gds</strong>. Exported data files are stored in <strong id="EN-US_TOPIC_0000001098974516__b76429192285">/output_data/</strong>. The IP address of the data server is <strong id="EN-US_TOPIC_0000001098974516__b1627094544215">192.168.0.90</strong>. The GDS listening port is <strong id="EN-US_TOPIC_0000001098974516__b83568484425">5000</strong>. GDS runs in daemon mode.<pre class="screen" id="EN-US_TOPIC_0000001098974516__sda5fd2f566f840d4887090dcc815b2bd"><strong id="EN-US_TOPIC_0000001098974516__a7cffb6b04721468aa19322ab967c573f">/opt/bin/dws/gds/bin/gds -d</strong> <em id="EN-US_TOPIC_0000001098974516__ab7dacd8be40f4c3f8906ad0e8f085032">/output_data</em> <strong id="EN-US_TOPIC_0000001098974516__aaf55e94951a2456d9f50bc1342822508">-p </strong><em id="EN-US_TOPIC_0000001098974516__a5baed96fa11b4c9cad6c9f577e46bcc6">192.168.0.90:5000 </em><strong id="EN-US_TOPIC_0000001098974516__ad6a1247535d7453a97290fb3afac2c14">-H</strong><em id="EN-US_TOPIC_0000001098974516__a1312cd8328c04165beefbd2330908f09"> </em>10.10.0.1/24<em id="EN-US_TOPIC_0000001098974516__a3219e766c45b4bd79ae9062c594f8a80"> </em><strong id="EN-US_TOPIC_0000001098974516__a604c4118ef934467b1aa0a17bc61c477">-D</strong></pre>
|
|
</div>
|
|
</li><li id="EN-US_TOPIC_0000001098974516__l955109db195e4df0b8cb0c8302b0a4db">In the database, create the write-only foreign table <strong id="EN-US_TOPIC_0000001098974516__b10909105514010">foreign_tpcds_reasons</strong> for sending data to the data server.<p id="EN-US_TOPIC_0000001098974516__a74b966859e4d466392e1a4c012105b9f">Data export mode settings are as follows:</p>
|
|
<ul id="EN-US_TOPIC_0000001098974516__u6fbf09b30fa24a02aff41c2e4e3421e0"><li id="EN-US_TOPIC_0000001098974516__l2ab20bef959246f099a6f1f79d706d81">GDS is started with <strong id="EN-US_TOPIC_0000001098974516__b372029376172233">/output_data/</strong> as the directory for storing exported files and <strong id="EN-US_TOPIC_0000001098974516__b189631964441">5000</strong> as the listening port. Therefore, the <strong id="EN-US_TOPIC_0000001098974516__en-us_topic_0085032744_b37202937617234">location</strong> parameter is set to <strong id="EN-US_TOPIC_0000001098974516__en-us_topic_0085032744_b37202937617239">gsfs://192.168.0.90:5000/</strong>.</li></ul>
|
|
<p id="EN-US_TOPIC_0000001098974516__a532202bd8e50499fb6c472047f1d7266">Data format parameter settings are as follows:</p>
|
|
<ul id="EN-US_TOPIC_0000001098974516__uc2215176ef004d86a5851eb452470f5b"><li id="EN-US_TOPIC_0000001098974516__l7ca5051d0da343948517138d7b9818d6"><strong id="EN-US_TOPIC_0000001098974516__b842352706165627">format</strong> is set to <strong id="EN-US_TOPIC_0000001098974516__b842352706165631">CSV</strong>.</li><li id="EN-US_TOPIC_0000001098974516__lbf7f52ea70814df0bc8ba44c84711fce"><strong id="EN-US_TOPIC_0000001098974516__b842352706143434">encoding</strong> is set to <strong id="EN-US_TOPIC_0000001098974516__b842352706161148">UTF-8</strong>.</li><li id="EN-US_TOPIC_0000001098974516__ld341b52fd0584aa494c6961abc82c5ca"><strong id="EN-US_TOPIC_0000001098974516__b147317101574">delimiter</strong> is set to <strong id="EN-US_TOPIC_0000001098974516__b14739104719">E'\x08'</strong>.</li><li id="EN-US_TOPIC_0000001098974516__l329b644c674644e88bd3a522add2450d"><strong id="EN-US_TOPIC_0000001098974516__b1330801017354">quote</strong> is set to <strong id="EN-US_TOPIC_0000001098974516__b173081610163512">E'\x1b'</strong>.</li><li id="EN-US_TOPIC_0000001098974516__l3baf5e5eedd14a2cabeb02bf5ba34eed"><strong id="EN-US_TOPIC_0000001098974516__b554252501012">null</strong> is set to an empty string without quotation marks.</li><li id="EN-US_TOPIC_0000001098974516__li79983214235"><strong id="EN-US_TOPIC_0000001098974516__b3394931124512">escape</strong> defaults to the value of <strong id="EN-US_TOPIC_0000001098974516__b1639443117453">quote</strong>.</li><li id="EN-US_TOPIC_0000001098974516__l6e7cbd6e683c45cfaa47999c9a149d05"><strong id="EN-US_TOPIC_0000001098974516__b151131842133616">header</strong> is set to <strong id="EN-US_TOPIC_0000001098974516__b16113104213619">false</strong>, indicating that the first row is identified as a data row in an exported file.</li></ul>
|
|
<p id="EN-US_TOPIC_0000001098974516__a72f693e13d40407dba47e20ce9543f98">Based on the above settings, the foreign table is created using the following statement:</p>
|
|
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001098974516__sf4b7b16acd6345a8a9bfbb644aed3130"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
|
|
<span class="normal">2</span>
|
|
<span class="normal">3</span>
|
|
<span class="normal">4</span>
|
|
<span class="normal">5</span>
|
|
<span class="normal">6</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">CREATE</span><span class="w"> </span><span class="k">FOREIGN</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">foreign_tpcds_reasons</span><span class="w"></span>
|
|
<span class="p">(</span><span class="w"></span>
|
|
<span class="w"> </span><span class="n">r_reason_sk</span><span class="w"> </span><span class="nb">integer</span><span class="w"> </span><span class="k">not</span><span class="w"> </span><span class="k">null</span><span class="p">,</span><span class="w"></span>
|
|
<span class="w"> </span><span class="n">r_reason_id</span><span class="w"> </span><span class="nb">char</span><span class="p">(</span><span class="mi">16</span><span class="p">)</span><span class="w"> </span><span class="k">not</span><span class="w"> </span><span class="k">null</span><span class="p">,</span><span class="w"></span>
|
|
<span class="w"> </span><span class="n">r_reason_desc</span><span class="w"> </span><span class="nb">char</span><span class="p">(</span><span class="mi">100</span><span class="p">)</span><span class="w"></span>
|
|
<span class="p">)</span><span class="w"> </span><span class="n">SERVER</span><span class="w"> </span><span class="n">gsmpp_server</span><span class="w"> </span><span class="k">OPTIONS</span><span class="w"> </span><span class="p">(</span><span class="k">LOCATION</span><span class="w"> </span><span class="s1">'gsfs://192.168.0.90:5000/'</span><span class="p">,</span><span class="w"> </span><span class="n">FORMAT</span><span class="w"> </span><span class="s1">'CSV'</span><span class="p">,</span><span class="k">ENCODING</span><span class="w"> </span><span class="s1">'utf8'</span><span class="p">,</span><span class="k">DELIMITER</span><span class="w"> </span><span class="n">E</span><span class="s1">'\x08'</span><span class="p">,</span><span class="w"> </span><span class="n">QUOTE</span><span class="w"> </span><span class="n">E</span><span class="s1">'\x1b'</span><span class="p">,</span><span class="w"> </span><span class="k">NULL</span><span class="w"> </span><span class="s1">''</span><span class="p">)</span><span class="w"> </span><span class="k">WRITE</span><span class="w"> </span><span class="k">ONLY</span><span class="p">;</span><span class="w"></span>
|
|
</pre></div></td></tr></table></div>
|
|
|
|
</div>
|
|
</li><li id="EN-US_TOPIC_0000001098974516__l9dbbfaf5608c424f91983b2da930e214">In the database, export data to data files through the foreign table <strong id="EN-US_TOPIC_0000001098974516__b1439165213111">foreign_tpcds_reasons</strong>.<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001098974516__s923bd88f69064af2bb3b61dd87416741"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">INSERT</span><span class="w"> </span><span class="k">INTO</span><span class="w"> </span><span class="n">foreign_tpcds_reasons</span><span class="w"> </span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">tpcds</span><span class="p">.</span><span class="n">reason</span><span class="p">;</span><span class="w"></span>
|
|
</pre></div></td></tr></table></div>
|
|
|
|
</div>
|
|
</li><li id="EN-US_TOPIC_0000001098974516__l83cce3771fcd4e1e835f2f47c6e25001">After data export is complete, log in to the data server as user <strong id="EN-US_TOPIC_0000001098974516__en-us_topic_0085032744_b372029376175142">gds_user</strong> and stop GDS.<div class="p" id="EN-US_TOPIC_0000001098974516__a4d1bf02bbbca4499aded2e9fb0d172ba">Query the GDS process ID and then stop that process. In this example, the GDS process ID is <strong id="EN-US_TOPIC_0000001098974516__b11482734465">128954</strong>.<pre class="screen" id="EN-US_TOPIC_0000001098974516__s4fe052d30dba4624b044380a78d960bf"><strong id="EN-US_TOPIC_0000001098974516__a80e8531fc10d4d5b9fb792461adccd71">ps -ef|grep gds</strong>
|
|
gds_user <strong id="EN-US_TOPIC_0000001098974516__a82e254a5f08649ed8a10feff5c3024dd">128954</strong> 1 0 15:03 ? 00:00:00 gds -d /output_data -p 192.168.0.90:5000 -D
|
|
gds_user 129003 118723 0 15:04 pts/0 00:00:00 grep gds
|
|
<strong id="EN-US_TOPIC_0000001098974516__a2894f2c108a74833b2faf77d3607a60f">kill -9</strong> 128954</pre>
|
|
</div>
|
|
</li></ol>
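<p>The stop step above can also be scripted. The following is a minimal sketch, not part of the GDS tooling: the helper name <strong>gds_pids</strong> is hypothetical, and it assumes <strong>ps -ef</strong> output in the column layout shown above, where the second field is the process ID and the eighth field is the command name.</p>

```shell
# Hypothetical helper (not shipped with GDS): read `ps -ef` style output on
# stdin and print the PID (field 2) of every process whose command (field 8)
# is the gds binary, given either as a bare name or as a full path.
# The `grep gds` line is skipped because its command field is "grep".
gds_pids() {
    awk '$8 == "gds" || $8 ~ /\/gds$/ { print $2 }'
}

# Sample lines copied from the output shown in the step above.
sample='gds_user 128954 1 0 15:03 ? 00:00:00 gds -d /output_data -p 192.168.0.90:5000 -D
gds_user 129003 118723 0 15:04 pts/0 00:00:00 grep gds'

printf '%s\n' "$sample" | gds_pids
```

<p>On the data server, the same helper could be fed live output, for example <strong>ps -ef | gds_pids</strong>, and the resulting process ID passed to <strong>kill</strong>.</p>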
|
|
</div>
|
|
<div class="section" id="EN-US_TOPIC_0000001098974516__s855daf73006d4e05ba6d04f8db74e7f6"><a name="EN-US_TOPIC_0000001098974516__s855daf73006d4e05ba6d04f8db74e7f6"></a><a name="s855daf73006d4e05ba6d04f8db74e7f6"></a><h4 class="sectiontitle">Exporting Data Using Multiple Threads</h4><p id="EN-US_TOPIC_0000001098974516__a6ea339cc06174f5783ef17efa4e0c072">The data server and the cluster reside on the same intranet, the IP address of the data server is <strong id="EN-US_TOPIC_0000001098974516__b6259329194715">192.168.0.90</strong>, and data source files are in CSV format. In this scenario, data is concurrently exported to two target tables using multiple threads in <strong id="EN-US_TOPIC_0000001098974516__b186946719490">Remote</strong> mode.</p>
|
|
<p id="EN-US_TOPIC_0000001098974516__afc0d8043841e415980397eb4daeb63f2">To concurrently export data using multiple threads in <strong id="EN-US_TOPIC_0000001098974516__b2092018175413">Remote</strong> mode, perform the following operations:</p>
|
|
<ol id="EN-US_TOPIC_0000001098974516__o2b7bb4d1b885432cb7dc251831f558da"><li id="EN-US_TOPIC_0000001098974516__ldb4da38e458c47d88063fa270e2c566b">Log in to the GDS data server as user <strong id="EN-US_TOPIC_0000001098974516__b57682817226">root</strong>, create the <strong id="EN-US_TOPIC_0000001098974516__b2076910817228">/output_data</strong> directory for storing data files, and create user <strong>gds_user</strong> and its user group.<pre class="screen" id="EN-US_TOPIC_0000001098974516__s2b312a2cdd5c45f4ba1a775ea26de6a6"><strong id="EN-US_TOPIC_0000001098974516__a18cbaf9935d44edfb195ef46d8e4ea8d">mkdir -p</strong> <em id="EN-US_TOPIC_0000001098974516__a0f7e73d88ea64b4e81d4de0dd087e956">/output_data</em>
|
|
<strong id="EN-US_TOPIC_0000001098974516__en-us_topic_0117407669_b20932951624">groupadd</strong> gdsgrp
|
|
<strong id="EN-US_TOPIC_0000001098974516__en-us_topic_0117407669_b439058901624">useradd</strong> -g <em id="EN-US_TOPIC_0000001098974516__en-us_topic_0117407669_i667161761624">gdsgrp gds_user</em></pre>
|
|
</li><li id="EN-US_TOPIC_0000001098974516__l5db37895dbaf4e58b4321d494e5d1342">Change the owner of the <strong id="EN-US_TOPIC_0000001098974516__b1355114714">/output_data</strong> directory on the data server to <strong id="EN-US_TOPIC_0000001098974516__b929165205">gds_user</strong>.<pre class="screen" id="EN-US_TOPIC_0000001098974516__sa2dd8dda39884fdcbcf342448633ba5c"><strong id="EN-US_TOPIC_0000001098974516__a7105bd19859448aebe664f4f4d384e17">chown -R</strong> gds_user:gdsgrp<em id="EN-US_TOPIC_0000001098974516__i7236112210411"> /output_data </em></pre>
|
|
</li><li id="EN-US_TOPIC_0000001098974516__ld2c41791b1cc4c5285d48ae380ff7f23">Log in to the data server as user <strong id="EN-US_TOPIC_0000001098974516__b174161458102212">gds_user</strong> and start GDS.<div class="p" id="EN-US_TOPIC_0000001098974516__a75a988d7150e436889ed7f764dbf1abd">The GDS installation path is <strong id="EN-US_TOPIC_0000001098974516__en-us_topic_0085032744_b3720293761817">/opt/bin/dws/gds</strong>. Exported data files are stored in <strong id="EN-US_TOPIC_0000001098974516__en-us_topic_0085032744_b37202937618115">/output_data/</strong>. The IP address of the data server is <strong id="EN-US_TOPIC_0000001098974516__b1724421112496">192.168.0.90</strong>. The GDS listening port is <strong id="EN-US_TOPIC_0000001098974516__b544915165491">5000</strong>. GDS runs in daemon mode. The degree of parallelism is 2.<pre class="screen" id="EN-US_TOPIC_0000001098974516__s7791d200070a4add972a2c11db854842"><strong id="EN-US_TOPIC_0000001098974516__a4adf0ee1094847f6973e15b4671d8c38">/opt/bin/dws/gds/bin/gds -d</strong> <em id="EN-US_TOPIC_0000001098974516__a74fe58782ce64de6ba464ba97e89c031">/output_data</em> <strong id="EN-US_TOPIC_0000001098974516__a6d5d114574e8445baf0ec028b309b7a5">-p </strong><em id="EN-US_TOPIC_0000001098974516__a03907ea1abe547edbf73f11519d1c231">192.168.0.90:5000 </em><strong id="EN-US_TOPIC_0000001098974516__a5f3ebbf23e2a48c9806aa0aaa5500b1c">-H</strong><em id="EN-US_TOPIC_0000001098974516__a08b9dd71cece4481a347f3aac82d87d7"> </em>10.10.0.1/24<em id="EN-US_TOPIC_0000001098974516__aefb46bcca7bb4441b0ef08a4dbfb3d15"> </em><strong id="EN-US_TOPIC_0000001098974516__a1dc625a609e44f4d81aceb167afbfaf1">-D -t </strong>2<strong id="EN-US_TOPIC_0000001098974516__a89c3aeeb5a2c41a9a660beb5be09224f"> </strong></pre>
|
|
</div>
|
|
</li><li id="EN-US_TOPIC_0000001098974516__ld521900340f64063a359f8ec145383a1">In <span id="EN-US_TOPIC_0000001098974516__text1893218442">GaussDB(DWS)</span>, create the write-only foreign tables <strong id="EN-US_TOPIC_0000001098974516__en-us_topic_0125634759_b894734124412">foreign_tpcds_reasons1</strong> and <strong id="EN-US_TOPIC_0000001098974516__en-us_topic_0125634759_b842352706155915">foreign_tpcds_reasons2</strong> for sending data to the data server.<ul id="EN-US_TOPIC_0000001098974516__u17b613cb33be435e95ada54073ca2da2"><li id="EN-US_TOPIC_0000001098974516__l52bbad736a0a4deb8c6239773fdedbda">Data export mode settings are as follows:<ul id="EN-US_TOPIC_0000001098974516__u7fae3b75f8114b309ee250164a951cf1"><li id="EN-US_TOPIC_0000001098974516__la07e3bacf69347538b2dd610ff1d8be9">GDS is started with <strong id="EN-US_TOPIC_0000001098974516__b1048213454494">/output_data/</strong> as the directory for storing exported files and <strong id="EN-US_TOPIC_0000001098974516__b114821445164913">5000</strong> as the listening port. Therefore, the <strong id="EN-US_TOPIC_0000001098974516__b143805914453">location</strong> parameter is set to <strong id="EN-US_TOPIC_0000001098974516__b438015919455">gsfs://192.168.0.90:5000/</strong>.</li></ul>
|
|
</li><li id="EN-US_TOPIC_0000001098974516__ldb671c8309fa415b98373b1e59428c52">Data format parameter settings are as follows:<ul id="EN-US_TOPIC_0000001098974516__u2f25d570ce7e448b9b1a85642018db58"><li id="EN-US_TOPIC_0000001098974516__la94b8d957a114f1dbd04480e9d9ff947"><strong id="EN-US_TOPIC_0000001098974516__b1890487381">format</strong> is set to <strong id="EN-US_TOPIC_0000001098974516__b1665660071">CSV</strong>.</li><li id="EN-US_TOPIC_0000001098974516__l8bb0675b3eb04ba6bcfcac6acd7e8f67"><strong id="EN-US_TOPIC_0000001098974516__b1495047069">encoding</strong> is set to <strong id="EN-US_TOPIC_0000001098974516__b369067804">UTF-8</strong>.</li><li id="EN-US_TOPIC_0000001098974516__la29d3cb7e95e4054b88cf91194c2daf2"><strong id="EN-US_TOPIC_0000001098974516__b15485612103514">delimiter</strong> is set to <strong id="EN-US_TOPIC_0000001098974516__b048515126355">E'\x08'</strong>.</li><li id="EN-US_TOPIC_0000001098974516__l97fc4774a9a249819077ede9d22d2d06"><strong id="EN-US_TOPIC_0000001098974516__b68216142357">quote</strong> is set to <strong id="EN-US_TOPIC_0000001098974516__b18821014143519">E'\x1b'</strong>.</li><li id="EN-US_TOPIC_0000001098974516__lf3539c39860645c094bb66d2a33de0ab"><strong id="EN-US_TOPIC_0000001098974516__b669896293">null</strong> is set to an empty string without quotation marks.</li><li id="EN-US_TOPIC_0000001098974516__lcf191e63fc704080a882c8abeaf82f08"><strong id="EN-US_TOPIC_0000001098974516__b17451298503">escape</strong> defaults to the value of <strong id="EN-US_TOPIC_0000001098974516__b11457209115013">quote</strong>.</li><li id="EN-US_TOPIC_0000001098974516__l144b35da7b8d41e4ad28b7746e228054"><strong id="EN-US_TOPIC_0000001098974516__b146811268433">header</strong> is set to <strong id="EN-US_TOPIC_0000001098974516__b94687263435">false</strong>, indicating that the first row is identified as a data row in an exported file.</li></ul>
|
|
</li></ul>
|
|
<p id="EN-US_TOPIC_0000001098974516__a9812744e60b2445283a90189b2ebafce">Based on the preceding settings, the foreign table <strong id="EN-US_TOPIC_0000001098974516__b4338191235">foreign_tpcds_reasons1</strong> is created using the following statement:</p>
|
|
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001098974516__s31fdffdc7ad3444190f415b63e209abc"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
|
|
<span class="normal">2</span>
|
|
<span class="normal">3</span>
|
|
<span class="normal">4</span>
|
|
<span class="normal">5</span>
|
|
<span class="normal">6</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">CREATE</span><span class="w"> </span><span class="k">FOREIGN</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">foreign_tpcds_reasons1</span><span class="w"></span>
|
|
<span class="p">(</span><span class="w"> </span>
|
|
<span class="w"> </span><span class="n">r_reason_sk</span><span class="w"> </span><span class="nb">integer</span><span class="w"> </span><span class="k">not</span><span class="w"> </span><span class="k">null</span><span class="p">,</span><span class="w"></span>
|
|
<span class="w"> </span><span class="n">r_reason_id</span><span class="w"> </span><span class="nb">char</span><span class="p">(</span><span class="mi">16</span><span class="p">)</span><span class="w"> </span><span class="k">not</span><span class="w"> </span><span class="k">null</span><span class="p">,</span><span class="w"></span>
|
|
<span class="w"> </span><span class="n">r_reason_desc</span><span class="w"> </span><span class="nb">char</span><span class="p">(</span><span class="mi">100</span><span class="p">)</span><span class="w"></span>
|
|
<span class="p">)</span><span class="w"> </span><span class="n">SERVER</span><span class="w"> </span><span class="n">gsmpp_server</span><span class="w"> </span><span class="k">OPTIONS</span><span class="w"> </span><span class="p">(</span><span class="k">LOCATION</span><span class="w"> </span><span class="s1">'gsfs://192.168.0.90:5000/'</span><span class="p">,</span><span class="w"> </span><span class="n">FORMAT</span><span class="w"> </span><span class="s1">'CSV'</span><span class="p">,</span><span class="k">ENCODING</span><span class="w"> </span><span class="s1">'utf8'</span><span class="p">,</span><span class="w"> </span><span class="k">DELIMITER</span><span class="w"> </span><span class="n">E</span><span class="s1">'\x08'</span><span class="p">,</span><span class="w"> </span><span class="n">QUOTE</span><span class="w"> </span><span class="n">E</span><span class="s1">'\x1b'</span><span class="p">,</span><span class="w"> </span><span class="k">NULL</span><span class="w"> </span><span class="s1">''</span><span class="p">)</span><span class="w"> </span><span class="k">WRITE</span><span class="w"> </span><span class="k">ONLY</span><span class="p">;</span><span class="w"></span>
|
|
</pre></div></td></tr></table></div>
|
|
|
|
</div>
|
|
<p id="EN-US_TOPIC_0000001098974516__a411aae6fd01b451fbb2564515664f138">Based on the preceding settings, the foreign table <strong id="EN-US_TOPIC_0000001098974516__b151361340164614">foreign_tpcds_reasons2</strong> is created using the following statement:</p>
|
|
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001098974516__sadae6271ec644734a09c29475da670e5"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
|
|
<span class="normal">2</span>
|
|
<span class="normal">3</span>
|
|
<span class="normal">4</span>
|
|
<span class="normal">5</span>
|
|
<span class="normal">6</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">CREATE</span><span class="w"> </span><span class="k">FOREIGN</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">foreign_tpcds_reasons2</span><span class="w"></span>
|
|
<span class="p">(</span><span class="w"> </span>
|
|
<span class="w"> </span><span class="n">r_reason_sk</span><span class="w"> </span><span class="nb">integer</span><span class="w"> </span><span class="k">not</span><span class="w"> </span><span class="k">null</span><span class="p">,</span><span class="w"></span>
|
|
<span class="w"> </span><span class="n">r_reason_id</span><span class="w"> </span><span class="nb">char</span><span class="p">(</span><span class="mi">16</span><span class="p">)</span><span class="w"> </span><span class="k">not</span><span class="w"> </span><span class="k">null</span><span class="p">,</span><span class="w"></span>
|
|
<span class="w"> </span><span class="n">r_reason_desc</span><span class="w"> </span><span class="nb">char</span><span class="p">(</span><span class="mi">100</span><span class="p">)</span><span class="w"></span>
|
|
<span class="p">)</span><span class="w"> </span><span class="n">SERVER</span><span class="w"> </span><span class="n">gsmpp_server</span><span class="w"> </span><span class="k">OPTIONS</span><span class="w"> </span><span class="p">(</span><span class="k">LOCATION</span><span class="w"> </span><span class="s1">'gsfs://192.168.0.90:5000/'</span><span class="p">,</span><span class="w"> </span><span class="n">FORMAT</span><span class="w"> </span><span class="s1">'CSV'</span><span class="p">,</span><span class="w"> </span><span class="k">ENCODING</span><span class="w"> </span><span class="s1">'utf8'</span><span class="p">,</span><span class="w"> </span><span class="k">DELIMITER</span><span class="w"> </span><span class="n">E</span><span class="s1">'\x08'</span><span class="p">,</span><span class="w"> </span><span class="n">QUOTE</span><span class="w"> </span><span class="n">E</span><span class="s1">'\x1b'</span><span class="p">,</span><span class="w"> </span><span class="k">NULL</span><span class="w"> </span><span class="s1">''</span><span class="p">)</span><span class="w"> </span><span class="k">WRITE</span><span class="w"> </span><span class="k">ONLY</span><span class="p">;</span><span class="w"></span>
|
|
</pre></div></td></tr></table></div>
|
|
|
|
</div>
|
|
</li><li id="EN-US_TOPIC_0000001098974516__l060f64b7504940fc99ec63e143c7b546">In the database, export data from table <strong id="EN-US_TOPIC_0000001098974516__b1738319311478">tpcds.reason</strong> through the foreign tables <strong id="EN-US_TOPIC_0000001098974516__b102711951483">foreign_tpcds_reasons1</strong> and <strong id="EN-US_TOPIC_0000001098974516__b1267772454819">foreign_tpcds_reasons2</strong> to <strong id="EN-US_TOPIC_0000001098974516__en-us_topic_0058967650_b37202937618421">/output_data</strong>.<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001098974516__sea0f7f34561a48aca8450e52b26b6b2c"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">INSERT</span><span class="w"> </span><span class="k">INTO</span><span class="w"> </span><span class="n">foreign_tpcds_reasons1</span><span class="w"> </span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">tpcds</span><span class="p">.</span><span class="n">reason</span><span class="p">;</span><span class="w"></span>
|
|
</pre></div></td></tr></table></div>
|
|
|
|
</div>
|
|
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001098974516__s0455fa60cbf14f2ba7fa907c38031be3"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">INSERT</span><span class="w"> </span><span class="k">INTO</span><span class="w"> </span><span class="n">foreign_tpcds_reasons2</span><span class="w"> </span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">tpcds</span><span class="p">.</span><span class="n">reason</span><span class="p">;</span><span class="w"></span>
|
|
</pre></div></td></tr></table></div>
|
|
|
|
</div>
|
|
</li><li id="EN-US_TOPIC_0000001098974516__ldb2968d060f84be094f5c20b549cf5f3">After data export is complete, log in to the data server as user <strong id="EN-US_TOPIC_0000001098974516__b1088293854814">gds_user</strong> and stop GDS.<div class="p" id="EN-US_TOPIC_0000001098974516__a7483e2c69c094c52a3cf824f6ce55b19">Query the GDS process ID and then stop that process. In this example, the GDS process ID is <strong id="EN-US_TOPIC_0000001098974516__b1798518377518">128954</strong>.<pre class="screen" id="EN-US_TOPIC_0000001098974516__sa7aaeb365f45405aba5d06f7d088a9eb"><strong id="EN-US_TOPIC_0000001098974516__a7e9a068d4f4d49bfa88d8e14c8eb48aa">ps -ef|grep gds</strong>
|
|
gds_user <strong id="EN-US_TOPIC_0000001098974516__a6ac03fe3dc72477fbe96c3aaf95268b3">128954</strong> 1 0 15:03 ? 00:00:00 gds -d /output_data -p 192.168.0.90:5000 -D -t 2
|
|
gds_user 129003 118723 0 15:04 pts/0 00:00:00 grep gds
|
|
<strong id="EN-US_TOPIC_0000001098974516__a72a8bbc84511449ca6e8acddacaf4888">kill -9</strong> 128954</pre>
|
|
</div>
|
|
</li></ol>
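<p>The two <strong>INSERT</strong> statements above run concurrently only if they are submitted from separate sessions. The following is a minimal sketch of one way to do that from a shell. The <strong>run_parallel</strong> helper is hypothetical, and the <strong>echo</strong> commands are placeholders for whatever client invocation submits each statement in your environment.</p>

```shell
# Hypothetical helper: start each argument as a separate background command,
# wait for all of them, and return non-zero if any command failed.
run_parallel() {
    pids=""
    for cmd in "$@"; do
        sh -c "$cmd" &
        pids="$pids $!"
    done
    status=0
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return "$status"
}

# Placeholders: replace each echo with the client command that runs the statement.
run_parallel \
    "echo 'INSERT INTO foreign_tpcds_reasons1 SELECT * FROM tpcds.reason;'" \
    "echo 'INSERT INTO foreign_tpcds_reasons2 SELECT * FROM tpcds.reason;'"
```

<p>Because GDS is started with <strong>-t 2</strong> in this example, both exports can be served at the same time. The helper returns a non-zero status if either export command fails, so a calling script can detect a failed export.</p>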
|
|
</div>
|
|
<div class="section" id="EN-US_TOPIC_0000001098974516__section2099813620517"><h4 class="sectiontitle">Exporting Data Through a Pipe</h4><ol id="EN-US_TOPIC_0000001098974516__ol82421990535"><li id="EN-US_TOPIC_0000001098974516__li6244119145316"><span>Start GDS.</span><p><pre class="screen" id="EN-US_TOPIC_0000001098974516__screen135315351579">gds -d /***/gds_data/ -D -p 192.168.0.1:7789 -l /***/gds_log/aa.log -H 0/0 -t 10</pre>
|
|
<p id="EN-US_TOPIC_0000001098974516__p1152131474119">If you need to set the timeout interval of a pipe, use the <strong id="EN-US_TOPIC_0000001098974516__b199035617457">--pipe-timeout</strong> parameter.</p>
|
|
</p></li><li id="EN-US_TOPIC_0000001098974516__li1390185824920"><span>Export data.</span><p><ol type="a" id="EN-US_TOPIC_0000001098974516__ol1817971012508"><li id="EN-US_TOPIC_0000001098974516__li131793107508">Log in to the database, create an internal table, and write data to the table.<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen323918010593"><span id="EN-US_TOPIC_0000001098974516__text96541727153116"></span>CREATE TABLE test_pipe( id integer not null, sex text not null, name text ) ;
|
|
|
|
<span id="EN-US_TOPIC_0000001098974516__text8536123433112"></span>INSERT INTO test_pipe values(1,2,'11111111111111');
|
|
<span id="EN-US_TOPIC_0000001098974516__text10686536153110"></span>INSERT INTO test_pipe values(2,2,'11111111111111');
|
|
<span id="EN-US_TOPIC_0000001098974516__text11625203813318"></span>INSERT INTO test_pipe values(3,2,'11111111111111');
|
|
<span id="EN-US_TOPIC_0000001098974516__text628713409316"></span>INSERT INTO test_pipe values(4,2,'11111111111111');</pre>
|
|
</li><li id="EN-US_TOPIC_0000001098974516__li17478201417506">Create a write-only foreign table.<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen62601859102512"><span id="EN-US_TOPIC_0000001098974516__text1039115588345"></span>CREATE FOREIGN TABLE foreign_test_pipe_tw( id integer not null, age text not null, name text ) SERVER gsmpp_server OPTIONS (LOCATION 'gsfs://192.168.0.1:7789/', FORMAT 'text', DELIMITER ',', NULL '', EOL '0x0a' ,file_type 'pipe', auto_create_pipe 'false') WRITE ONLY;</pre>
</li><li id="EN-US_TOPIC_0000001098974516__li47991418135213">Execute the export statement. The statement blocks until the data is read from the pipe.<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen175603226012"><span id="EN-US_TOPIC_0000001098974516__text171922418353"></span>INSERT INTO foreign_test_pipe_tw select * from test_pipe; </pre>
</li></ol>
</p></li><li id="EN-US_TOPIC_0000001098974516__li149561245205218"><span>Export data through the GDS pipe.</span><p><ol type="a" id="EN-US_TOPIC_0000001098974516__ol15910111110109"><li id="EN-US_TOPIC_0000001098974516__li139111811131016">Log in to GDS and go to the GDS data directory.<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen9788294017">cd /***/gds_data/ </pre>
</li><li id="EN-US_TOPIC_0000001098974516__li51338155105">Create a pipe. If <strong id="EN-US_TOPIC_0000001098974516__b126253216462">auto_create_pipe</strong> is set to <strong id="EN-US_TOPIC_0000001098974516__b22624326464">true</strong>, skip this step.<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen1514862454012">mkfifo postgres_public_foreign_test_pipe_tw.pipe </pre>
<div class="note" id="EN-US_TOPIC_0000001098974516__note1034813717381"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="EN-US_TOPIC_0000001098974516__p11348177103810">A pipe will be automatically cleared after an operation is complete. To perform another operation, create a pipe again.</p>
</div></div>
</li><li id="EN-US_TOPIC_0000001098974516__li97952111103">Read data from the pipe and write it to a new file.<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen1493183634012">cat postgres_public_foreign_test_pipe_tw.pipe > postgres_public_foreign_test_pipe_tw.txt</pre>
</li><li id="EN-US_TOPIC_0000001098974516__li1624962831013">To compress the exported files, run the following command:<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen14320838125815">gzip -9 -c < postgres_public_foreign_test_pipe_tw.pipe > out.gz </pre>
</li><li id="EN-US_TOPIC_0000001098974516__li2679153791019">To export the content from the pipe to the HDFS server, run the following command:<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen137382145512">cat postgres_public_foreign_test_pipe_tw.pipe | hdfs dfs -put - /user/hive/***/test_pipe.txt</pre>
</li></ol>
</p></li><li id="EN-US_TOPIC_0000001098974516__li19597125335218"><span>Verify the exported data.</span><p><ol type="a" id="EN-US_TOPIC_0000001098974516__ol986619499153"><li id="EN-US_TOPIC_0000001098974516__li686654941512">Check whether the exported file is correct.<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen3187123512519">cat postgres_public_foreign_test_pipe_tw.txt
3,2,11111111111111
1,2,11111111111111
2,2,11111111111111
4,2,11111111111111</pre>
</li><li id="EN-US_TOPIC_0000001098974516__li3258165316153">Check the compressed file.<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen113614487516">vim out.gz
3,2,11111111111111
1,2,11111111111111
2,2,11111111111111
4,2,11111111111111</pre>
</li><li id="EN-US_TOPIC_0000001098974516__li155211259181516">Check the data exported to the HDFS server.<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen168718416619">hdfs dfs -cat /user/hive/***/test_pipe.txt
3,2,11111111111111
1,2,11111111111111
2,2,11111111111111
4,2,11111111111111</pre>
</li></ol>
</p></li></ol>
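<p>The blocking behaviour in the steps above can be tried without a cluster. The following is a minimal sketch, not part of the GDS tooling: a <strong>printf</strong> command stands in for the export statement that feeds the pipe, and the names <strong>demo_export.pipe</strong>, <strong>workdir</strong>, and <strong>reader_pid</strong> are hypothetical. It assumes a POSIX shell with <strong>mkfifo</strong> and <strong>gzip</strong> available.</p>

```shell
#!/bin/sh
# Sketch: a named pipe holds the writer until a reader drains it.
# The printf writer is a stand-in for GDS; every name here is hypothetical.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
pipe="$workdir/demo_export.pipe"

# Create the pipe manually, as in the step with auto_create_pipe set to false.
mkfifo "$pipe"

# Reader: compress whatever arrives on the pipe, in the background.
gzip -9 -c < "$pipe" > "$workdir/out.gz" &
reader_pid=$!

# Writer: stand-in for the blocked export statement that feeds the pipe.
printf '%s\n' '1,2,11111111111111' '2,2,11111111111111' \
              '3,2,11111111111111' '4,2,11111111111111' > "$pipe"

# The reader exits once the writer closes the pipe (end-of-file).
wait "$reader_pid"

# Inspect the compressed result without unpacking it to disk.
rows=$(gzip -dc "$workdir/out.gz" | wc -l | tr -d ' ')
echo "exported rows: $rows"
```

<p>Starting the reader before the writer mirrors the order that matters for a pipe: neither side proceeds until the other end is open, which is why the export statement appears to hang until a command such as <strong>cat</strong> or <strong>gzip</strong> is attached to the pipe.</p>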
</div>
<div class="section" id="EN-US_TOPIC_0000001098974516__section162219436522"><h4 class="sectiontitle">Exporting Data Through Multi-Process Pipes</h4><p id="EN-US_TOPIC_0000001098974516__p842218541105">GDS also supports importing and exporting data through multi-process pipes. That is, one foreign table corresponds to multiple GDS processes.</p>
<p id="EN-US_TOPIC_0000001098974516__p379766184215">The following example exports data to local files.</p>
<ol id="EN-US_TOPIC_0000001098974516__ol20408174771311"><li id="EN-US_TOPIC_0000001098974516__li9408104721314"><span>Start multiple GDS processes.</span><p><pre class="screen" id="EN-US_TOPIC_0000001098974516__screen9565938191215">gds -d /***/gds_data/ -D -p 192.168.0.1:7789 -l /***/gds_log/aa.log -H 0/0 -t 10
gds -d /***/gds_data_1/ -D -p 192.168.0.1:7790 -l /***/gds_log/aa.log -H 0/0 -t 10</pre>
<p id="EN-US_TOPIC_0000001098974516__p133521190431">If you need to set the timeout interval of a pipe, use the <strong id="EN-US_TOPIC_0000001098974516__b474918556478">--pipe-timeout</strong> parameter.</p>
</p></li><li id="EN-US_TOPIC_0000001098974516__li1725917883612"><span>Export data.</span><p><ol type="a" id="EN-US_TOPIC_0000001098974516__ol412981103614"><li id="EN-US_TOPIC_0000001098974516__li141296112362">Log in to the database and create an internal table.<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen1332163311134"><span id="EN-US_TOPIC_0000001098974516__text12371194717351"></span>CREATE TABLE test_pipe (id integer not null, sex text not null, name text);</pre>
</li><li id="EN-US_TOPIC_0000001098974516__li1884141515444">Write data.<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen1987415714411"><span id="EN-US_TOPIC_0000001098974516__text884134913357"></span>INSERT INTO test_pipe values(1,2,'11111111111111');
<span id="EN-US_TOPIC_0000001098974516__text6652651153510"></span>INSERT INTO test_pipe values(2,2,'11111111111111');
<span id="EN-US_TOPIC_0000001098974516__text163641253163519"></span>INSERT INTO test_pipe values(3,2,'11111111111111');
<span id="EN-US_TOPIC_0000001098974516__text4339955193510"></span>INSERT INTO test_pipe values(4,2,'11111111111111');</pre>
</li><li id="EN-US_TOPIC_0000001098974516__li13149191411361">Create a write-only foreign table.<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen393885161317"><span id="EN-US_TOPIC_0000001098974516__text299475723517"></span>CREATE FOREIGN TABLE foreign_test_pipe_tw( id integer not null, age text not null, name text ) SERVER gsmpp_server OPTIONS (LOCATION 'gsfs://192.168.0.1:7789/|gsfs://192.168.0.1:7790/', FORMAT 'text', DELIMITER ',', NULL '', EOL '0x0a' ,file_type 'pipe', auto_create_pipe 'false') WRITE ONLY;</pre>
</li><li id="EN-US_TOPIC_0000001098974516__li20505220153614">Execute the export statement. The statement blocks until the data is read from the pipes.<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen123719184149"><span id="EN-US_TOPIC_0000001098974516__text670511023610"></span>INSERT INTO foreign_test_pipe_tw select * from test_pipe; </pre>
</li></ol>
</p></li><li id="EN-US_TOPIC_0000001098974516__li14257155511383"><span>Export data through the GDS pipes.</span><p><ol type="a" id="EN-US_TOPIC_0000001098974516__ol199213263914"><li id="EN-US_TOPIC_0000001098974516__li945575119395">Log in to GDS and go to each GDS data directory.<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen465135973914">cd /***/gds_data/
cd /***/gds_data_1/ </pre>
</li><li id="EN-US_TOPIC_0000001098974516__li12492811203920">Create a pipe in each GDS data directory. If <strong id="EN-US_TOPIC_0000001098974516__b239212105485">auto_create_pipe</strong> is set to <strong id="EN-US_TOPIC_0000001098974516__b17392910164819">true</strong>, skip this step.<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen438619016152">mkfifo postgres_public_foreign_test_pipe_tw.pipe </pre>
</li><li id="EN-US_TOPIC_0000001098974516__li95801819203911">Read data from each pipe and write it to a new file.<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen5662812151">cat postgres_public_foreign_test_pipe_tw.pipe > postgres_public_foreign_test_pipe_tw.txt</pre>
</li></ol>
</p></li><li id="EN-US_TOPIC_0000001098974516__li1724519137161"><span>Verify the exported data.</span><p><pre class="screen" id="EN-US_TOPIC_0000001098974516__screen853063471517">cat /***/gds_data/postgres_public_foreign_test_pipe_tw.txt
3,2,11111111111111</pre>
<pre class="screen" id="EN-US_TOPIC_0000001098974516__screen219884691516">cat /***/gds_data_1/postgres_public_foreign_test_pipe_tw.txt
1,2,11111111111111
2,2,11111111111111
4,2,11111111111111</pre>
</p></li></ol>
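<p>With several pipes, every pipe needs its own reader, and the readers must run at the same time. The following is a minimal sketch under stated assumptions: two temporary directories stand in for the GDS data directories, <strong>printf</strong> commands stand in for the GDS processes, and the names <strong>demo_export.pipe</strong>, <strong>demo_export.txt</strong>, and <strong>base</strong> are hypothetical. How rows are split between the two writers is arbitrary here, so the check is made on the combined row count.</p>

```shell
#!/bin/sh
# Sketch: drain one pipe per data directory concurrently, then check the total.
# The directories and printf writers are stand-ins; all names are hypothetical.
set -eu

base=$(mktemp -d)
trap 'rm -rf "$base"' EXIT

# One pipe and one background reader per stand-in data directory.
for dir in "$base/gds_data" "$base/gds_data_1"; do
    mkdir -p "$dir"
    mkfifo "$dir/demo_export.pipe"
    cat "$dir/demo_export.pipe" > "$dir/demo_export.txt" &
done

# Writers: each stand-in process receives part of the exported rows.
printf '%s\n' '3,2,11111111111111' > "$base/gds_data/demo_export.pipe"
printf '%s\n' '1,2,11111111111111' '2,2,11111111111111' \
              '4,2,11111111111111' > "$base/gds_data_1/demo_export.pipe"

# Wait until every reader has reached end-of-file.
wait

# The rows are spread over several files, so verify the combined count.
total=$(cat "$base"/gds_data*/demo_export.txt | wc -l | tr -d ' ')
echo "total rows: $total"
```

<p>Comparing the combined count with the row count of the source table is a quick way to confirm that no pipe was left unread.</p>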
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dws_04_0261.html">Using GDS to Export Data to a Remote Server</a></div>
</div>
</div>