Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: Su, Xiaomeng <suxiaomeng1@huawei.com> Co-committed-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
<a name="dli_08_0267"></a>
<h1 class="topictitle1">File System Sink Stream (Recommended)</h1>
<div id="body1589783969791"><div class="section" id="dli_08_0267__section204511548185514"><h4 class="sectiontitle">Function</h4><p id="dli_08_0267__p251281115013">You can create a sink stream to export data to a file system such as HDFS or OBS. After the data is written, you can create a non-DLI table directly on the output directory and process the table using DLI SQL. The output directory can be organized by partition, so it can back a partitioned table. This sink stream is suitable for scenarios such as data dumping, big data analysis, data backup, and active, deep, or cold archiving.</p>
<p id="dli_08_0267__p132361548155310">OBS is an object-based storage service. It provides massive, secure, highly reliable, and low-cost data storage capabilities.</p>
</div>
<div class="section" id="dli_08_0267__section17191277562"><h4 class="sectiontitle">Syntax</h4><div class="codecoloring" codetype="Sql" id="dli_08_0267__screen18441237155620"><div class="highlight"><table class="highlighttable"><tr><td class="code"><div><pre><span></span><span class="k">CREATE</span><span class="w"> </span><span class="n">SINK</span><span class="w"> </span><span class="n">STREAM</span><span class="w"> </span><span class="n">stream_id</span><span class="w"> </span><span class="p">(</span><span class="n">attr_name</span><span class="w"> </span><span class="n">attr_type</span><span class="w"> </span><span class="p">(</span><span class="s1">','</span><span class="w"> </span><span class="n">attr_name</span><span class="w"> </span><span class="n">attr_type</span><span class="p">)</span><span class="o">*</span><span class="w"> </span><span class="p">)</span>
<span class="w"> </span><span class="p">[</span><span class="n">PARTITIONED</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="p">(</span><span class="n">attr_name</span><span class="w"> </span><span class="p">(</span><span class="s1">','</span><span class="w"> </span><span class="n">attr_name</span><span class="p">)</span><span class="o">*</span><span class="p">)]</span>
<span class="w"> </span><span class="k">WITH</span><span class="w"> </span><span class="p">(</span>
<span class="w"> </span><span class="k">type</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"filesystem"</span><span class="p">,</span>
<span class="w"> </span><span class="n">file</span><span class="p">.</span><span class="n">path</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"obs://bucket/xx"</span><span class="p">,</span>
<span class="w"> </span><span class="n">encode</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"parquet"</span><span class="p">,</span>
<span class="w"> </span><span class="n">ak</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">""</span><span class="p">,</span>
<span class="w"> </span><span class="n">sk</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">""</span>
<span class="w"> </span><span class="p">);</span>
</pre></div></td></tr></table></div>
</div>
</div>
<div class="section" id="dli_08_0267__section4299113491"><h4 class="sectiontitle">Keywords</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="dli_08_0267__en-us_topic_0132788972_table6208194110214" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Keywords</caption><thead align="left"><tr id="dli_08_0267__en-us_topic_0132788972_row32081441727"><th align="left" class="cellrowborder" valign="top" width="11.717171717171718%" id="mcps1.3.3.2.2.4.1.1"><p id="dli_08_0267__en-us_topic_0132788972_p192085418212">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="8.191919191919192%" id="mcps1.3.3.2.2.4.1.2"><p id="dli_08_0267__en-us_topic_0132788972_p108302338272">Mandatory</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="80.09090909090911%" id="mcps1.3.3.2.2.4.1.3"><p id="dli_08_0267__en-us_topic_0132788972_p16208841429">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="dli_08_0267__en-us_topic_0132788972_row520834110214"><td class="cellrowborder" valign="top" width="11.717171717171718%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0267__en-us_topic_0132788972_p1220834118219">type</p>
</td>
<td class="cellrowborder" valign="top" width="8.191919191919192%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0267__en-us_topic_0132788972_p1983083310271">Yes</p>
</td>
<td class="cellrowborder" valign="top" width="80.09090909090911%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0267__en-us_topic_0132788972_p1820811418218">Output stream type. If <span class="parmname" id="dli_08_0267__parmname17214438171913"><b>type</b></span> is set to <span class="parmvalue" id="dli_08_0267__parmvalue7670451198"><b>filesystem</b></span>, data is exported to the file system.</p>
</td>
</tr>
<tr id="dli_08_0267__en-us_topic_0132788972_row202081411215"><td class="cellrowborder" valign="top" width="11.717171717171718%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0267__en-us_topic_0132788972_p52089411223">file.path</p>
</td>
<td class="cellrowborder" valign="top" width="8.191919191919192%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0267__en-us_topic_0132788972_p18830133362717">Yes</p>
</td>
<td class="cellrowborder" valign="top" width="80.09090909090911%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0267__p181101557644">Output directory in the form: <strong id="dli_08_0267__b12723225455">schema://file.path</strong>.</p>
<p id="dli_08_0267__en-us_topic_0132788972_p12081241524">Currently, <strong>schema</strong> can only be <strong>obs</strong> or <strong>hdfs</strong>.</p>
<ul id="dli_08_0267__dli_08_0241_en-us_topic_0111501792_ul33631284367"><li id="dli_08_0267__dli_08_0241_en-us_topic_0111501792_li6364162893612">If <strong id="dli_08_0267__b147956715204">schema</strong> is set to <strong id="dli_08_0267__b8162312207">obs</strong>, data is stored in OBS.</li><li id="dli_08_0267__li059214393457">If <strong id="dli_08_0267__b1350416227208">schema</strong> is set to <strong id="dli_08_0267__b203503279207">hdfs</strong>, data is exported to HDFS. A proxy user needs to be configured for HDFS. For details, see <a href="#dli_08_0267__section11762174112291">HDFS Proxy User Configuration</a>.<p id="dli_08_0267__p1643564612456">Example: <strong id="dli_08_0267__b1372915302011">hdfs://node-master1sYAx:9820/user/car_infos</strong>, where <strong id="dli_08_0267__b1390061162116">node-master1sYAx:9820</strong> is the host name and port of the node where the NameNode is located.</p>
</li></ul>
</td>
</tr>
<tr id="dli_08_0267__en-us_topic_0132788972_row62081841522"><td class="cellrowborder" valign="top" width="11.717171717171718%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0267__en-us_topic_0132788972_p620816418220">encode</p>
</td>
<td class="cellrowborder" valign="top" width="8.191919191919192%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0267__en-us_topic_0132788972_p1830103315276">Yes</p>
</td>
<td class="cellrowborder" valign="top" width="80.09090909090911%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0267__en-us_topic_0132788972_p7209441226">Output data encoding format. Currently, only the <span class="parmvalue" id="dli_08_0267__parmvalue1197615478489"><b>parquet</b></span> and <span class="parmvalue" id="dli_08_0267__parmvalue1037842102118"><b>csv</b></span> formats are supported.</p>
<ul id="dli_08_0267__ul1569219263346"><li id="dli_08_0267__li5722104593410">When <strong id="dli_08_0267__b364631642215">schema</strong> is set to <strong id="dli_08_0267__b284761062210">obs</strong>, the encoding format of the output data can only be <span class="parmvalue" id="dli_08_0267__parmvalue122614642216"><b>parquet</b></span>.</li><li id="dli_08_0267__li41341118351">When <strong id="dli_08_0267__b4543633192210">schema</strong> is set to <strong id="dli_08_0267__b1889083502215">hdfs</strong>, the encoding format of the output data can be <span class="parmvalue" id="dli_08_0267__parmvalue14944453162217"><b>parquet</b></span> or <span class="parmvalue" id="dli_08_0267__parmvalue114516574229"><b>csv</b></span>.</li></ul>
</td>
</tr>
<tr id="dli_08_0267__en-us_topic_0132788972_row31475311352"><td class="cellrowborder" valign="top" width="11.717171717171718%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0267__en-us_topic_0132788972_p141471432358">ak</p>
</td>
<td class="cellrowborder" valign="top" width="8.191919191919192%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0267__en-us_topic_0132788972_p0830143312271">No</p>
</td>
<td class="cellrowborder" valign="top" width="80.09090909090911%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0267__p1648626115511">Access key. This parameter is mandatory when data is exported to OBS. Global variables can be used to mask the access key used for OBS authentication.</p>
</td>
</tr>
<tr id="dli_08_0267__en-us_topic_0132788972_row11643425191314"><td class="cellrowborder" valign="top" width="11.717171717171718%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0267__en-us_topic_0132788972_p870925212283">sk</p>
</td>
<td class="cellrowborder" valign="top" width="8.191919191919192%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0267__en-us_topic_0132788972_p170985214289">No</p>
</td>
<td class="cellrowborder" valign="top" width="80.09090909090911%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0267__en-us_topic_0132788972_p12709115262819">Secret access key. This parameter is mandatory when data is exported to OBS. Global variables can be used to mask the secret access key used for OBS authentication.</p>
</td>
</tr>
<tr id="dli_08_0267__row2865834518"><td class="cellrowborder" valign="top" width="11.717171717171718%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0267__p28175804517">krb_auth</p>
</td>
<td class="cellrowborder" valign="top" width="8.191919191919192%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0267__p18165812455">No</p>
</td>
<td class="cellrowborder" valign="top" width="80.09090909090911%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0267__dli_08_0254_p198893535164">Name of the authentication created for the datasource connection. This parameter is mandatory when Kerberos authentication is enabled. If Kerberos authentication is not enabled for the created MRS cluster, ensure that the <span class="filepath" id="dli_08_0267__filepath3350165318235"><b>/etc/hosts</b></span> information of the master node in the MRS cluster is added to the host file of the DLI queue.</p>
</td>
</tr>
<tr id="dli_08_0267__row2594112819436"><td class="cellrowborder" valign="top" width="11.717171717171718%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0267__p1659542819431">field_delimiter</p>
</td>
<td class="cellrowborder" valign="top" width="8.191919191919192%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0267__p16595152814312">No</p>
</td>
<td class="cellrowborder" valign="top" width="80.09090909090911%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0267__dli_08_0241_en-us_topic_0111501792_p1236115284365">Delimiter used to separate adjacent attributes.</p>
<p id="dli_08_0267__p970724135820">This parameter must be configured if the CSV encoding format is used. You can define the delimiter yourself, for example, a comma (<span class="parmvalue" id="dli_08_0267__parmvalue27410203249"><b>,</b></span>).</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="dli_08_0267__section9685114274916"><h4 class="sectiontitle">Precautions</h4><ul id="dli_08_0267__ul11975143211011"><li id="dli_08_0267__li1197515321409">To ensure job consistency, enable checkpointing if the Flink job uses the file system sink stream.</li><li id="dli_08_0267__li043725484619">To avoid data loss or overwritten data, enable automatic restart upon job exceptions or restart the job manually, and enable <span class="parmvalue" id="dli_08_0267__parmvalue4107131573"><b>Restore Job from Checkpoint</b></span>.</li><li id="dli_08_0267__li119763321308">Set the checkpoint interval, for example, to 10 minutes, after weighing how quickly output files need to be available against file size and recovery time.</li><li id="dli_08_0267__li1322612105204">Two checkpoint modes are supported:<ul id="dli_08_0267__ul1022611017200"><li id="dli_08_0267__li1722611105208"><strong id="dli_08_0267__b12668195018129">At least once</strong>: Events are processed at least once.</li><li id="dli_08_0267__li02269102205"><strong id="dli_08_0267__b1791994713128">Exactly once</strong>: Events are processed exactly once.</li></ul>
</li><li id="dli_08_0267__li16872110151417">When you use file system sink streams to write data to OBS, do not let multiple jobs write to the same directory.<ul id="dli_08_0267__ul14493922131411"><li id="dli_08_0267__li8180113681212">The default behavior of an OBS bucket is overwriting, which may cause data loss.</li><li id="dli_08_0267__li1443871510144">The default behavior of an OBS parallel file system bucket is appending, which may cause data to be mixed up.</li></ul>
<p id="dli_08_0267__p0269152719148">Because of these behavior differences, select the OBS bucket type carefully. Data exceptions may occur after a job restarts abnormally.</p>
</li></ul>
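<p>The directory rule above can be illustrated with a minimal sketch in which each job writes to a directory of its own. The stream names <strong>car_infos_job_a</strong> and <strong>car_infos_job_b</strong> and the two directory names are hypothetical; the bucket name and the <strong>{{myAk}}</strong> and <strong>{{mySk}}</strong> global variables follow the examples in this topic.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><table class="highlighttable"><tr><td class="code"><div><pre>-- Job A (hypothetical): the only job that writes to car_infos_job_a
create sink stream car_infos_job_a (
  carId string,
  carOwner string,
  average_speed double,
  buyday string
) partitioned by (buyday)
with (
  type = "filesystem",
  file.path = "obs://obs-sink/car_infos_job_a",
  encode = "parquet",
  ak = "{{myAk}}",
  sk = "{{mySk}}"
);

-- Job B (hypothetical): defined in a separate job, with a separate directory
create sink stream car_infos_job_b (
  carId string,
  carOwner string,
  average_speed double,
  buyday string
) partitioned by (buyday)
with (
  type = "filesystem",
  file.path = "obs://obs-sink/car_infos_job_b",
  encode = "parquet",
  ak = "{{myAk}}",
  sk = "{{mySk}}"
);
</pre></div></td></tr></table></div></div>
<p>Keeping one writer per directory avoids both the overwrite behavior of OBS buckets and the append behavior of parallel file system buckets described above.</p>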
</div>
<div class="section" id="dli_08_0267__section11762174112291"><a name="dli_08_0267__section11762174112291"></a><a name="section11762174112291"></a><h4 class="sectiontitle">HDFS Proxy User Configuration</h4><ol id="dli_08_0267__ol19981205610511"><li id="dli_08_0267__li12981856959">Log in to the MRS management page.</li><li id="dli_08_0267__li1398185612513">Select the HDFS NameNode configuration of MRS and add configuration parameters in the <strong id="dli_08_0267__b795932142518">Customization</strong> area.<p id="dli_08_0267__p203271141977">Add the <strong id="dli_08_0267__b840664718268">core-site</strong> parameters <span class="parmvalue" id="dli_08_0267__parmvalue121441619264"><b>hadoop.proxyuser.myname.hosts</b></span> and <span class="parmvalue" id="dli_08_0267__parmvalue1160161452614"><b>hadoop.proxyuser.myname.groups</b></span>, where <strong id="dli_08_0267__b9404151192513">myname</strong> is the name of the Kerberos authentication user.</p>
<div class="note" id="dli_08_0267__note1116365616225"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="dli_08_0267__p8172256182220">Ensure that the permission on the HDFS data write path is <strong id="dli_08_0267__b18352513112713">777</strong>.</p>
</div></div>
</li><li id="dli_08_0267__li15537224142113">After the configuration is complete, click <span class="uicontrol" id="dli_08_0267__uicontrol358122514279"><b>Save</b></span>.</li></ol>
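<p>As a rough illustration of step 2, the two <strong>core-site</strong> parameters might be entered as shown below. The wildcard value <strong>*</strong>, which allows all hosts and all groups, is an assumption made for this sketch and is not taken from this topic; <strong>myname</strong> is a placeholder for your Kerberos authentication user. Use narrower values if your security policy requires them.</p>
<pre>hadoop.proxyuser.myname.hosts = *
hadoop.proxyuser.myname.groups = *
</pre>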
</div>
<div class="section" id="dli_08_0267__section0980251175613"><h4 class="sectiontitle">Example</h4><ul id="dli_08_0267__ul1973134412246"><li id="dli_08_0267__li137311744192412">Example 1:<p id="dli_08_0267__p25191546121310"><a name="dli_08_0267__li137311744192412"></a><a name="li137311744192412"></a>The following example dumps the <strong id="dli_08_0267__b649865155613">car_infos</strong> data to OBS, with the <strong id="dli_08_0267__b195041451165616">buyday</strong> field as the partition field and <strong id="dli_08_0267__b1150455125619">parquet</strong> as the encoding format.</p>
<div class="codecoloring" codetype="Sql" id="dli_08_0267__screen2092424014338"><div class="highlight"><table class="highlighttable"><tr><td class="code"><div><pre><span></span><span class="k">create</span><span class="w"> </span><span class="n">sink</span><span class="w"> </span><span class="n">stream</span><span class="w"> </span><span class="n">car_infos</span><span class="w"> </span><span class="p">(</span>
<span class="w"> </span><span class="n">carId</span><span class="w"> </span><span class="n">string</span><span class="p">,</span>
<span class="w"> </span><span class="n">carOwner</span><span class="w"> </span><span class="n">string</span><span class="p">,</span>
<span class="w"> </span><span class="n">average_speed</span><span class="w"> </span><span class="n">double</span><span class="p">,</span>
<span class="w"> </span><span class="n">buyday</span><span class="w"> </span><span class="n">string</span>
<span class="w"> </span><span class="p">)</span><span class="w"> </span><span class="n">partitioned</span><span class="w"> </span><span class="k">by</span><span class="w"> </span><span class="p">(</span><span class="n">buyday</span><span class="p">)</span>
<span class="w"> </span><span class="k">with</span><span class="w"> </span><span class="p">(</span>
<span class="w"> </span><span class="k">type</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"filesystem"</span><span class="p">,</span>
<span class="w"> </span><span class="n">file</span><span class="p">.</span><span class="n">path</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"obs://obs-sink/car_infos"</span><span class="p">,</span>
<span class="w"> </span><span class="n">encode</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"parquet"</span><span class="p">,</span>
<span class="w"> </span><span class="n">ak</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"{{myAk}}"</span><span class="p">,</span>
<span class="w"> </span><span class="n">sk</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"{{mySk}}"</span>
<span class="p">);</span>
</pre></div></td></tr></table></div>
</div>
<p id="dli_08_0267__p11128121811318">The data is ultimately stored in OBS. Directory: <strong id="dli_08_0267__b455318118574">obs://obs-sink/car_infos/buyday=xx/part-x-x</strong>.</p>
<p id="dli_08_0267__p61281918111319">After the data is generated, you can create an OBS partitioned table for subsequent batch processing by running the following SQL statements:</p>
<ol id="dli_08_0267__ol1017919178"><li id="dli_08_0267__li121781131717">Create an OBS partitioned table. <div class="codecoloring" codetype="Sql" id="dli_08_0267__screen127605179171"><div class="highlight"><table class="highlighttable"><tr><td class="code"><div><pre><span></span><span class="k">create</span><span class="w"> </span><span class="k">table</span><span class="w"> </span><span class="n">car_infos</span><span class="w"> </span><span class="p">(</span>
<span class="w"> </span><span class="n">carId</span><span class="w"> </span><span class="n">string</span><span class="p">,</span>
<span class="w"> </span><span class="n">carOwner</span><span class="w"> </span><span class="n">string</span><span class="p">,</span>
<span class="w"> </span><span class="n">average_speed</span><span class="w"> </span><span class="n">double</span>
<span class="p">)</span>
<span class="w"> </span><span class="n">partitioned</span><span class="w"> </span><span class="k">by</span><span class="w"> </span><span class="p">(</span><span class="n">buyday</span><span class="w"> </span><span class="n">string</span><span class="p">)</span>
<span class="w"> </span><span class="n">stored</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">parquet</span>
<span class="w"> </span><span class="k">location</span><span class="w"> </span><span class="s1">'obs://obs-sink/car_infos'</span><span class="p">;</span>
</pre></div></td></tr></table></div>
</div>
</li><li id="dli_08_0267__li1185684751719">Restore partition information from the associated OBS path.<div class="codecoloring" codetype="Sql" id="dli_08_0267__screen165164785020"><div class="highlight"><table class="highlighttable"><tr><td class="code"><div><pre><span></span><span class="k">alter</span><span class="w"> </span><span class="k">table</span><span class="w"> </span><span class="n">car_infos</span><span class="w"> </span><span class="n">recover</span><span class="w"> </span><span class="n">partitions</span><span class="p">;</span>
</pre></div></td></tr></table></div>
</div>
</li></ol>
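<p>Once the partitions are recovered, the table can be queried with DLI SQL like any other partitioned table. The following is a minimal sketch, assuming a partition value of <strong>2018-03-20</strong>, which is hypothetical; substitute a <strong>buyday</strong> value that exists in your data.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><table class="highlighttable"><tr><td class="code"><div><pre>-- Filter on the partition column buyday (the value is hypothetical)
select carId, carOwner, average_speed
from car_infos
where buyday = '2018-03-20';
</pre></div></td></tr></table></div></div>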
</li><li id="dli_08_0267__li17823303259">Example 2:<p id="dli_08_0267__p2095630172511"><a name="dli_08_0267__li17823303259"></a><a name="li17823303259"></a>The following example dumps the <strong id="dli_08_0267__b17978136135713">car_infos</strong> data to HDFS, with the <strong id="dli_08_0267__b139841664576">buyday</strong> field as the partition field and <strong id="dli_08_0267__b1984126145710">csv</strong> as the encoding format.</p>
<div class="codecoloring" codetype="Sql" id="dli_08_0267__screen995330202511"><div class="highlight"><table class="highlighttable"><tr><td class="code"><div><pre><span></span><span class="k">create</span><span class="w"> </span><span class="n">sink</span><span class="w"> </span><span class="n">stream</span><span class="w"> </span><span class="n">car_infos</span><span class="w"> </span><span class="p">(</span>
<span class="w"> </span><span class="n">carId</span><span class="w"> </span><span class="n">string</span><span class="p">,</span>
<span class="w"> </span><span class="n">carOwner</span><span class="w"> </span><span class="n">string</span><span class="p">,</span>
<span class="w"> </span><span class="n">average_speed</span><span class="w"> </span><span class="n">double</span><span class="p">,</span>
<span class="w"> </span><span class="n">buyday</span><span class="w"> </span><span class="n">string</span>
<span class="w"> </span><span class="p">)</span><span class="w"> </span><span class="n">partitioned</span><span class="w"> </span><span class="k">by</span><span class="w"> </span><span class="p">(</span><span class="n">buyday</span><span class="p">)</span>
<span class="w"> </span><span class="k">with</span><span class="w"> </span><span class="p">(</span>
<span class="w"> </span><span class="k">type</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"filesystem"</span><span class="p">,</span>
<span class="w"> </span><span class="n">file</span><span class="p">.</span><span class="n">path</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"hdfs://node-master1sYAx:9820/user/car_infos"</span><span class="p">,</span>
<span class="w"> </span><span class="n">encode</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"csv"</span><span class="p">,</span>
<span class="w"> </span><span class="n">field_delimiter</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">","</span>
<span class="p">);</span>
</pre></div></td></tr></table></div>
</div>
<p id="dli_08_0267__p199583052516">The data is ultimately stored in HDFS. Directory: <strong id="dli_08_0267__b81712119579">/user/car_infos/buyday=xx/part-x-x</strong>.</p>
</li></ul>
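<p>When Kerberos authentication is enabled for the MRS cluster, the <strong>krb_auth</strong> parameter from Table 1 is added to the <strong>with</strong> clause. The following is a minimal sketch rather than a verified configuration: the stream name <strong>car_infos_krb</strong> and the authentication name <strong>myKrbAuth</strong> are hypothetical, and the NameNode address is reused from Example 2.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><table class="highlighttable"><tr><td class="code"><div><pre>-- HDFS sink in parquet format with Kerberos authentication (names are hypothetical)
create sink stream car_infos_krb (
  carId string,
  carOwner string,
  average_speed double,
  buyday string
) partitioned by (buyday)
with (
  type = "filesystem",
  file.path = "hdfs://node-master1sYAx:9820/user/car_infos",
  encode = "parquet",
  krb_auth = "myKrbAuth"
);
</pre></div></td></tr></table></div></div>
<p>Because <strong>encode</strong> is <strong>parquet</strong> here, <strong>field_delimiter</strong> is omitted; it applies only to the CSV encoding format.</p>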
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dli_08_0240.html">Creating a Sink Stream</a></div>
</div>
</div>