Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: Su, Xiaomeng <suxiaomeng1@huawei.com> Co-committed-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
<a name="dli_08_0236"></a><a name="dli_08_0236"></a>
<h1 class="topictitle1">OBS Source Stream</h1>
<div id="body1574393066552"><div class="section" id="dli_08_0236__en-us_topic_0111499972_section1826712033217"><h4 class="sectiontitle">Function</h4><p id="dli_08_0236__en-us_topic_0111499972_p54453695817">Create a source stream to obtain data from OBS. DLI reads data that users store in OBS and uses it as input data for jobs. OBS applies to various scenarios, such as big data analysis, cloud-native application data, static website hosting, backup/active archive, and deep/cold archive.</p>
<p id="dli_08_0236__en-us_topic_0111499972_p316664317163">OBS is an object-based storage service that provides massive, secure, highly reliable, and low-cost data storage. For more information about OBS, see the <em id="dli_08_0236__i13824594127">Object Storage Service Console Operation Guide</em>.</p>
</div>
<div class="section" id="dli_08_0236__en-us_topic_0111499972_section12289142974811"><h4 class="sectiontitle">Syntax</h4><div class="codecoloring" codetype="Sql" id="dli_08_0236__screen29731240713"><div class="highlight"><pre><span></span><span class="k">CREATE</span><span class="w"> </span><span class="k">SOURCE</span><span class="w"> </span><span class="n">STREAM</span><span class="w"> </span><span class="n">stream_id</span><span class="w"> </span><span class="p">(</span><span class="n">attr_name</span><span class="w"> </span><span class="n">attr_type</span><span class="w"> </span><span class="p">(</span><span class="s1">','</span><span class="w"> </span><span class="n">attr_name</span><span class="w"> </span><span class="n">attr_type</span><span class="p">)</span><span class="o">*</span><span class="w"> </span><span class="p">)</span>
<span class="w">  </span><span class="k">WITH</span><span class="w"> </span><span class="p">(</span>
<span class="w">    </span><span class="k">type</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"obs"</span><span class="p">,</span>
<span class="w">    </span><span class="n">region</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">""</span><span class="p">,</span>
<span class="w">    </span><span class="n">bucket</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">""</span><span class="p">,</span>
<span class="w">    </span><span class="n">object_name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">""</span><span class="p">,</span>
<span class="w">    </span><span class="n">row_delimiter</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"\n"</span><span class="p">,</span>
<span class="w">    </span><span class="n">field_delimiter</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">''</span><span class="p">,</span>
<span class="w">    </span><span class="n">version_id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">""</span>
<span class="w">  </span><span class="p">);</span>
</pre></div>
</div>
</div>
<div class="section" id="dli_08_0236__section3460151113206"><h4 class="sectiontitle">Keywords</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="dli_08_0236__en-us_topic_0111499972_table1413253055920" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Keywords</caption><thead align="left"><tr id="dli_08_0236__en-us_topic_0111499972_row121331730155920"><th align="left" class="cellrowborder" valign="top" width="16.220000000000002%" id="mcps1.3.3.2.2.4.1.1"><p id="dli_08_0236__en-us_topic_0111499972_p1197243216118">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="10.11%" id="mcps1.3.3.2.2.4.1.2"><p id="dli_08_0236__en-us_topic_0111499972_p197982468486">Mandatory</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="73.67%" id="mcps1.3.3.2.2.4.1.3"><p id="dli_08_0236__en-us_topic_0111499972_p197215321617">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="dli_08_0236__en-us_topic_0111499972_row131331530185911"><td class="cellrowborder" valign="top" width="16.220000000000002%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0236__en-us_topic_0111499972_p6133133075914">type</p>
</td>
<td class="cellrowborder" valign="top" width="10.11%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0236__en-us_topic_0111499972_p1679844612486">Yes</p>
</td>
<td class="cellrowborder" valign="top" width="73.67%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0236__en-us_topic_0111499972_p9133123016590">Data source type. <span class="parmvalue" id="dli_08_0236__en-us_topic_0111499972_parmvalue17630141819505"><b>obs</b></span> indicates that the data source is OBS.</p>
</td>
</tr>
<tr id="dli_08_0236__en-us_topic_0111499972_row5133163010597"><td class="cellrowborder" valign="top" width="16.220000000000002%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0236__en-us_topic_0111499972_p16133183085915">region</p>
</td>
<td class="cellrowborder" valign="top" width="10.11%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0236__en-us_topic_0111499972_p17989462487">Yes</p>
</td>
<td class="cellrowborder" valign="top" width="73.67%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0236__en-us_topic_0111499972_p11134730115915">Region where the OBS bucket is located.</p>
</td>
</tr>
<tr id="dli_08_0236__row1840211211197"><td class="cellrowborder" valign="top" width="16.220000000000002%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0236__p174021821590">encode</p>
</td>
<td class="cellrowborder" valign="top" width="10.11%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0236__p1440342114916">No</p>
</td>
<td class="cellrowborder" valign="top" width="73.67%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0236__p134031221192">Data encoding format. The value can be <strong id="dli_08_0236__b108426355497">csv</strong> or <strong id="dli_08_0236__b1755816376499">json</strong>. The default value is <strong id="dli_08_0236__b98263994917">csv</strong>.</p>
</td>
</tr>
<tr id="dli_08_0236__row116911252123912"><td class="cellrowborder" valign="top" width="16.220000000000002%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0236__p111537416387">ak</p>
</td>
<td class="cellrowborder" valign="top" width="10.11%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0236__p161531941193814">No</p>
</td>
<td class="cellrowborder" valign="top" width="73.67%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0236__p141531741143816">Access Key ID (AK).</p>
</td>
</tr>
<tr id="dli_08_0236__row158911848183912"><td class="cellrowborder" valign="top" width="16.220000000000002%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0236__p58994360384">sk</p>
</td>
<td class="cellrowborder" valign="top" width="10.11%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0236__p20899163615388">No</p>
</td>
<td class="cellrowborder" valign="top" width="73.67%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0236__p1989953683812">Secret Access Key (SK), used together with the access key ID.</p>
</td>
</tr>
<tr id="dli_08_0236__en-us_topic_0111499972_row1613423019594"><td class="cellrowborder" valign="top" width="16.220000000000002%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0236__en-us_topic_0111499972_p1513412308596">bucket</p>
</td>
<td class="cellrowborder" valign="top" width="10.11%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0236__en-us_topic_0111499972_p1479814460485">Yes</p>
</td>
<td class="cellrowborder" valign="top" width="73.67%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0236__en-us_topic_0111499972_p1113419301597">Name of the OBS bucket where data is located.</p>
</td>
</tr>
<tr id="dli_08_0236__en-us_topic_0111499972_row613411309593"><td class="cellrowborder" valign="top" width="16.220000000000002%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0236__en-us_topic_0111499972_p13134143095918">object_name</p>
</td>
<td class="cellrowborder" valign="top" width="10.11%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0236__en-us_topic_0111499972_p1798164624815">Yes</p>
</td>
<td class="cellrowborder" valign="top" width="73.67%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0236__en-us_topic_0111499972_p1813473085918">Name of the object in the OBS bucket that stores the data. If the object is not in the root directory of the bucket, include the folder name, for example, <strong id="dli_08_0236__b1821141324713">test/test.csv</strong>. For the object file format, see the <strong id="dli_08_0236__b7158145734915">encode</strong> parameter.</p>
</td>
</tr>
<tr id="dli_08_0236__en-us_topic_0111499972_row1813483017591"><td class="cellrowborder" valign="top" width="16.220000000000002%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0236__en-us_topic_0111499972_p013443005912">row_delimiter</p>
</td>
<td class="cellrowborder" valign="top" width="10.11%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0236__en-us_topic_0111499972_p0799546134819">Yes</p>
</td>
<td class="cellrowborder" valign="top" width="73.67%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0236__en-us_topic_0111499972_p11341330205912">Delimiter that separates rows.</p>
</td>
</tr>
<tr id="dli_08_0236__en-us_topic_0111499972_row171341830155912"><td class="cellrowborder" valign="top" width="16.220000000000002%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0236__en-us_topic_0111499972_p191346301591">field_delimiter</p>
</td>
<td class="cellrowborder" valign="top" width="10.11%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0236__en-us_topic_0111499972_p1879924612485">No</p>
</td>
<td class="cellrowborder" valign="top" width="73.67%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0236__en-us_topic_0111499972_p1013453013593">Delimiter that separates attributes.</p>
<ul id="dli_08_0236__ul139595215139"><li id="dli_08_0236__li18395652131311">This parameter is mandatory when <strong id="dli_08_0236__b1760819245218">encode</strong> is <strong id="dli_08_0236__b7410184195212">csv</strong>. You can specify a custom attribute delimiter.</li><li id="dli_08_0236__li9850195851317">If <strong id="dli_08_0236__b19483461521">encode</strong> is <strong id="dli_08_0236__b494754810521">json</strong>, you do not need to set this parameter.</li></ul>
</td>
</tr>
<tr id="dli_08_0236__en-us_topic_0111499972_row965315310371"><td class="cellrowborder" valign="top" width="16.220000000000002%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0236__en-us_topic_0111499972_p14728918132620">quote</p>
</td>
<td class="cellrowborder" valign="top" width="10.11%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0236__en-us_topic_0111499972_p8730131818266">No</p>
</td>
<td class="cellrowborder" valign="top" width="73.67%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0236__en-us_topic_0111499972_p397121416310">Quote character used in the data. Attribute delimiters between two quote characters are treated as ordinary characters.</p>
<ul id="dli_08_0236__en-us_topic_0111499972_ul18631537631"><li id="dli_08_0236__en-us_topic_0111499972_li10631173715314">If double quotation marks are used as the quote character, set this parameter to <strong id="dli_08_0236__b6876151387">\u005c\u0022</strong> (the Unicode escapes for a backslash and a double quotation mark).</li><li id="dli_08_0236__en-us_topic_0111499972_li1963112378318">If a single quotation mark is used as the quote character, set this parameter to a single quotation mark (').</li></ul>
<div class="note" id="dli_08_0236__en-us_topic_0111499972_note1377013361642"><span class="notetitle"> NOTE: </span><div class="notebody"><ul id="dli_08_0236__ul10869184631910"><li id="dli_08_0236__li13869104610196">Currently, this parameter is supported only for the CSV format.</li><li id="dli_08_0236__li98691469198">After this parameter is specified, ensure that each field contains either no quote characters or an even number of them. Otherwise, parsing will fail.</li></ul>
</div></div>
</td>
</tr>
<tr id="dli_08_0236__en-us_topic_0111499972_row71341630165914"><td class="cellrowborder" valign="top" width="16.220000000000002%" headers="mcps1.3.3.2.2.4.1.1 "><p id="dli_08_0236__en-us_topic_0111499972_p16135193045918">version_id</p>
</td>
<td class="cellrowborder" valign="top" width="10.11%" headers="mcps1.3.3.2.2.4.1.2 "><p id="dli_08_0236__en-us_topic_0111499972_p2799184664816">No</p>
</td>
<td class="cellrowborder" valign="top" width="73.67%" headers="mcps1.3.3.2.2.4.1.3 "><p id="dli_08_0236__en-us_topic_0111499972_p8135113014596">Version number. Set this parameter only when the OBS bucket or object has version settings; otherwise, omit it.</p>
</td>
</tr>
</tbody>
</table>
</div>
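<p>The <strong>quote</strong> behavior described in Table 1 can be sketched as follows. This is a minimal, illustrative example only: the file name <strong>input_quoted.csv</strong>, the stream name <strong>car_infos_quoted</strong>, and the sample row are assumptions made for this sketch. Suppose the file contains a row whose owner field includes a comma and is enclosed in double quotation marks:</p>
<pre class="screen">1,"Smith, John",bmw,30000,1403149534</pre>
<p>With <strong>quote</strong> set to <strong>\u005c\u0022</strong>, the comma between the two quotation marks is expected to be treated as an ordinary character rather than as a field delimiter, so the row still maps to five attributes:</p>
<pre class="screen">CREATE SOURCE STREAM car_infos_quoted (
  car_id STRING,
  car_owner STRING,
  car_brand STRING,
  car_price INT,
  car_timestamp LONG
)
  WITH (
    type = "obs",
    bucket = "dli-test-obs01",
    region = "xxx",
    object_name = "input_quoted.csv",
    row_delimiter = "\n",
    field_delimiter = ",",
    quote = "\u005c\u0022"
);</pre>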
</div>
<div class="section" id="dli_08_0236__section159654482015"><h4 class="sectiontitle">Precautions</h4><p id="dli_08_0236__p11761959195610">When creating a source stream, you can specify a time model for subsequent calculation. Currently, DLI supports two time models: Processing Time and Event Time. For details about the syntax, see <a href="dli_08_0107.html">Configuring Time Models</a>.</p>
</div>
<div class="section" id="dli_08_0236__section18363191082110"><h4 class="sectiontitle">Example</h4><ul id="dli_08_0236__ul155146132167"><li id="dli_08_0236__li135151613111615">The <strong id="dli_08_0236__en-us_topic_0111499972_b842352706192812">input.csv</strong> file is read from the OBS bucket. Rows are separated by <strong id="dli_08_0236__en-us_topic_0111499972_b842352706192820">'\n'</strong> and columns are separated by <strong id="dli_08_0236__en-us_topic_0111499972_b842352706192825">','</strong>.<p id="dli_08_0236__p1830410525560">To use the test data, create an <strong id="dli_08_0236__b939171654710">input.txt</strong> file, paste the following text data into it, and save the file as <strong id="dli_08_0236__b1039131624717">input.csv</strong>. Then upload the <strong id="dli_08_0236__b19618141854710">input.csv</strong> file to the target OBS bucket, for example, the <strong id="dli_08_0236__b12876141924717">dli-test-obs01</strong> bucket.</p>
<pre class="screen" id="dli_08_0236__screen16550578564">1,2,3,4,1403149534
5,6,7,8,1403149535</pre>
<div class="p" id="dli_08_0236__p732379155814">The following example creates the source stream:<div class="codecoloring" codetype="Sql" id="dli_08_0236__screen1389031912569"><div class="highlight"><pre><span></span><span class="k">CREATE</span><span class="w"> </span><span class="k">SOURCE</span><span class="w"> </span><span class="n">STREAM</span><span class="w"> </span><span class="n">car_infos</span><span class="w"> </span><span class="p">(</span>
<span class="w">  </span><span class="n">car_id</span><span class="w"> </span><span class="n">STRING</span><span class="p">,</span>
<span class="w">  </span><span class="n">car_owner</span><span class="w"> </span><span class="n">STRING</span><span class="p">,</span>
<span class="w">  </span><span class="n">car_brand</span><span class="w"> </span><span class="n">STRING</span><span class="p">,</span>
<span class="w">  </span><span class="n">car_price</span><span class="w"> </span><span class="nb">INT</span><span class="p">,</span>
<span class="w">  </span><span class="n">car_timestamp</span><span class="w"> </span><span class="n">LONG</span>
<span class="p">)</span>
<span class="w">  </span><span class="k">WITH</span><span class="w"> </span><span class="p">(</span>
<span class="w">    </span><span class="k">type</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"obs"</span><span class="p">,</span>
<span class="w">    </span><span class="n">bucket</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"dli-test-obs01"</span><span class="p">,</span>
<span class="w">    </span><span class="n">region</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"xxx"</span><span class="p">,</span>
<span class="w">    </span><span class="n">object_name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"input.csv"</span><span class="p">,</span>
<span class="w">    </span><span class="n">row_delimiter</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"\n"</span><span class="p">,</span>
<span class="w">    </span><span class="n">field_delimiter</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">","</span>
<span class="p">);</span>
</pre></div>
</div>
</div>
</li><li id="dli_08_0236__li1866314183164">The <strong id="dli_08_0236__b733412215538">input.json</strong> file is read from the OBS bucket. Rows are separated by <strong id="dli_08_0236__b73355210536">'\n'</strong>.<pre class="screen" id="dli_08_0236__screen198053517179">CREATE SOURCE STREAM obs_source (
  str STRING
)
  WITH (
    type = "obs",
    bucket = "obssource",
    region = "xxx",
    encode = "json",
    row_delimiter = "\n",
    object_name = "input.json"
);</pre>
</li></ul>
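<p>The optional parameters in Table 1 can be combined with the mandatory ones. The following is a minimal sketch, not a verified configuration. It assumes a hypothetical stream name <strong>car_infos_versioned</strong>, an object stored in the <strong>test</strong> folder of the bucket, and a bucket that has version settings. The <strong>region</strong> and <strong>version_id</strong> values shown as <strong>xxx</strong> are placeholders that you must replace with your own values.</p>
<pre class="screen">CREATE SOURCE STREAM car_infos_versioned (
  car_id STRING,
  car_owner STRING,
  car_brand STRING,
  car_price INT,
  car_timestamp LONG
)
  WITH (
    type = "obs",
    bucket = "dli-test-obs01",
    region = "xxx",
    encode = "csv",
    object_name = "test/test.csv",
    row_delimiter = "\n",
    field_delimiter = ",",
    version_id = "xxx"
);</pre>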
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dli_08_0234.html">Creating a Source Stream</a></div>
</div>
</div>