Yang, Tong 6182f91ba8 MRS component operation guide_normal 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-09 14:55:21 +00:00


<a name="mrs_01_0953"></a>
<h1 class="topictitle1">Using HDFS Colocation to Store Hive Tables</h1>
<div id="body1590395281695"><div class="section" id="mrs_01_0953__section285118333383"><h4 class="sectiontitle">Scenario</h4><p id="mrs_01_0953__p14851203353817">HDFS Colocation is a data placement control function provided by HDFS. The HDFS Colocation API stores associated data, or data on which associated operations are performed, on the same storage node. Hive supports HDFS Colocation: if locator information is set for a table when the table is created, data files of related tables are stored on the same storage node when data is inserted into the tables using the INSERT statement (other data import modes are not supported). This makes computation across associated tables more efficient. Only the TextFile and RCFile table formats are supported.</p>
<div class="note" id="mrs_01_0953__note145732815399"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="mrs_01_0953__p154806338390">This section applies to MRS 3.<em id="mrs_01_0953__i87225899252241">x</em> or later.</p>
</div></div>
</div>
<div class="section" id="mrs_01_0953__section6602102914387"><h4 class="sectiontitle">Procedure</h4><ol id="mrs_01_0953__ol08524339383"><li id="mrs_01_0953__li199441911396"><span>Log in to the node where the client is installed as the client installation user.</span></li><li id="mrs_01_0953__l8276fb953ede48e3bcae37889eabd834"><span>Run the following command to switch to the client installation directory, for example, <span class="filepath" id="mrs_01_0953__filepath279514561043"><b>/opt/client</b></span>:</span><p><p id="mrs_01_0953__a7734bbc033c24c0e96a22957af2f367f"><strong id="mrs_01_0953__af1de2390d4ec4df6b0e28574d3329dff">cd /opt/client</strong></p>
</p></li><li id="mrs_01_0953__l5597fd376c554a00a433496294b4b278"><span>Run the following command to configure environment variables:</span><p><p id="mrs_01_0953__a988e7b87692b4a528f97f0da18a8f07e"><strong id="mrs_01_0953__a32ecf708d4cb4e4d9c017de49ec5b482">source bigdata_env</strong></p>
</p></li><li id="mrs_01_0953__ld661d1bcf5d74879a7d849cd7c0c1385"><span>If the cluster is in security mode, run the following command to authenticate the user:</span><p><p id="mrs_01_0953__p106368154427"><strong id="mrs_01_0953__b71613283715">kinit</strong> <em id="mrs_01_0953__i32118283714">MRS username</em></p>
</p></li><li id="mrs_01_0953__li1085212333384"><span>Create the <em id="mrs_01_0953__i198521433153810">groupid</em> through the HDFS API.</span><p><p class="litext" id="mrs_01_0953__p148527338389"><strong id="mrs_01_0953__b5852153318382">hdfs colocationadmin -createGroup -groupId <em id="mrs_01_0953__i7852173318386">&lt;groupid&gt;</em> -locatorIds <em id="mrs_01_0953__i11852163383811">&lt;locatorid1&gt;</em>,<em id="mrs_01_0953__i19852533123818">&lt;locatorid2&gt;</em>,<em id="mrs_01_0953__i58521133143812">&lt;locatorid3&gt;</em></strong></p>
<div class="note" id="mrs_01_0953__note13852123318389"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p class="text" id="mrs_01_0953__p685214333389">In the preceding command, <em id="mrs_01_0953__i85910143588">&lt;groupid&gt;</em> indicates the name of the created group. The group created in this example contains three locators. You can define the number of locators as required.</p>
<p class="text" id="mrs_01_0953__p188529333385">For details about group ID creation and HDFS Colocation, see HDFS description.</p>
</div></div>
</p></li><li id="mrs_01_0953__li1434310384368"><span>Run the following command to log in to the Hive client:</span><p><p id="mrs_01_0953__p1345154183714"><strong id="mrs_01_0953__b4301198183717">beeline</strong></p>
</p></li><li id="mrs_01_0953__li1085213317382"><span>Enable Hive to use colocation.</span><p><p id="mrs_01_0953__p98523337383">Assume that <strong id="mrs_01_0953__b19613122701">table_name1</strong> and <strong id="mrs_01_0953__b149371572008">table_name2</strong> are associated with each other. Run the following statements to create them:</p>
<p id="mrs_01_0953__p585214337380"><strong id="mrs_01_0953__b8852163319385">CREATE TABLE <em id="mrs_01_0953__i18852193320382">&lt;[db_name.]table_name1&gt; [(col_name data_type, ...)]</em> [ROW FORMAT <em id="mrs_01_0953__i285210331383">&lt;row_format&gt;</em>] [STORED AS <em id="mrs_01_0953__i1485243313815">&lt;file_format&gt;</em>] TBLPROPERTIES("groupId"="<em id="mrs_01_0953__i6852133163810">&lt;group&gt;</em>","locatorId"="<em id="mrs_01_0953__i17852163313813">&lt;locator1&gt;</em>");</strong></p>
<p id="mrs_01_0953__p168524337382"><strong id="mrs_01_0953__b14852183319387">CREATE TABLE <em id="mrs_01_0953__i208521733193810">&lt;[db_name.]table_name2&gt; [(col_name data_type, ...)]</em> [ROW FORMAT <em id="mrs_01_0953__i20852193312382">&lt;row_format&gt;</em>] [STORED AS <em id="mrs_01_0953__i13852333193810">&lt;file_format&gt;</em>] TBLPROPERTIES("groupId"="<em id="mrs_01_0953__i1785233318382">&lt;group&gt;</em>","locatorId"="<em id="mrs_01_0953__i1685293313381">&lt;locator1&gt;</em>");</strong></p>
<p id="mrs_01_0953__p4852183383813">After data is inserted into <strong id="mrs_01_0953__b87004435017">table_name1</strong> and <strong id="mrs_01_0953__b1970011436020">table_name2</strong> using the INSERT statement, the data files of <strong id="mrs_01_0953__b118712551106">table_name1</strong> and <strong id="mrs_01_0953__b1987855709">table_name2</strong> are stored in the same location in HDFS, which facilitates associated operations between the two tables.</p>
</p></li></ol>
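<p>The group creation and table creation steps above can be sketched as a small shell script. This is only an illustrative dry run under stated assumptions: the group name <strong>group1</strong>, the locator names, the column definitions, and the file name <strong>colocation_tables.sql</strong> are hypothetical example values, not names defined by MRS. The script prints the <strong>hdfs colocationadmin</strong> command for review and writes the two CREATE TABLE statements to a file; it does not contact the cluster.</p>

```shell
#!/bin/sh
# Hypothetical example values; replace them with names that suit your cluster.
GROUP_ID="group1"
LOCATOR_IDS="locator1,locator2,locator3"
TABLE_LOCATOR="locator1"   # both tables use the same locator so their files are colocated
SQL_FILE="colocation_tables.sql"

# Print the group creation command for review; run it manually once it looks right.
echo "hdfs colocationadmin -createGroup -groupId ${GROUP_ID} -locatorIds ${LOCATOR_IDS}"

# Emit one CREATE TABLE statement that attaches the group and locator to a table.
# TEXTFILE is used because only TextFile and RCFile are supported for colocation.
# The columns (id, name) are placeholders chosen for this example.
create_table_sql() {
  printf 'CREATE TABLE %s (id INT, name STRING) STORED AS TEXTFILE TBLPROPERTIES("groupId"="%s","locatorId"="%s");\n' \
    "$1" "${GROUP_ID}" "${TABLE_LOCATOR}"
}

# Write the statements for the two associated tables to a file.
{
  create_table_sql "table_name1"
  create_table_sql "table_name2"
} > "${SQL_FILE}"

# Show the generated statements so they can be checked before use.
cat "${SQL_FILE}"
```

<p>The generated statements can be pasted into the <strong>beeline</strong> session from the procedure. Passing the file with <strong>beeline -f colocation_tables.sql</strong> may also work, assuming the client's beeline wrapper accepts the standard Beeline <strong>-f</strong> option.</p>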
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_0581.html">Using Hive</a></div>
</div>
</div>