Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
<a name="mrs_01_0388"></a>
<h1 class="topictitle1">Creating a CarbonData Table</h1>
<div id="body1589421630483"><div class="section" id="mrs_01_0388__s04fe989cc3e44e14b13bc7fa1584c13b"><h4 class="sectiontitle">Scenario</h4><p id="mrs_01_0388__abc0f7ac07bbf450ebdb95cfa6e7d515f">Before you can load and query data, you must create a CarbonData table.</p>
</div>
<div class="section" id="mrs_01_0388__s94d4db704b464d6dba4dfc22ebbb6600"><h4 class="sectiontitle">Creating a Table with Self-Defined Columns</h4><p id="mrs_01_0388__a3ad8a7430e104d3a96924a95a69b5847">You can create a table by specifying its columns and data types. In analysis clusters with Kerberos authentication enabled, to create a CarbonData table in a database other than the <span class="parmname" id="mrs_01_0388__parmname63628603155529"><b>default</b></span> database, the <span class="parmvalue" id="mrs_01_0388__parmvalue7727059155710"><b>Create</b></span> permission on that database must first be added to the role bound to the user in Hive role management.</p>
<p id="mrs_01_0388__a6128a99f11294b96b9a8c23fcad1484f">Sample command:</p>
<p id="mrs_01_0388__a78ea7e68051a47a0abd929f3cea5772f"><strong id="mrs_01_0388__aa7db3c41b2544d92945b09e49abde51a">CREATE TABLE</strong> <strong id="mrs_01_0388__a4d8ff79733f54b7db49ce4d1c697a6de">IF NOT EXISTS productdb.productSalesTable (</strong></p>
<p id="mrs_01_0388__a53e91bb824a045cdbaa1bd2f3e3457ab"><strong id="mrs_01_0388__a8cc9a5f3fee14e4bbb2703e7b55bd7c4">productNumber Int,</strong></p>
<p id="mrs_01_0388__a74ed907e598e4a0f81c8ae63b1f80168"><strong id="mrs_01_0388__a6a08035c592d428d8ad00fe66a2cba20">productName String,</strong></p>
<p id="mrs_01_0388__a8fd3ccb92ffb468bbbcee3c25a6f4e52"><strong id="mrs_01_0388__ae190be0a5ae643589f360a75b107bc85">storeCity String,</strong></p>
<p id="mrs_01_0388__a069e4c4958c040579aa8f7e7bde8e7cb"><strong id="mrs_01_0388__ae7a39a73b75c44b1b800f5f4cc008da8">storeProvince String,</strong></p>
<p id="mrs_01_0388__ac5f6a6d1736f431ba59d8ca599d975cc"><strong id="mrs_01_0388__a99c6e3b410af4eb287497c0c1f273a10">revenue Int)</strong></p>
<p id="mrs_01_0388__a960f9d42bcea42f99ce4d4265f121e48"><strong id="mrs_01_0388__a6dcd6cea5cda4dc6af2c8a1d55a97d9e">STORED BY</strong> <em id="mrs_01_0388__aaa0498a660c549d1bba84f1cbb40b3f1">'</em><strong id="mrs_01_0388__ae57778253e314e7db1ce502660235464">org.apache.carbondata.format'</strong></p>
<p id="mrs_01_0388__a265177364848461c998dda2a055cc03b"><strong id="mrs_01_0388__a5ff0f055d3f7416395ff081656a483ae">TBLPROPERTIES (</strong></p>
<p id="mrs_01_0388__af848ae6206c441888bae2d2df2f0119c"><strong id="mrs_01_0388__a6c9c4272f8f045aa822b060888bbb00d">'table_blocksize'='128',</strong></p>
<p id="mrs_01_0388__en-us_topic_0056202763_p527830116372"><strong id="mrs_01_0388__aca53d3e0568c4d5fa2aa0fcc92d616fc">'DICTIONARY_EXCLUDE'='productName',</strong></p>
<p id="mrs_01_0388__ac95538f5670b4c29bcc33cf22acfd3a7"><strong id="mrs_01_0388__adb94bd3d09bc43a08ad5464815964521">'DICTIONARY_INCLUDE'='productNumber');</strong></p>
<p id="mrs_01_0388__en-us_topic_0056202763_p276038216372">The following table describes the parameters used in the preceding command.</p>
</div>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_0388__tdb3ba8cd1bab4fdcb0a0da845c4f5f51" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_0388__r59952bcf5d084fadb741934035455513"><th align="left" class="cellrowborder" valign="top" width="29.32%" id="mcps1.3.3.2.3.1.1"><p id="mrs_01_0388__add7e7f04c2e64ee095458338f1de915c"><strong id="mrs_01_0388__b1212483817394">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="70.67999999999999%" id="mcps1.3.3.2.3.1.2"><p id="mrs_01_0388__a8c45041613794ccc8f5df3510b7a651f"><strong id="mrs_01_0388__ab5884260115a4fc3be7e6de8fcc2d8fa">Description</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_0388__r55a8ee59077444a98114003627556cec"><td class="cellrowborder" valign="top" width="29.32%" headers="mcps1.3.3.2.3.1.1 "><p id="mrs_01_0388__ad66c651acc6d442597cf1604f54183f8">productSalesTable</p>
</td>
<td class="cellrowborder" valign="top" width="70.67999999999999%" headers="mcps1.3.3.2.3.1.2 "><p id="mrs_01_0388__a5f0b744c1b214e069b3229cc592eeee8">Table name. The table is used to load data for analysis.</p>
<p id="mrs_01_0388__a67b8521959c545cc9dc890e876a4970c">The table name consists of letters, digits, and underscores (_).</p>
</td>
</tr>
<tr id="mrs_01_0388__r7dd40895e5de451fa625520247f510e9"><td class="cellrowborder" valign="top" width="29.32%" headers="mcps1.3.3.2.3.1.1 "><p id="mrs_01_0388__ac5fdbe998dc44807b99b66e75a5164c3">productdb</p>
</td>
<td class="cellrowborder" valign="top" width="70.67999999999999%" headers="mcps1.3.3.2.3.1.2 "><p id="mrs_01_0388__ae651ec505a8c41fb9d634294fd7afd13">Database name. The database maintains logical connections with tables stored in it to identify and manage the tables.</p>
<p id="mrs_01_0388__af51a69c7c7e242c2b984062949a2bbff">The database name consists of letters, digits, and underscores (_).</p>
</td>
</tr>
<tr id="mrs_01_0388__r206baa4d82ae4b108f3f2e12de72a8cf"><td class="cellrowborder" valign="top" width="29.32%" headers="mcps1.3.3.2.3.1.1 "><p id="mrs_01_0388__en-us_topic_0056202763_p612995833313">productNumber</p>
<p id="mrs_01_0388__a823644c53de64c739f9dbff45238aa71">productName</p>
<p id="mrs_01_0388__a9483e634d8f54ed9a41d4cf044107de5">storeCity</p>
<p id="mrs_01_0388__af317714a9b9847e8b9dd0fa057343ae2">storeProvince</p>
<p id="mrs_01_0388__a805ac63767704dd5ad31d945e5742c83">revenue</p>
</td>
<td class="cellrowborder" valign="top" width="70.67999999999999%" headers="mcps1.3.3.2.3.1.2 "><p id="mrs_01_0388__aa9525d1ef0334c7bb5eaee1a21a66f8c">Columns in the table. The columns are service entities for data analysis.</p>
<p id="mrs_01_0388__ad05b218e43fd4776a001a63d6f1ffd5c">The column name (field name) consists of letters, digits, and underscores (_).</p>
<div class="note" id="mrs_01_0388__note532314251324"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="mrs_01_0388__p6811324103">CarbonData does not support NOT NULL constraints or default values for columns, or primary keys for tables.</p>
</div></div>
</td>
</tr>
<tr id="mrs_01_0388__r2ae65d69ffd547ce99cf800438a5d65b"><td class="cellrowborder" valign="top" width="29.32%" headers="mcps1.3.3.2.3.1.1 "><p id="mrs_01_0388__en-us_topic_0056202763_p602436163756">table_blocksize</p>
</td>
<td class="cellrowborder" valign="top" width="70.67999999999999%" headers="mcps1.3.3.2.3.1.2 "><p id="mrs_01_0388__ae8a81772a89e44f09cf1bd2206179f9f">Block size of data files used by the CarbonData table. The value ranges from 1 MB to 2048 MB. The default is 1024 MB.</p>
<ul id="mrs_01_0388__u109abaceaf8c4fbaa9ffc1b98510aef2"><li id="mrs_01_0388__l9c64c8770e92447e80118222b14607c6">If <strong id="mrs_01_0388__en-us_topic_0056202763_b842352706141037">table_blocksize</strong> is too small, data loading generates a large number of small files, which may degrade HDFS performance.</li><li id="mrs_01_0388__lf68b6c65357e49a7b8427da16e616e4e">If <strong id="mrs_01_0388__en-us_topic_0056202763_b1013963893141356">table_blocksize</strong> is too large, each query must read a large volume of data from a block and read concurrency is low, so query performance deteriorates.</li></ul>
<p id="mrs_01_0388__adc72ed51abe64a059e4e87a66c353b31">You are advised to set the block size based on the data volume. For example, set the block size to 256 MB for GB-level data, 512 MB for TB-level data, and 1024 MB for PB-level data.</p>
</td>
</tr>
<tr id="mrs_01_0388__r4b77ff362bed41d3a0967e7e3dc05a52"><td class="cellrowborder" valign="top" width="29.32%" headers="mcps1.3.3.2.3.1.1 "><p id="mrs_01_0388__a2134a14aaab24cfe8440d5f758cf2f4c">DICTIONARY_EXCLUDE</p>
</td>
<td class="cellrowborder" valign="top" width="70.67999999999999%" headers="mcps1.3.3.2.3.1.2 "><p id="mrs_01_0388__a5e449955b1fe48b0ae209a88dd5be1ae">Specifies the columns for which no dictionary is generated. This parameter is optional and applies to columns of high complexity, that is, columns with many unique values. By default, the system generates dictionaries for columns of the String type. However, as the number of values in a dictionary grows, dictionary conversion operations increase and system performance deteriorates.</p>
<p id="mrs_01_0388__a888378f915f54cb2af4b201482ff6daf">Generally, a column with more than 50,000 unique values is considered highly complex, and dictionary generation must be disabled for it.</p>
<div class="note" id="mrs_01_0388__nc292433cf7994dcbbe4c010fa31c2bc2"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="mrs_01_0388__a75f7557253fd44cabdb3ae917734c5a9">Non-dictionary columns support only the String and Timestamp data types.</p>
</div></div>
</td>
</tr>
<tr id="mrs_01_0388__r63bef83b23d94048b9c8d11ca8f14929"><td class="cellrowborder" valign="top" width="29.32%" headers="mcps1.3.3.2.3.1.1 "><p id="mrs_01_0388__a299200c13660457ea4ed18fcfb6ca7a1">DICTIONARY_INCLUDE</p>
</td>
<td class="cellrowborder" valign="top" width="70.67999999999999%" headers="mcps1.3.3.2.3.1.2 "><p id="mrs_01_0388__a0d526d7d713a44da91a9127b169fd53e">Specifies the columns for which dictionaries are generated. This parameter is optional and applies to columns of low complexity. It improves the performance of queries that contain a <strong id="mrs_01_0388__a882237625b93435daa8b89f8e6726fb5">GROUP BY</strong> clause. Generally, the number of unique values in a dictionary column cannot exceed 50,000.</p>
</td>
</tr>
</tbody>
</table>
</div>
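<p>The block size guidance in Table 1 can be expressed as a small helper. The following Python sketch is illustrative only: the function names and the byte thresholds that separate GB-level, TB-level, and PB-level data are assumptions, and it treats any volume below 1 TB as GB-level. It is not part of CarbonData.</p>

```python
# Sketch only: maps an estimated data volume to a table_blocksize value (in MB),
# following the sizing guidance in Table 1. The function names and the exact
# byte thresholds are assumptions, not part of CarbonData.

MIN_BLOCKSIZE_MB = 1      # lower bound stated in Table 1
MAX_BLOCKSIZE_MB = 2048   # upper bound stated in Table 1

TB = 1024 ** 4
PB = 1024 ** 5


def suggest_table_blocksize(data_bytes: int) -> int:
    """Return a suggested table_blocksize in MB for the given data volume."""
    if data_bytes < 0:
        raise ValueError("data_bytes must not be negative")
    if data_bytes >= PB:
        size = 1024   # PB-level data
    elif data_bytes >= TB:
        size = 512    # TB-level data
    else:
        size = 256    # GB-level data (assumption: also used for anything smaller)
    # Guard against edits that move a suggestion outside the documented range.
    assert MIN_BLOCKSIZE_MB <= size <= MAX_BLOCKSIZE_MB
    return size


def blocksize_property(data_bytes: int) -> str:
    """Render the suggestion as a TBLPROPERTIES entry."""
    return f"'table_blocksize'='{suggest_table_blocksize(data_bytes)}'"


print(blocksize_property(300 * 1024 ** 3))  # about 300 GB of data
```

<p>The returned string can be pasted into the <strong>TBLPROPERTIES</strong> clause of a <strong>CREATE TABLE</strong> statement. Treat the result as a starting point and adjust it to the actual workload.</p>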
</div>
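<p>The 50,000-value rule for <strong>DICTIONARY_EXCLUDE</strong> and <strong>DICTIONARY_INCLUDE</strong> can be sketched in the same way. The Python helper below is hypothetical and the distinct-value counts are made-up sample figures. It assumes that you already know the approximate number of unique values per column and that multiple column names are written as a comma-separated list.</p>

```python
# Sketch only: derives DICTIONARY_EXCLUDE / DICTIONARY_INCLUDE entries from
# per-column distinct-value counts. All names here are hypothetical helpers.

DICTIONARY_LIMIT = 50_000  # threshold given in the parameter description
NON_DICTIONARY_TYPES = {"string", "timestamp"}  # types allowed as non-dictionary columns


def dictionary_properties(columns, include_candidates=()):
    """columns: iterable of (name, data_type, distinct_count) tuples.
    include_candidates: names you would like to be dictionary-encoded.
    Returns a list of TBLPROPERTIES entries as strings."""
    exclude, include = [], []
    for name, data_type, distinct_count in columns:
        high = distinct_count > DICTIONARY_LIMIT
        if high and data_type.lower() in NON_DICTIONARY_TYPES:
            # Highly complex String/Timestamp column: disable its dictionary.
            exclude.append(name)
        elif not high and name in include_candidates:
            # Requested column that stays within the limit: keep it.
            include.append(name)
    props = []
    if exclude:
        props.append(f"'DICTIONARY_EXCLUDE'='{','.join(exclude)}'")
    if include:
        props.append(f"'DICTIONARY_INCLUDE'='{','.join(include)}'")
    return props


stats = [
    ("productNumber", "Int", 1_200),       # distinct counts are made-up examples
    ("productName", "String", 80_000),
    ("storeCity", "String", 300),
    ("storeProvince", "String", 30),
    ("revenue", "Int", 900_000),
]
for entry in dictionary_properties(stats, include_candidates={"productNumber"}):
    print(entry)
```

<p>With these sample figures, the helper excludes <strong>productName</strong> and includes <strong>productNumber</strong>, which matches the sample command. Requested columns that exceed the limit are silently dropped from the include list; a stricter variant could raise an error instead.</p>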
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_0385.html">Using CarbonData (for Versions Earlier Than MRS 3.x)</a></div>
</div>
</div>