Yang, Tong 6182f91ba8 MRS component operation guide_normal 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-09 14:55:21 +00:00


<a name="mrs_01_1953"></a>
<h1 class="topictitle1">Configuring the Compression Format of a Parquet Table</h1>
<div id="body1595920207018"><div class="section" id="mrs_01_1953__sb9782cc8df7b402c9c759768ef4ccc88"><h4 class="sectiontitle">Scenarios</h4><p id="mrs_01_1953__afb3a6dfe9c644ef1b26f3913e7c8619f">The compression format of a Parquet table can be configured as follows:</p>
<ol id="mrs_01_1953__o1b356e1387a74739a9cd6a25b9766001"><li id="mrs_01_1953__la5877b53a9804c439904704ab592fa73">If the Parquet table is partitioned, set the <span class="parmname" id="mrs_01_1953__p6d3ac3fe45504dea96e22e6aa9529f75"><b>parquet.compression</b></span> property of the table to specify the compression format. For example, in the <strong id="mrs_01_1953__b201292206811212">tblproperties</strong> clause of the table creation statement, specify <strong id="mrs_01_1953__b54082256011212">"parquet.compression"="snappy"</strong>.</li><li id="mrs_01_1953__l9b629a140c4f4a42ad40718465a28e91">If the Parquet table is non-partitioned, set the <span class="parmname" id="mrs_01_1953__p1f6f41ef7b524767801a60261ca76153"><b>spark.sql.parquet.compression.codec</b></span> parameter to specify the compression format. Setting the <span class="parmname" id="mrs_01_1953__parmname82385254911212"><b>parquet.compression</b></span> parameter directly does not take effect in this case, because the value of the <span class="parmname" id="mrs_01_1953__parmname134005212311212"><b>spark.sql.parquet.compression.codec</b></span> parameter is used as the value of <strong id="mrs_01_1953__b139246753511212">parquet.compression</strong>. If the <strong id="mrs_01_1953__b111998973111212">spark.sql.parquet.compression.codec</strong> parameter is not configured, its default value <span class="parmvalue" id="mrs_01_1953__parmvalue9954191911212"><b>snappy</b></span> is used as the value of <strong id="mrs_01_1953__b186989484911212">parquet.compression</strong>.</li></ol>
<p id="mrs_01_1953__a9dbe62d94ea740c096b13d8be9240af2">Therefore, the <span class="parmname" id="mrs_01_1953__p9cecf0385c21404d9b0e8bf2efcd470d"><b>spark.sql.parquet.compression.codec</b></span> parameter can only be used to set the compression format of a non-partitioned Parquet table.</p>
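<p>As a minimal sketch of the two cases described in this section, the following Spark SQL statements create one table of each kind. The table and column names (<strong>sales_part</strong>, <strong>sales_flat</strong>, <strong>id</strong>, <strong>amount</strong>, and <strong>dt</strong>) are hypothetical and serve only as an illustration.</p>
<pre class="screen">-- Partitioned Parquet table (hypothetical names): the compression format
-- is specified through the parquet.compression table property.
CREATE TABLE sales_part (id INT, amount DOUBLE)
PARTITIONED BY (dt STRING)
STORED AS PARQUET
TBLPROPERTIES ("parquet.compression"="snappy");

-- Non-partitioned Parquet table (hypothetical names): no table property is set.
-- The compression format comes from spark.sql.parquet.compression.codec,
-- which defaults to snappy if it is not configured.
CREATE TABLE sales_flat (id INT, amount DOUBLE)
STORED AS PARQUET;</pre>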
</div>
<div class="section" id="mrs_01_1953__s9181d4d09e3b4270a0218c13af72728f"><h4 class="sectiontitle">Configuration parameters</h4><p id="mrs_01_1953__a4c165bd6f2ea4ad6bb64b2256802c372"><strong id="mrs_01_1953__ac80841aa393143599ce5b693f4435658">Navigation path for setting parameters:</strong></p>
<p id="mrs_01_1953__a8031164794c248e7962244b6528824c9">On Manager, choose <span class="menucascade" id="mrs_01_1953__m3d3977285eb444988432992fc2945ff2"><b><span class="uicontrol" id="mrs_01_1953__u288e6a49d8364f9d9ab14c37aa655e66"><span id="mrs_01_1953__text65762040388">Cluster &gt; <em id="mrs_01_1953__i126851111165912">Name of the desired cluster</em> &gt; </span>Service</span></b> &gt; <b><span class="uicontrol" id="mrs_01_1953__uda8571fbf04841f2ad30844eeada5cfb">Spark2x</span></b> &gt; <b><span class="uicontrol" id="mrs_01_1953__uadafc3bd57aa41d7a025861393b654da">Configuration</span></b></span>. Click <span class="parmvalue" id="mrs_01_1953__p941f9b7cca554f02823c528997910dc4"><b>All Configurations</b></span> and enter a parameter name in the search box.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1953__t612728f379e14db88b1203cf2066ba20" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1953__rfc5e84f934f24c599a07a84ed8ae88c6"><th align="left" class="cellrowborder" valign="top" width="36.4%" id="mcps1.3.2.4.2.4.1.1"><p id="mrs_01_1953__aace5cb0e28914f7ea4c101a7ea63a866"><strong id="mrs_01_1953__abccbd715026649ccb7d6f3c79bba4b85">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50.88%" id="mcps1.3.2.4.2.4.1.2"><p id="mrs_01_1953__ab770a9ae78694e3a8970b91c4a94f7f1"><strong id="mrs_01_1953__a5829ef0179f14b0a820e3eea4a56126a">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.72%" id="mcps1.3.2.4.2.4.1.3"><p id="mrs_01_1953__ab58b7f73935546bf918dd1f229e707e4"><strong id="mrs_01_1953__a8d19df4f62264e29ab673e34367a690e">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1953__r292a6765142a48a2828b6cbfc00e96fb"><td class="cellrowborder" valign="top" width="36.4%" headers="mcps1.3.2.4.2.4.1.1 "><p id="mrs_01_1953__adf145bf516054f9f89448d1696e31fc5">spark.sql.parquet.compression.codec</p>
</td>
<td class="cellrowborder" valign="top" width="50.88%" headers="mcps1.3.2.4.2.4.1.2 "><p id="mrs_01_1953__a31992e15fec7445dab478549baddcf93">Used to set the compression format of a non-partitioned Parquet table.</p>
</td>
<td class="cellrowborder" valign="top" width="12.72%" headers="mcps1.3.2.4.2.4.1.3 "><p id="mrs_01_1953__a8c47e5a26c924d58adcbcf9475b2de6a">snappy</p>
</td>
</tr>
</tbody>
</table>
</div>
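<p>In addition to the Manager path, the parameter can typically be inspected or overridden within a single Spark SQL session. The following sketch assumes that session-level <strong>SET</strong> statements are permitted in your cluster; <strong>gzip</strong> is only an example value.</p>
<pre class="screen">-- Display the value currently in effect for this session.
SET spark.sql.parquet.compression.codec;

-- Override the compression format for this session only (example value).
SET spark.sql.parquet.compression.codec=gzip;</pre>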
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1941.html">Scenario-Specific Configuration</a></div>
</div>
</div>