
<a name="mrs_01_1996"></a>

<h1 class="topictitle1">Optimizing the Aggregate Algorithms</h1>
<div id="body1595920218806"><div class="section" id="mrs_01_1996__s90352889825547da8a4e4619c098d461"><h4 class="sectiontitle">Scenario</h4><p id="mrs_01_1996__a64aa6614e4274e24b2e2de753b516b8d">Spark SQL supports a hash aggregate algorithm that uses a fast aggregate hashmap as a cache to improve aggregation performance. The hashmap replaces the previous ColumnarBatch-based cache, which avoids the performance problems that occur when the aggregate table is wide (that is, when it has multiple key or value fields).</p>
</div>
<div class="section" id="mrs_01_1996__sb56d9be516f9454780cead1a6fe61876"><h4 class="sectiontitle">Procedure</h4><p id="mrs_01_1996__a307bcceb4f4c4049b2c00fc0e369deb2">To enable aggregate algorithm optimization, configure the following parameter in the <strong id="mrs_01_1996__b4454848115510">spark-defaults.conf</strong> file on the Spark client.</p>

<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1996__t01efc2057d83465ba6100892e94d9bd7" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1996__r72f49c3995d44cffbbc88f8028342237"><th align="left" class="cellrowborder" valign="top" width="36.4%" id="mcps1.3.2.3.2.4.1.1"><p id="mrs_01_1996__a9f4d7e009ce44fefbdc8cf0e603a61b0">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50.88%" id="mcps1.3.2.3.2.4.1.2"><p id="mrs_01_1996__a51c53fc1b66e4ecb940b98ddaed5da72">Description</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.72%" id="mcps1.3.2.3.2.4.1.3"><p id="mrs_01_1996__a0e559043d9a34e27b3341effcb9c8bee">Default Value</p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1996__r7eb2288b794e400589837b2b2e6e9605"><td class="cellrowborder" valign="top" width="36.4%" headers="mcps1.3.2.3.2.4.1.1 "><p id="mrs_01_1996__adb9bb829b4a947019f57d7bea966e802">spark.sql.codegen.aggregate.map.twolevel.enabled</p>
</td>
<td class="cellrowborder" valign="top" width="50.88%" headers="mcps1.3.2.3.2.4.1.2 "><p id="mrs_01_1996__ab5f729b79fba4004995abc5788b1177d">Specifies whether to enable aggregation algorithm optimization.</p>
<ul id="mrs_01_1996__u5dd0451879ee4821a62b556eef97f918"><li id="mrs_01_1996__la7ce44aa8b6b4d5b9f5bf4d9c3df8a85"><strong id="mrs_01_1996__b8320145019568">true</strong>: The optimization is enabled.</li><li id="mrs_01_1996__l294824ef9ee24202874d08f898b3f7c7"><strong id="mrs_01_1996__b616652105613">false</strong>: The optimization is disabled.</li></ul>
</td>
<td class="cellrowborder" valign="top" width="12.72%" headers="mcps1.3.2.3.2.4.1.3 "><p id="mrs_01_1996__a11d99653fe594768bdb9872dd122ca03">true</p>
</td>
</tr>
</tbody>
</table>
</div>
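<p>As a minimal sketch, the corresponding entry in <strong>spark-defaults.conf</strong> might look like the following line, assuming the usual whitespace-separated key and value format of that file. The value shown is the default from Table 1. Set it to <strong>false</strong> to disable the optimization.</p>
<pre>spark.sql.codegen.aggregate.map.twolevel.enabled true</pre>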
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1985.html">Spark SQL and DataFrame Tuning</a></div>
</div>
</div>