doc-exports/docs/dws/dev/dws_04_0478.html
Lu, Huayi a24ca60074 DWS DEVELOPER 811 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Lu, Huayi <luhuayi@huawei.com>
Co-committed-by: Lu, Huayi <luhuayi@huawei.com>
2023-01-19 13:37:49 +00:00

99 lines
22 KiB
HTML

<a name="EN-US_TOPIC_0000001145694687"></a><a name="EN-US_TOPIC_0000001145694687"></a>
<h1 class="topictitle1">Case: Pushing Down Sort Operations to DNs</h1>
<div id="body8662426"><div class="section" id="EN-US_TOPIC_0000001145694687__s80c1ef7ae501441f8892d3f573764ab7"><h4 class="sectiontitle">Symptom</h4><p id="EN-US_TOPIC_0000001145694687__a339e0e2ca8204993a303481fb10309bf">In an execution plan, more than 95% of the execution time is spent on <strong id="EN-US_TOPIC_0000001145694687__b842352706101324">window agg</strong> performed on the CN. In this case, <strong id="EN-US_TOPIC_0000001145694687__b33850014101445">sum</strong> is performed for the two columns separately, and then another <strong id="EN-US_TOPIC_0000001145694687__b765812707101445">sum</strong> is performed for the separate sum results of the two columns. After this, trunc and sorting are performed in sequence.</p>
<p id="EN-US_TOPIC_0000001145694687__a8c03cf62a5f54463b0ca2609310a317c">The table structure is as follows:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001145694687__s7e81264a6fb642a49658b0e6eaac1be0"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="k">public</span><span class="p">.</span><span class="n">test</span><span class="p">(</span><span class="n">imsi</span><span class="w"> </span><span class="nb">int</span><span class="p">,</span><span class="n">L4_DW_THROUGHPUT</span><span class="w"> </span><span class="nb">int</span><span class="p">,</span><span class="n">L4_UL_THROUGHPUT</span><span class="w"> </span><span class="nb">int</span><span class="p">)</span><span class="w"></span>
<span class="k">with</span><span class="w"> </span><span class="p">(</span><span class="n">orientation</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">column</span><span class="p">)</span><span class="w"> </span><span class="n">DISTRIBUTE</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">hash</span><span class="p">(</span><span class="n">imsi</span><span class="p">);</span><span class="w"></span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000001145694687__a97d31663007c48cfb6535fcb7fd7a405">The query statements are as follows:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001145694687__s4dd10d216b114b499b7530c10aff20f4"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span>
<span class="normal">6</span>
<span class="normal">7</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">SELECT</span><span class="w"> </span><span class="k">COUNT</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span><span class="w"> </span><span class="n">over</span><span class="p">()</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">DATACNT</span><span class="p">,</span><span class="w"></span>
<span class="n">IMSI</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">IMSI_IMSI</span><span class="p">,</span><span class="w"></span>
<span class="k">CAST</span><span class="p">(</span><span class="n">TRUNC</span><span class="p">(((</span><span class="k">SUM</span><span class="p">(</span><span class="n">L4_UL_THROUGHPUT</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="k">SUM</span><span class="p">(</span><span class="n">L4_DW_THROUGHPUT</span><span class="p">))),</span><span class="w"> </span><span class="mi">0</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"></span>
<span class="nb">DECIMAL</span><span class="p">(</span><span class="mi">20</span><span class="p">))</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">TOTAL_VOLOME_KPIID</span><span class="w"></span>
<span class="k">FROM</span><span class="w"> </span><span class="k">public</span><span class="p">.</span><span class="n">test</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">test</span><span class="w"></span>
<span class="k">GROUP</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">IMSI</span><span class="w"></span>
<span class="k">order</span><span class="w"> </span><span class="k">by</span><span class="w"> </span><span class="n">TOTAL_VOLOME_KPIID</span><span class="w"> </span><span class="k">DESC</span><span class="p">;</span><span class="w"></span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000001145694687__abbdd8079b6234a5e8532d40af8bd4ca6">The execution plan is as follows:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001145694687__sfccb5d66d9a143c8a04ed788f7a1d87d"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span>
<span class="normal">6</span>
<span class="normal">7</span>
<span class="normal">8</span>
<span class="normal">9</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">Row</span><span class="w"> </span><span class="n">Adapter</span><span class="w"> </span><span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">10</span><span class="p">.</span><span class="mi">70</span><span class="p">..</span><span class="mi">10</span><span class="p">.</span><span class="mi">70</span><span class="w"> </span><span class="k">rows</span><span class="o">=</span><span class="mi">10</span><span class="w"> </span><span class="n">width</span><span class="o">=</span><span class="mi">12</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="n">Vector</span><span class="w"> </span><span class="n">Sort</span><span class="w"> </span><span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">10</span><span class="p">.</span><span class="mi">68</span><span class="p">..</span><span class="mi">10</span><span class="p">.</span><span class="mi">70</span><span class="w"> </span><span class="k">rows</span><span class="o">=</span><span class="mi">10</span><span class="w"> </span><span class="n">width</span><span class="o">=</span><span class="mi">12</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="n">Sort</span><span class="w"> </span><span class="k">Key</span><span class="p">:</span><span class="w"> </span><span class="p">((</span><span class="n">trunc</span><span class="p">((((</span><span class="k">sum</span><span class="p">(</span><span class="n">l4_ul_throughput</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="p">(</span><span class="k">sum</span><span class="p">(</span><span class="n">l4_dw_throughput</span><span class="p">))))::</span><span class="nb">numeric</span><span class="p">,</span><span class="w"> </span><span class="mi">0</span><span class="p">))::</span><span class="nb">numeric</span><span class="p">(</span><span class="mi">20</span><span class="p">,</span><span class="mi">0</span><span class="p">))</span><span class="w"></span>
<span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="n">Vector</span><span class="w"> </span><span class="n">WindowAgg</span><span class="w"> </span><span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">10</span><span class="p">.</span><span class="mi">09</span><span class="p">..</span><span class="mi">10</span><span class="p">.</span><span class="mi">51</span><span class="w"> </span><span class="k">rows</span><span class="o">=</span><span class="mi">10</span><span class="w"> </span><span class="n">width</span><span class="o">=</span><span class="mi">12</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="n">Vector</span><span class="w"> </span><span class="n">Streaming</span><span class="w"> </span><span class="p">(</span><span class="k">type</span><span class="p">:</span><span class="w"> </span><span class="n">GATHER</span><span class="p">)</span><span class="w"> </span><span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">242</span><span class="p">.</span><span class="mi">04</span><span class="p">..</span><span class="mi">246</span><span class="p">.</span><span class="mi">84</span><span class="w"> </span><span class="k">rows</span><span class="o">=</span><span class="mi">240</span><span class="w"> </span><span class="n">width</span><span class="o">=</span><span class="mi">12</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="n">Node</span><span class="o">/</span><span class="n">s</span><span class="p">:</span><span class="w"> </span><span class="k">All</span><span class="w"> </span><span class="n">datanodes</span><span class="w"></span>
<span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="n">Vector</span><span class="w"> </span><span class="n">Hash</span><span class="w"> </span><span class="k">Aggregate</span><span class="w"> </span><span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">10</span><span class="p">.</span><span class="mi">09</span><span class="p">..</span><span class="mi">10</span><span class="p">.</span><span class="mi">29</span><span class="w"> </span><span class="k">rows</span><span class="o">=</span><span class="mi">10</span><span class="w"> </span><span class="n">width</span><span class="o">=</span><span class="mi">12</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="k">Group</span><span class="w"> </span><span class="k">By</span><span class="w"> </span><span class="k">Key</span><span class="p">:</span><span class="w"> </span><span class="n">imsi</span><span class="w"></span>
<span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="n">CStore</span><span class="w"> </span><span class="n">Scan</span><span class="w"> </span><span class="k">on</span><span class="w"> </span><span class="n">test</span><span class="w"> </span><span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">10</span><span class="p">.</span><span class="mi">01</span><span class="w"> </span><span class="k">rows</span><span class="o">=</span><span class="mi">10</span><span class="w"> </span><span class="n">width</span><span class="o">=</span><span class="mi">12</span><span class="p">)</span><span class="w"></span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000001145694687__aa86526f07f2c4db593c6c00dd0cc333c">As we can see, both <strong id="EN-US_TOPIC_0000001145694687__b842352706102813">window agg</strong> and <strong id="EN-US_TOPIC_0000001145694687__b842352706102817">sort</strong> are performed on the CN, which is time consuming.</p>
</div>
<div class="section" id="EN-US_TOPIC_0000001145694687__s07a0d4dc14b84659a1e0a2dc74bdda35"><h4 class="sectiontitle">Optimization Analysis</h4><p id="EN-US_TOPIC_0000001145694687__a344b553685034bd9b4e2e461ab82d888">Modify the statement to a subquery statement, as shown below:</p>
</div>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001145694687__s8d87a9b6823a4e12a4f7c3a93f40bec9"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span>
<span class="normal">6</span>
<span class="normal">7</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">SELECT</span><span class="w"> </span><span class="k">COUNT</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span><span class="w"> </span><span class="n">over</span><span class="p">()</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">DATACNT</span><span class="p">,</span><span class="w"> </span><span class="n">IMSI_IMSI</span><span class="p">,</span><span class="w"> </span><span class="n">TOTAL_VOLOME_KPIID</span><span class="w"></span>
<span class="k">FROM</span><span class="w"> </span><span class="p">(</span><span class="k">SELECT</span><span class="w"> </span><span class="n">IMSI</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">IMSI_IMSI</span><span class="p">,</span><span class="w"></span>
<span class="k">CAST</span><span class="p">(</span><span class="n">TRUNC</span><span class="p">(((</span><span class="k">SUM</span><span class="p">(</span><span class="n">L4_UL_THROUGHPUT</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="k">SUM</span><span class="p">(</span><span class="n">L4_DW_THROUGHPUT</span><span class="p">))),</span><span class="w"></span>
<span class="mi">0</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="nb">DECIMAL</span><span class="p">(</span><span class="mi">20</span><span class="p">))</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">TOTAL_VOLOME_KPIID</span><span class="w"></span>
<span class="k">FROM</span><span class="w"> </span><span class="k">public</span><span class="p">.</span><span class="n">test</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">test</span><span class="w"></span>
<span class="k">GROUP</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">IMSI</span><span class="w"></span>
<span class="k">ORDER</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">TOTAL_VOLOME_KPIID</span><span class="w"> </span><span class="k">DESC</span><span class="p">);</span><span class="w"></span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000001145694687__afedc67c060a24b86beb53c888a643fa5">Perform <strong id="EN-US_TOPIC_0000001145694687__b842352706103023">sum</strong> on the <strong id="EN-US_TOPIC_0000001145694687__b842352706103028">trunc</strong> results of the two columns, take it as a subquery, and then perform <strong id="EN-US_TOPIC_0000001145694687__b84235270610316">window agg</strong> for the subquery to push down the sorting operation to DNs, as shown below:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001145694687__se57e402701f647b29055868016e60e52"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span>
<span class="normal">6</span>
<span class="normal">7</span>
<span class="normal">8</span>
<span class="normal">9</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">Row</span><span class="w"> </span><span class="n">Adapter</span><span class="w"> </span><span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">10</span><span class="p">.</span><span class="mi">70</span><span class="p">..</span><span class="mi">10</span><span class="p">.</span><span class="mi">70</span><span class="w"> </span><span class="k">rows</span><span class="o">=</span><span class="mi">10</span><span class="w"> </span><span class="n">width</span><span class="o">=</span><span class="mi">24</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="n">Vector</span><span class="w"> </span><span class="n">WindowAgg</span><span class="w"> </span><span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">10</span><span class="p">.</span><span class="mi">45</span><span class="p">..</span><span class="mi">10</span><span class="p">.</span><span class="mi">70</span><span class="w"> </span><span class="k">rows</span><span class="o">=</span><span class="mi">10</span><span class="w"> </span><span class="n">width</span><span class="o">=</span><span class="mi">24</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="n">Vector</span><span class="w"> </span><span class="n">Streaming</span><span class="w"> </span><span class="p">(</span><span class="k">type</span><span class="p">:</span><span class="w"> </span><span class="n">GATHER</span><span class="p">)</span><span class="w"> </span><span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">250</span><span class="p">.</span><span class="mi">83</span><span class="p">..</span><span class="mi">253</span><span class="p">.</span><span class="mi">83</span><span class="w"> </span><span class="k">rows</span><span class="o">=</span><span class="mi">240</span><span class="w"> </span><span class="n">width</span><span class="o">=</span><span class="mi">24</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="n">Node</span><span class="o">/</span><span class="n">s</span><span class="p">:</span><span class="w"> </span><span class="k">All</span><span class="w"> </span><span class="n">datanodes</span><span class="w"></span>
<span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="n">Vector</span><span class="w"> </span><span class="n">Sort</span><span class="w"> </span><span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">10</span><span class="p">.</span><span class="mi">45</span><span class="p">..</span><span class="mi">10</span><span class="p">.</span><span class="mi">48</span><span class="w"> </span><span class="k">rows</span><span class="o">=</span><span class="mi">10</span><span class="w"> </span><span class="n">width</span><span class="o">=</span><span class="mi">12</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="n">Sort</span><span class="w"> </span><span class="k">Key</span><span class="p">:</span><span class="w"> </span><span class="p">((</span><span class="n">trunc</span><span class="p">(((</span><span class="k">sum</span><span class="p">(</span><span class="n">test</span><span class="p">.</span><span class="n">l4_ul_throughput</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="k">sum</span><span class="p">(</span><span class="n">test</span><span class="p">.</span><span class="n">l4_dw_throughput</span><span class="p">)))::</span><span class="nb">numeric</span><span class="p">,</span><span class="w"> </span><span class="mi">0</span><span class="p">))::</span><span class="nb">numeric</span><span class="p">(</span><span class="mi">20</span><span class="p">,</span><span class="mi">0</span><span class="p">))</span><span class="w"></span>
<span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="n">Vector</span><span class="w"> </span><span class="n">Hash</span><span class="w"> </span><span class="k">Aggregate</span><span class="w"> </span><span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">10</span><span class="p">.</span><span class="mi">09</span><span class="p">..</span><span class="mi">10</span><span class="p">.</span><span class="mi">29</span><span class="w"> </span><span class="k">rows</span><span class="o">=</span><span class="mi">10</span><span class="w"> </span><span class="n">width</span><span class="o">=</span><span class="mi">12</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="k">Group</span><span class="w"> </span><span class="k">By</span><span class="w"> </span><span class="k">Key</span><span class="p">:</span><span class="w"> </span><span class="n">test</span><span class="p">.</span><span class="n">imsi</span><span class="w"></span>
<span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="n">CStore</span><span class="w"> </span><span class="n">Scan</span><span class="w"> </span><span class="k">on</span><span class="w"> </span><span class="n">test</span><span class="w"> </span><span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">10</span><span class="p">.</span><span class="mi">01</span><span class="w"> </span><span class="k">rows</span><span class="o">=</span><span class="mi">10</span><span class="w"> </span><span class="n">width</span><span class="o">=</span><span class="mi">12</span><span class="p">)</span><span class="w"></span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000001145694687__ae95329c4cedc47a3a4aefe3e12bb3af6">The optimized SQL statement greatly improves the performance by reducing the execution time from 120s to 7s.</p>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dws_04_0474.html">Optimization Cases</a></div>
</div>
</div>