doc-exports/docs/dli/sqlreference/dli_08_0327.html
Su, Xiaomeng 04d4597cf3 dli_sqlreference_0511_version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
Co-committed-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
2023-11-02 14:34:08 +00:00


<a name="dli_08_0327"></a>
<h1 class="topictitle1">Top-N</h1>
<div id="body8662426"><div class="section" id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_section1847317407576"><h4 class="sectiontitle">Function</h4><p id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_p1895644845520">A Top-N query returns the N smallest or N largest values, ordered by one or more columns. Both the smallest-value and the largest-value result sets count as Top-N queries. Top-N queries are useful when you need to display only the bottom N or the top N records of a batch or streaming table that meet a condition.</p>
</div>
<div class="section" id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_section1923719719576"><h4 class="sectiontitle">Syntax</h4><pre class="screen" id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_screen1177818114577">SELECT [column_list]
FROM (
SELECT [column_list],
ROW_NUMBER() OVER ([PARTITION BY col1[, col2...]]
ORDER BY col1 [asc|desc][, col2 [asc|desc]...]) AS rownum
FROM table_name)
WHERE rownum &lt;= N [AND conditions]</pre>
</div>
<div class="section" id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_section73114474281"><h4 class="sectiontitle">Description</h4><ul id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_ul459919916296"><li id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_li1159918916294">ROW_NUMBER(): Assigns a unique, consecutive number to each row, starting from the first row in the current partition. Currently, ROW_NUMBER is the only supported OVER window function. RANK() and DENSE_RANK() will be supported in the future.</li><li id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_li1559912972916">PARTITION BY col1[, col2...]: Specifies the partition columns. Each partition has its own Top-N result.</li><li id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_li1599895299">ORDER BY col1 [asc|desc][, col2 [asc|desc]...]: Specifies the ordering columns. Each column can have its own ordering direction.</li><li id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_li1759909112917">WHERE rownum &lt;= N: The rownum &lt;= N condition is required for Flink to recognize the query as a Top-N query. N is the number of smallest or largest records to retain.</li><li id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_li3599194296">[AND conditions]: You can add other conditions to the WHERE clause, but they can only be combined with rownum &lt;= N using AND.</li></ul>
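<p>As a minimal sketch of how these clauses fit together, the following query keeps, for each region, the three orders with the largest amount, breaks ties by the earliest order time, and adds one extra condition with AND. The table Orders and its columns region, product, amount, and order_time are hypothetical names used only for illustration.</p>
<pre class="screen">-- Orders, region, product, amount, and order_time are hypothetical names.
SELECT region, product, amount, order_time, rownum
FROM (
  SELECT region, product, amount, order_time,
    -- Rank rows within each region: largest amount first, earliest order first on ties.
    ROW_NUMBER() OVER (PARTITION BY region
      ORDER BY amount DESC, order_time ASC) AS rownum
  FROM Orders)
-- rownum &lt;= 3 marks this as a Top-N query; the extra condition is joined with AND.
WHERE rownum &lt;= 3 AND amount &gt;= 100;</pre>
<p>Because the extra condition sits in the outer WHERE clause, it filters rows after they have been ranked, so a partition can return fewer than three rows.</p>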
</div>
<div class="section" id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_section436063555815"><h4 class="sectiontitle">Important Notes</h4><ul id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_ul133098366212"><li id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_li330910361529">A Top-N query is result-updating.</li><li id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_li5309183616210">Flink SQL sorts the input data stream according to the order key, so if the top N records change, the changed records are sent downstream as retraction/update records.</li><li id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_li143108362211">If the top N records need to be stored in external storage, the result table should have the same unique key as the Top-N query.</li></ul>
</div>
<div class="section" id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_section8409144711587"><h4 class="sectiontitle">Example</h4><p id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_p16367113512584">The following example gets, in real time, the five products with the highest sales in each category.</p>
<pre class="screen" id="dli_08_0327__en-us_topic_0000001119072204_en-us_topic_0000001085569952_screen6367935195815">SELECT *
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) as row_num
FROM ShopSales)
WHERE row_num &lt;= 5;</pre>
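<p>Because Top-N also covers the N smallest values, a variation of the query above can be sketched by reversing the ordering direction. Assuming the same ShopSales table with its category and sales columns, the following query returns the five products with the lowest sales in each category.</p>
<pre class="screen">SELECT *
FROM (
  SELECT *,
    -- ASC ranks the smallest sales value first within each category.
    ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales ASC) AS row_num
  FROM ShopSales)
WHERE row_num &lt;= 5;</pre>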
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dli_08_0321.html">Data Manipulation Language (DML)</a></div>
</div>
</div>