Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: Su, Xiaomeng <suxiaomeng1@huawei.com> Co-committed-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
<a name="dli_spark_lag"></a>
<h1 class="topictitle1">lag</h1>
<div id="body8662426"><p id="dli_spark_lag__en-us_topic_0000001655381194_p18471163815215">This function returns the value from the <em id="dli_spark_lag__en-us_topic_0000001655381194_i36111714455">n</em>th row before the current row within a specified window.</p>
<div class="section" id="dli_spark_lag__en-us_topic_0000001655381194_section11889111920500"><h4 class="sectiontitle">Restrictions</h4><p id="dli_spark_lag__en-us_topic_0000001655381194_p12992104012188">The restrictions on using window functions are as follows:</p>
<ul id="dli_spark_lag__en-us_topic_0000001655381194_ul999316407188"><li id="dli_spark_lag__en-us_topic_0000001655381194_li17993174071811">Window functions can be used only in SELECT statements.</li><li id="dli_spark_lag__en-us_topic_0000001655381194_li17993340171819">Window functions and aggregate functions cannot be nested inside window functions.</li><li id="dli_spark_lag__en-us_topic_0000001655381194_li139936406189">Window functions cannot be used together with aggregate functions at the same query level.</li></ul>
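<p>As a minimal sketch of one way to stay within the last restriction, an aggregate can be computed in a subquery and the window function applied one level up. The sketch assumes the <strong>logs</strong> sample table created in the Example Code section; the aliases <strong>hr</strong>, <strong>visits</strong>, <strong>prev_visits</strong>, and <strong>t</strong> are illustrative only.</p>
<pre class="screen">-- Inner query: aggregation only (requests per cookie and hour).
-- Outer query: window function only (request count of the previous hour).
SELECT cookieid, hr, visits,
  LAG(visits) OVER (PARTITION BY cookieid ORDER BY hr) AS prev_visits
FROM (
  SELECT cookieid, substr(createtime, 1, 13) AS hr, count(*) AS visits
  FROM logs
  GROUP BY cookieid, substr(createtime, 1, 13)
) t;</pre>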
</div>
<div class="section" id="dli_spark_lag__en-us_topic_0000001655381194_section45291954203217"><h4 class="sectiontitle">Syntax</h4><pre class="screen" id="dli_spark_lag__en-us_topic_0000001655381194_screen1599344019183">lag(&lt;expr&gt;[, bigint &lt;offset&gt;[, &lt;default&gt;]]) over([partition_clause] orderby_clause)</pre>
</div>
<div class="section" id="dli_spark_lag__en-us_topic_0000001655381194_section992014913317"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="dli_spark_lag__en-us_topic_0000001655381194_table1829154762513" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameters</caption><thead align="left"><tr id="dli_spark_lag__en-us_topic_0000001655381194_row8830104792517"><th align="left" class="cellrowborder" valign="top" width="22.759999999999998%" id="mcps1.3.4.2.2.4.1.1"><p id="dli_spark_lag__en-us_topic_0000001655381194_p983074711252">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="15.920000000000002%" id="mcps1.3.4.2.2.4.1.2"><p id="dli_spark_lag__en-us_topic_0000001655381194_p6830124732517">Mandatory</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="61.31999999999999%" id="mcps1.3.4.2.2.4.1.3"><p id="dli_spark_lag__en-us_topic_0000001655381194_p08301547132513">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="dli_spark_lag__en-us_topic_0000001655381194_row15830184792511"><td class="cellrowborder" valign="top" width="22.759999999999998%" headers="mcps1.3.4.2.2.4.1.1 "><p id="dli_spark_lag__en-us_topic_0000001655381194_p683034714250">expr</p>
</td>
<td class="cellrowborder" valign="top" width="15.920000000000002%" headers="mcps1.3.4.2.2.4.1.2 "><p id="dli_spark_lag__en-us_topic_0000001655381194_p12830184752518">Yes</p>
</td>
<td class="cellrowborder" valign="top" width="61.31999999999999%" headers="mcps1.3.4.2.2.4.1.3 "><p id="dli_spark_lag__en-us_topic_0000001655381194_p13350982431">Expression whose value is returned from the row located offset rows before the current row.</p>
</td>
</tr>
<tr id="dli_spark_lag__en-us_topic_0000001655381194_row1578218624013"><td class="cellrowborder" valign="top" width="22.759999999999998%" headers="mcps1.3.4.2.2.4.1.1 "><p id="dli_spark_lag__en-us_topic_0000001655381194_p1782146134018">offset</p>
</td>
<td class="cellrowborder" valign="top" width="15.920000000000002%" headers="mcps1.3.4.2.2.4.1.2 "><p id="dli_spark_lag__en-us_topic_0000001655381194_p9782176164012">No</p>
</td>
<td class="cellrowborder" valign="top" width="61.31999999999999%" headers="mcps1.3.4.2.2.4.1.3 "><p id="dli_spark_lag__en-us_topic_0000001655381194_p177826611408">Offset. It must be a constant of the BIGINT type that is greater than or equal to 0. The value <strong id="dli_spark_lag__en-us_topic_0000001655381194_b118002081432">0</strong> indicates the current row, the value <strong id="dli_spark_lag__en-us_topic_0000001655381194_b59081101435">1</strong> indicates the previous row, and so on. The default value is <strong id="dli_spark_lag__en-us_topic_0000001655381194_b650145111433">1</strong>. If the input value is of the STRING or DOUBLE type, it is implicitly converted to the BIGINT type before calculation.</p>
</td>
</tr>
<tr id="dli_spark_lag__en-us_topic_0000001655381194_row4273191014405"><td class="cellrowborder" valign="top" width="22.759999999999998%" headers="mcps1.3.4.2.2.4.1.1 "><p id="dli_spark_lag__en-us_topic_0000001655381194_p627314107402">default</p>
</td>
<td class="cellrowborder" valign="top" width="15.920000000000002%" headers="mcps1.3.4.2.2.4.1.2 "><p id="dli_spark_lag__en-us_topic_0000001655381194_p52741410134011">No</p>
</td>
<td class="cellrowborder" valign="top" width="61.31999999999999%" headers="mcps1.3.4.2.2.4.1.3 "><p id="dli_spark_lag__en-us_topic_0000001655381194_p147209415263">Constant. The default value is <strong id="dli_spark_lag__en-us_topic_0000001655381194_b112411820104414">NULL</strong>.</p>
<p id="dli_spark_lag__en-us_topic_0000001655381194_p72742010184016">Value returned when the row specified by <strong id="dli_spark_lag__en-us_topic_0000001655381194_b17481193613446">offset</strong> falls outside the window. Its data type must be the same as that of <strong id="dli_spark_lag__en-us_topic_0000001655381194_b128441928454">expr</strong>. If <strong id="dli_spark_lag__en-us_topic_0000001655381194_b8121213104519">expr</strong> is non-constant, the evaluation is performed based on the current row.</p>
</td>
</tr>
<tr id="dli_spark_lag__en-us_topic_0000001655381194_row1160161417206"><td class="cellrowborder" valign="top" width="22.759999999999998%" headers="mcps1.3.4.2.2.4.1.1 "><p id="dli_spark_lag__en-us_topic_0000001655381194_p181421114175510">partition_clause</p>
</td>
<td class="cellrowborder" valign="top" width="15.920000000000002%" headers="mcps1.3.4.2.2.4.1.2 "><p id="dli_spark_lag__en-us_topic_0000001655381194_p1214218144555">No</p>
</td>
<td class="cellrowborder" valign="top" width="61.31999999999999%" headers="mcps1.3.4.2.2.4.1.3 "><p id="dli_spark_lag__en-us_topic_0000001655381194_p1114218146554">Partition. Rows with the same value in partition columns are considered to be in the same window.</p>
</td>
</tr>
<tr id="dli_spark_lag__en-us_topic_0000001655381194_row316019148205"><td class="cellrowborder" valign="top" width="22.759999999999998%" headers="mcps1.3.4.2.2.4.1.1 "><p id="dli_spark_lag__en-us_topic_0000001655381194_p12181324182913">orderby_clause</p>
</td>
<td class="cellrowborder" valign="top" width="15.920000000000002%" headers="mcps1.3.4.2.2.4.1.2 "><p id="dli_spark_lag__en-us_topic_0000001655381194_p91821324192918">Yes</p>
</td>
<td class="cellrowborder" valign="top" width="61.31999999999999%" headers="mcps1.3.4.2.2.4.1.3 "><p id="dli_spark_lag__en-us_topic_0000001655381194_p418272442914">Specifies how data is sorted in a window.</p>
</td>
</tr>
</tbody>
</table>
</div>
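<p>The optional parameters can be illustrated with a short sketch. It assumes the <strong>logs</strong> sample table from the Example Code section; the column aliases and the fallback string <strong>'none'</strong> are illustrative only. The first two result columns should be identical because <strong>offset</strong> defaults to <strong>1</strong>. The third should contain <strong>'none'</strong> instead of <strong>NULL</strong> for the first row of each partition.</p>
<pre class="screen">SELECT cookieid, createtime, url,
  -- offset omitted: defaults to 1 (the previous row)
  LAG(url) OVER (PARTITION BY cookieid ORDER BY createtime) AS prev_url,
  -- offset given explicitly, default omitted: NULL when no previous row exists
  LAG(url, 1) OVER (PARTITION BY cookieid ORDER BY createtime) AS prev_url_offset_1,
  -- offset and default both given: 'none' when no previous row exists
  LAG(url, 1, 'none') OVER (PARTITION BY cookieid ORDER BY createtime) AS prev_url_or_none
FROM logs;</pre>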
</div>
<div class="section" id="dli_spark_lag__en-us_topic_0000001655381194_section210162513312"><h4 class="sectiontitle">Return Values</h4><p id="dli_spark_lag__en-us_topic_0000001655381194_p184124317231">The return value has the same data type as <strong>expr</strong>.</p>
</div>
<div class="section" id="dli_spark_lag__en-us_topic_0000001655381194_section13277192233920"><h4 class="sectiontitle">Example Code</h4><p id="dli_spark_lag__en-us_topic_0000001655381194_p1599364015184"><strong id="dli_spark_lag__en-us_topic_0000001655381194_b200980796131540">Example data</strong></p>
<div class="p" id="dli_spark_lag__en-us_topic_0000001655381194_p499394020187">To illustrate how the function is used, this section provides sample source data and examples based on that data. Run the following statement to create the <strong id="dli_spark_lag__en-us_topic_0000001655381194_b11516249043164">logs</strong> table:<pre class="screen" id="dli_spark_lag__en-us_topic_0000001655381194_screen16993440101818">create table logs(
cookieid string,
createtime string,
url string
)
STORED AS parquet;</pre>
</div>
<p id="dli_spark_lag__en-us_topic_0000001655381194_p159939404187">Add the following data:</p>
<pre class="screen" id="dli_spark_lag__en-us_topic_0000001655381194_screen1599344012186">cookie1 2015-04-10 10:00:02 url2
cookie1 2015-04-10 10:00:00 url1
cookie1 2015-04-10 10:03:04 url3
cookie1 2015-04-10 10:50:05 url6
cookie1 2015-04-10 11:00:00 url7
cookie1 2015-04-10 10:10:00 url4
cookie1 2015-04-10 10:50:01 url5
cookie2 2015-04-10 10:00:02 url22
cookie2 2015-04-10 10:00:00 url11
cookie2 2015-04-10 10:03:04 url33
cookie2 2015-04-10 10:50:05 url66
cookie2 2015-04-10 11:00:00 url77
cookie2 2015-04-10 10:10:00 url44
cookie2 2015-04-10 10:50:01 url55</pre>
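<p>One possible way to add these rows is a multi-row INSERT statement. This is a sketch that assumes the engine accepts <strong>INSERT INTO ... VALUES</strong> with several rows; other loading methods work equally well.</p>
<pre class="screen">-- Loads the 14 sample rows listed above into the logs table.
INSERT INTO logs VALUES
  ('cookie1', '2015-04-10 10:00:02', 'url2'),
  ('cookie1', '2015-04-10 10:00:00', 'url1'),
  ('cookie1', '2015-04-10 10:03:04', 'url3'),
  ('cookie1', '2015-04-10 10:50:05', 'url6'),
  ('cookie1', '2015-04-10 11:00:00', 'url7'),
  ('cookie1', '2015-04-10 10:10:00', 'url4'),
  ('cookie1', '2015-04-10 10:50:01', 'url5'),
  ('cookie2', '2015-04-10 10:00:02', 'url22'),
  ('cookie2', '2015-04-10 10:00:00', 'url11'),
  ('cookie2', '2015-04-10 10:03:04', 'url33'),
  ('cookie2', '2015-04-10 10:50:05', 'url66'),
  ('cookie2', '2015-04-10 11:00:00', 'url77'),
  ('cookie2', '2015-04-10 10:10:00', 'url44'),
  ('cookie2', '2015-04-10 10:50:01', 'url55');</pre>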
<p id="dli_spark_lag__en-us_topic_0000001655381194_p699474020184">Group all records by <strong id="dli_spark_lag__en-us_topic_0000001655381194_b14373914142820">cookieid</strong>, sort them by <strong id="dli_spark_lag__en-us_topic_0000001655381194_b1779711204288">createtime</strong> in ascending order, and return the <strong>createtime</strong> value of the second row before the current row in each window. Example commands are as follows:</p>
<p id="dli_spark_lag__en-us_topic_0000001655381194_p72643226457">Example 1:</p>
<pre class="screen" id="dli_spark_lag__en-us_topic_0000001655381194_screen1443294714450">SELECT cookieid, createtime, url,
LAG(createtime, 2) OVER (PARTITION BY cookieid ORDER BY createtime) AS last_2_time
FROM logs;
-- Returned result:
cookieid createtime url last_2_time
cookie1 2015-04-10 10:00:00 url1 NULL
cookie1 2015-04-10 10:00:02 url2 NULL
cookie1 2015-04-10 10:03:04 url3 2015-04-10 10:00:00
cookie1 2015-04-10 10:10:00 url4 2015-04-10 10:00:02
cookie1 2015-04-10 10:50:01 url5 2015-04-10 10:03:04
cookie1 2015-04-10 10:50:05 url6 2015-04-10 10:10:00
cookie1 2015-04-10 11:00:00 url7 2015-04-10 10:50:01
cookie2 2015-04-10 10:00:00 url11 NULL
cookie2 2015-04-10 10:00:02 url22 NULL
cookie2 2015-04-10 10:03:04 url33 2015-04-10 10:00:00
cookie2 2015-04-10 10:10:00 url44 2015-04-10 10:00:02
cookie2 2015-04-10 10:50:01 url55 2015-04-10 10:03:04
cookie2 2015-04-10 10:50:05 url66 2015-04-10 10:10:00
cookie2 2015-04-10 11:00:00 url77 2015-04-10 10:50:01</pre>
<div class="note" id="dli_spark_lag__en-us_topic_0000001655381194_note4414045174619"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="dli_spark_lag__en-us_topic_0000001655381194_p1126410227458">Because no default value is set, <strong id="dli_spark_lag__en-us_topic_0000001655381194_b20167356162815">NULL</strong> is returned for the first two rows of each window, where the row two positions before the current row does not exist.</p>
</div></div>
<p id="dli_spark_lag__en-us_topic_0000001655381194_p1626420220454">Example 2:</p>
<pre class="screen" id="dli_spark_lag__en-us_topic_0000001655381194_screen9944523124611">SELECT cookieid, createtime, url,
LAG(createtime,1,'1970-01-01 00:00:00') OVER (PARTITION BY cookieid ORDER BY createtime) AS last_1_time
FROM logs;
-- Returned result:
cookieid createtime url last_1_time
cookie1 2015-04-10 10:00:00 url1 1970-01-01 00:00:00 (The default value is displayed.)
cookie1 2015-04-10 10:00:02 url2 2015-04-10 10:00:00
cookie1 2015-04-10 10:03:04 url3 2015-04-10 10:00:02
cookie1 2015-04-10 10:10:00 url4 2015-04-10 10:03:04
cookie1 2015-04-10 10:50:01 url5 2015-04-10 10:10:00
cookie1 2015-04-10 10:50:05 url6 2015-04-10 10:50:01
cookie1 2015-04-10 11:00:00 url7 2015-04-10 10:50:05
cookie2 2015-04-10 10:00:00 url11 1970-01-01 00:00:00 (The default value is displayed.)
cookie2 2015-04-10 10:00:02 url22 2015-04-10 10:00:00
cookie2 2015-04-10 10:03:04 url33 2015-04-10 10:00:02
cookie2 2015-04-10 10:10:00 url44 2015-04-10 10:03:04
cookie2 2015-04-10 10:50:01 url55 2015-04-10 10:10:00
cookie2 2015-04-10 10:50:05 url66 2015-04-10 10:50:01
cookie2 2015-04-10 11:00:00 url77 2015-04-10 10:50:05
</pre>
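<p>A common use of <strong>lag</strong> is to compare each row with the one before it. The following sketch computes the number of seconds since the previous request of the same cookie. It assumes the <strong>logs</strong> sample table and the built-in <strong>unix_timestamp</strong> function parsing strings with its default <strong>yyyy-MM-dd HH:mm:ss</strong> pattern; the alias <strong>seconds_since_prev</strong> is illustrative only.</p>
<pre class="screen">SELECT cookieid, createtime, url,
  -- Current timestamp minus the previous row's timestamp, both in seconds.
  -- The first row of each partition has no previous row, so the result is NULL.
  unix_timestamp(createtime)
    - unix_timestamp(LAG(createtime) OVER (PARTITION BY cookieid ORDER BY createtime)) AS seconds_since_prev
FROM logs;</pre>
<p>With the sample data, each cookie should yield NULL for its first row, followed by 2, 182, 416, 2401, 4, and 595.</p>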
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dli_08_0475.html">Window Functions</a></div>
</div>
</div>