Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
<a name="mrs_01_2339"></a><a name="mrs_01_2339"></a>
<h1 class="topictitle1">HetuEngine Function Plugin Development and Application</h1>
<div id="body32001227"><p id="mrs_01_2339__en-us_topic_0000001173949734_p124035816151">You can write custom functions, called user-defined functions (UDFs), to extend SQL statements for requirements that the built-in functions do not cover.</p>
<p id="mrs_01_2339__en-us_topic_0000001173949734_p08731157164015">This section describes how to develop and apply <span id="mrs_01_2339__en-us_topic_0000001173949734_text1782132315156">HetuEngine</span> function plugins.</p>
<div class="section" id="mrs_01_2339__en-us_topic_0000001173949734_section15835232192311"><h4 class="sectiontitle">Developing Function Plugins</h4><p id="mrs_01_2339__en-us_topic_0000001173949734_p1634031652410">This sample implements the two function plugins described in the following table.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_2339__en-us_topic_0000001173949734_table48921452171810" frame="border" border="1" rules="all"><caption><b>Table 1 </b><span id="mrs_01_2339__en-us_topic_0000001173949734_text1024917591187">HetuEngine</span> function plugins</caption><thead align="left"><tr id="mrs_01_2339__en-us_topic_0000001173949734_row78931252151817"><th align="left" class="cellrowborder" valign="top" width="21.922192219221923%" id="mcps1.3.3.3.2.4.1.1"><p id="mrs_01_2339__en-us_topic_0000001173949734_p1117104181911"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b470572271010">Function</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="44.74447444744475%" id="mcps1.3.3.3.2.4.1.2"><p id="mrs_01_2339__en-us_topic_0000001173949734_p1311712419194"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b1488536112418">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.3.3.2.4.1.3"><p id="mrs_01_2339__en-us_topic_0000001173949734_p11117743192"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b22751638172413">Type</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_2339__en-us_topic_0000001173949734_row4893952161815"><td class="cellrowborder" valign="top" width="21.922192219221923%" headers="mcps1.3.3.3.2.4.1.1 "><p id="mrs_01_2339__en-us_topic_0000001173949734_p14230151021915">add_two</p>
</td>
<td class="cellrowborder" valign="top" width="44.74447444744475%" headers="mcps1.3.3.3.2.4.1.2 "><p id="mrs_01_2339__en-us_topic_0000001173949734_p82301710151914">Adds <strong id="mrs_01_2339__en-us_topic_0000001173949734_b9767161955017">2</strong> to the input integer and returns the result.</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.3.3.2.4.1.3 "><p id="mrs_01_2339__en-us_topic_0000001173949734_p223071061914">ScalarFunction</p>
</td>
</tr>
<tr id="mrs_01_2339__en-us_topic_0000001173949734_row20893165218180"><td class="cellrowborder" valign="top" width="21.922192219221923%" headers="mcps1.3.3.3.2.4.1.1 "><p id="mrs_01_2339__en-us_topic_0000001173949734_p14230131071910">avg_double</p>
</td>
<td class="cellrowborder" valign="top" width="44.74447444744475%" headers="mcps1.3.3.3.2.4.1.2 "><p id="mrs_01_2339__en-us_topic_0000001173949734_p1323011081917">Aggregates a specified column and returns its average value. The column must be of the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b738616396017">double</strong> type.</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.3.3.2.4.1.3 "><p id="mrs_01_2339__en-us_topic_0000001173949734_p122301710181913">AggregationFunction</p>
</td>
</tr>
</tbody>
</table>
</div>
<ol id="mrs_01_2339__en-us_topic_0000001173949734_ol17107145732412"><li id="mrs_01_2339__en-us_topic_0000001173949734_li1910735752419"><span>Create a Maven project. Set <strong id="mrs_01_2339__en-us_topic_0000001173949734_b1575083692613">groupId</strong> to <strong id="mrs_01_2339__en-us_topic_0000001173949734_b3529143812265">com.test.udf</strong> and <strong id="mrs_01_2339__en-us_topic_0000001173949734_b13461341202620">artifactId</strong> to <strong id="mrs_01_2339__en-us_topic_0000001173949734_b116581843122613">udf-test</strong>. You can change both values to suit your site.</span></li><li id="mrs_01_2339__en-us_topic_0000001173949734_li216614715258"><span>Modify the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b116405472614">pom.xml</strong> file as follows:</span><p><pre class="screen" id="mrs_01_2339__en-us_topic_0000001173949734_screen8575112172616">&lt;project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"&gt;
    &lt;modelVersion&gt;4.0.0&lt;/modelVersion&gt;
    &lt;groupId&gt;com.test.udf&lt;/groupId&gt;
    &lt;artifactId&gt;udf-test&lt;/artifactId&gt;
    &lt;version&gt;0.0.1-SNAPSHOT&lt;/version&gt;

    &lt;packaging&gt;hetu-plugin&lt;/packaging&gt;

    &lt;dependencies&gt;
        &lt;dependency&gt;
            &lt;groupId&gt;com.google.guava&lt;/groupId&gt;
            &lt;artifactId&gt;guava&lt;/artifactId&gt;
            &lt;version&gt;26.0-jre&lt;/version&gt;
        &lt;/dependency&gt;

        &lt;dependency&gt;
            &lt;groupId&gt;io.hetu.core&lt;/groupId&gt;
            &lt;artifactId&gt;presto-spi&lt;/artifactId&gt;
            &lt;version&gt;1.2.0&lt;/version&gt;
            &lt;scope&gt;provided&lt;/scope&gt;
        &lt;/dependency&gt;

    &lt;/dependencies&gt;

    &lt;build&gt;
        &lt;plugins&gt;
            &lt;plugin&gt;
                &lt;groupId&gt;org.apache.maven.plugins&lt;/groupId&gt;
                &lt;artifactId&gt;maven-assembly-plugin&lt;/artifactId&gt;
                &lt;version&gt;2.4.1&lt;/version&gt;
                &lt;configuration&gt;
                    &lt;encoding&gt;UTF-8&lt;/encoding&gt;
                &lt;/configuration&gt;
            &lt;/plugin&gt;
            &lt;plugin&gt;
                &lt;groupId&gt;io.hetu&lt;/groupId&gt;
                &lt;artifactId&gt;presto-maven-plugin&lt;/artifactId&gt;
                &lt;version&gt;9&lt;/version&gt;
                &lt;extensions&gt;true&lt;/extensions&gt;
            &lt;/plugin&gt;
        &lt;/plugins&gt;
    &lt;/build&gt;
&lt;/project&gt;</pre>
</p></li><li id="mrs_01_2339__en-us_topic_0000001173949734_li12937184123518"><span>Create the implementation classes of the function plugins.</span><p><div class="p" id="mrs_01_2339__en-us_topic_0000001173949734_p3146151153518">1. Create the function plugin implementation class <strong id="mrs_01_2339__en-us_topic_0000001173949734_b5930624182712">com.hadoop.other.TestUDF4</strong>. The code is as follows:<pre class="screen" id="mrs_01_2339__en-us_topic_0000001173949734_screen153821947172615">public class TestUDF4 {
    @ScalarFunction("add_two")
    @SqlNullable
    @SqlType(StandardTypes.INTEGER)
    public static Long add2(@SqlNullable @SqlType(StandardTypes.INTEGER) Long i) {
        // Return NULL for NULL input instead of throwing a NullPointerException.
        return i == null ? null : i + 2;
    }
}</pre>
</div>
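<div class="p">SQL <strong>INTEGER</strong> is a 32-bit type, whereas the Java parameter above is a <strong>Long</strong>, and a parameter marked <strong>@SqlNullable</strong> can arrive as null. The following plain-Java sketch shows one way the body of <strong>add_two</strong> could guard against both null input and 32-bit overflow. The class name <strong>AddTwoSketch</strong> and the method name <strong>addTwo</strong> are hypothetical, and the HetuEngine annotations are left out so that the code runs on its own.<pre class="screen">// Plain-Java sketch of the add_two logic; names are hypothetical and no HetuEngine classes are used.
public class AddTwoSketch {
    // A boxed Long lets SQL NULL pass through as Java null.
    static Long addTwo(Long i) {
        if (i == null) {
            return null;
        }
        // Throws ArithmeticException if the value does not fit in 32 bits.
        int narrowed = Math.toIntExact(i);
        // Throws ArithmeticException if the sum overflows 32 bits.
        return (long) Math.addExact(narrowed, 2);
    }

    public static void main(String[] args) {
        System.out.println(addTwo(4L));
        System.out.println(addTwo(null));
    }
}</pre>
Running <strong>main</strong> prints <strong>6</strong> and then <strong>null</strong>. Whether an overflow should raise an error is a design choice that this sample does not specify. The sketch raises <strong>ArithmeticException</strong> as one possible behavior.</div>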
<div class="p" id="mrs_01_2339__en-us_topic_0000001173949734_p2010714491343">2. Create the function plugin implementation class <strong id="mrs_01_2339__en-us_topic_0000001173949734_b41411132112716">com.hadoop.other.AverageAggregation</strong>. The code is as follows:<pre class="screen" id="mrs_01_2339__en-us_topic_0000001173949734_screen19288053113312">@AggregationFunction("avg_double")
public class AverageAggregation
{
    // Called once per input row: count the row and add its value to the running sum.
    @InputFunction
    public static void input(
            LongAndDoubleState state,
            @SqlType(StandardTypes.DOUBLE) double value)
    {
        state.setLong(state.getLong() + 1);
        state.setDouble(state.getDouble() + value);
    }

    // Merges another partial state into this one.
    @CombineFunction
    public static void combine(
            LongAndDoubleState state,
            LongAndDoubleState otherState)
    {
        state.setLong(state.getLong() + otherState.getLong());
        state.setDouble(state.getDouble() + otherState.getDouble());
    }

    // Writes the final result: NULL if no rows were seen, otherwise sum / count.
    @OutputFunction(StandardTypes.DOUBLE)
    public static void output(LongAndDoubleState state, BlockBuilder out)
    {
        long count = state.getLong();
        if (count == 0) {
            out.appendNull();
        }
        else {
            double value = state.getDouble();
            // DOUBLE refers to DoubleType.DOUBLE (static import).
            DOUBLE.writeDouble(out, value / count);
        }
    }
}</pre>
</div>
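<div class="p">The aggregation above works in three phases: the input function folds each row into a partial state, the combine function merges partial states, and the output function produces the final value. The following plain-Java sketch imitates that flow without any HetuEngine classes. The class <strong>AvgLifecycleSketch</strong> and its nested <strong>State</strong> class are hypothetical stand-ins for <strong>AverageAggregation</strong> and its state, and the split of rows across two partial states is an assumption made only for illustration.<pre class="screen">// Plain-Java sketch of the input/combine/output lifecycle; names are hypothetical.
public class AvgLifecycleSketch {
    // Stand-in for the aggregation state: a row count and a running sum.
    static final class State {
        long count;
        double sum;
    }

    // Mirrors input(): fold one row into a partial state.
    static void input(State state, double value) {
        state.count += 1;
        state.sum += value;
    }

    // Mirrors combine(): merge another partial state into this one.
    static void combine(State state, State other) {
        state.count += other.count;
        state.sum += other.sum;
    }

    // Mirrors output(): null when no rows were seen, otherwise sum / count.
    static Double output(State state) {
        return state.count == 0 ? null : state.sum / state.count;
    }

    public static void main(String[] args) {
        State partialA = new State();  // rows assumed to be handled in one place
        State partialB = new State();  // rows assumed to be handled elsewhere
        input(partialA, 17.8);
        input(partialB, 25.0);
        combine(partialA, partialB);
        System.out.println(output(partialA));
    }
}</pre>
With the two sample prices 17.8 and 25.0, <strong>main</strong> prints <strong>21.4</strong>. Because the state keeps only a count and a sum, partial states can be merged in any order and still yield the same average.</div>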
</p></li><li id="mrs_01_2339__en-us_topic_0000001173949734_li18502125612394"><span>Create the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b16251045182714">com.hadoop.other.LongAndDoubleState</strong> interface on which <strong id="mrs_01_2339__en-us_topic_0000001173949734_b598525122718">AverageAggregation</strong> depends.</span><p><pre class="screen" id="mrs_01_2339__en-us_topic_0000001173949734_screen62681047194315">public interface LongAndDoubleState extends AccumulatorState {
    long getLong();

    void setLong(long value);

    double getDouble();

    void setDouble(double value);
}</pre>
</p></li><li id="mrs_01_2339__en-us_topic_0000001173949734_li10874171674216"><span>Create the function plugin registration class <strong id="mrs_01_2339__en-us_topic_0000001173949734_b189886587272">com.hadoop.other.RegisterFunctionTestPlugin</strong>. The code is as follows:</span><p><pre class="screen" id="mrs_01_2339__en-us_topic_0000001173949734_screen106173814413">public class RegisterFunctionTestPlugin implements Plugin {

    @Override
    public Set&lt;Class&lt;?&gt;&gt; getFunctions() {
        return ImmutableSet.&lt;Class&lt;?&gt;&gt;builder()
                .add(TestUDF4.class)
                .add(AverageAggregation.class)
                .build();
    }
}</pre>
</p></li><li id="mrs_01_2339__en-us_topic_0000001173949734_li1181875812426"><span>Package the Maven project. The <strong id="mrs_01_2339__en-us_topic_0000001173949734_b8649201212281">udf-test-0.0.1-SNAPSHOT</strong> directory is generated in the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b124658191284">target</strong> directory. The following figure shows the overall structure of the project.</span><p><p id="mrs_01_2339__en-us_topic_0000001173949734_p13239145782"><span><img id="mrs_01_2339__en-us_topic_0000001173949734_image6771650983" src="en-us_image_0000001295740088.png"></span></p>
</p></li></ol>
</div>
<div class="section" id="mrs_01_2339__en-us_topic_0000001173949734_section1129611717017"><h4 class="sectiontitle">Deploying Function Plugins</h4><p id="mrs_01_2339__en-us_topic_0000001173949734_p3289175316570">Before the deployment, ensure that:</p>
<ul id="mrs_01_2339__en-us_topic_0000001173949734_ul638885665718"><li id="mrs_01_2339__en-us_topic_0000001173949734_li059814445815">The <span id="mrs_01_2339__en-us_topic_0000001173949734_text32081445131214">HetuEngine</span> service is running properly.</li><li id="mrs_01_2339__en-us_topic_0000001173949734_li17388125614578">The HDFS and <span id="mrs_01_2339__en-us_topic_0000001173949734_text3726184820180">HetuEngine</span> clients have been installed on a cluster node, for example, in the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b934820314307">/opt/</strong><strong id="mrs_01_2339__en-us_topic_0000001173949734_b1334843163020"></strong><strong id="mrs_01_2339__en-us_topic_0000001173949734_b183481131153013">client</strong> directory.</li><li id="mrs_01_2339__en-us_topic_0000001173949734_li2248685582">A <span id="mrs_01_2339__en-us_topic_0000001173949734_text10882818115810">HetuEngine</span> user has been created. For details about how to create a user, see <a href="mrs_01_1714.html">Creating a HetuEngine User</a>.</li></ul>
<ol id="mrs_01_2339__en-us_topic_0000001173949734_ol5571198938"><li id="mrs_01_2339__en-us_topic_0000001173949734_li134514253306"><span>Upload the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b206671219216">udf-test-0.0.1-SNAPSHOT</strong> directory generated when packaging the Maven project to any directory on the node where the client is installed.</span></li><li id="mrs_01_2339__en-us_topic_0000001173949734_li557298839"><span>Upload the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b123951859441">udf-test-0.0.1-SNAPSHOT</strong> directory to HDFS.</span><p><ol type="a" id="mrs_01_2339__en-us_topic_0000001173949734_ol572705215720"><li id="mrs_01_2339__en-us_topic_0000001173949734_li6351510685">Log in to the node where the client is installed and perform security authentication.<p id="mrs_01_2339__en-us_topic_0000001173949734_p125142315265"><a name="mrs_01_2339__en-us_topic_0000001173949734_li6351510685"></a><a name="en-us_topic_0000001173949734_li6351510685"></a><strong id="mrs_01_2339__en-us_topic_0000001173949734_b1224415186298">cd /opt/client</strong></p>
<p id="mrs_01_2339__en-us_topic_0000001173949734_p1919513397238"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b1819453992315">source bigdata_env</strong></p>
<p id="mrs_01_2339__en-us_topic_0000001173949734_p318635032614"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b161471123172910">kinit </strong><em id="mrs_01_2339__en-us_topic_0000001173949734_i195221345153218">HetuEngine user</em></p>
<p id="mrs_01_2339__en-us_topic_0000001173949734_p197091523133319">Enter the password as prompted. Upon the first authentication, you must change the password.</p>
</li><li id="mrs_01_2339__en-us_topic_0000001173949734_li49877581186">Create the following path in HDFS. If the path already exists, skip this step.<p id="mrs_01_2339__en-us_topic_0000001173949734_p89975516297"><a name="mrs_01_2339__en-us_topic_0000001173949734_li49877581186"></a><a name="en-us_topic_0000001173949734_li49877581186"></a><strong id="mrs_01_2339__en-us_topic_0000001173949734_b87573568297">hdfs dfs -mkdir -p /user/hetuserver/udf/data/externalFunctionsPlugin</strong></p>
</li><li id="mrs_01_2339__en-us_topic_0000001173949734_li75982017691">Upload the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b164662011143218">udf-test-0.0.1-SNAPSHOT</strong> directory to HDFS.<p id="mrs_01_2339__en-us_topic_0000001173949734_p10763318133019"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b436412563010">hdfs dfs -put udf-test-0.0.1-SNAPSHOT /user/hetuserver/udf/data/externalFunctionsPlugin</strong></p>
</li><li id="mrs_01_2339__en-us_topic_0000001173949734_li1519719311992">Change the owner and owner group of the directory.<p id="mrs_01_2339__en-us_topic_0000001173949734_p1488622615315"><a name="mrs_01_2339__en-us_topic_0000001173949734_li1519719311992"></a><a name="en-us_topic_0000001173949734_li1519719311992"></a><strong id="mrs_01_2339__en-us_topic_0000001173949734_b0295123313312">hdfs dfs -chown -R hetuserver:hadoop /user/hetuserver/udf/data</strong></p>
</li></ol>
</p></li><li id="mrs_01_2339__en-us_topic_0000001173949734_li13185112234"><span>Restart the <span id="mrs_01_2339__en-us_topic_0000001173949734_text580414234169">HetuEngine</span> compute instance.</span></li></ol>
</div>
<div class="section" id="mrs_01_2339__en-us_topic_0000001173949734_section1480282811211"><h4 class="sectiontitle">Verifying Function Plugins</h4><ol id="mrs_01_2339__en-us_topic_0000001173949734_ol17173511919"><li id="mrs_01_2339__en-us_topic_0000001173949734_li1071335592"><span>Log in to the node where the client is installed and perform security authentication.</span><p><p id="mrs_01_2339__en-us_topic_0000001173949734_p176194323918"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b561983210913">cd /opt/client</strong></p>
<p id="mrs_01_2339__en-us_topic_0000001173949734_p1620832399"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b862018321399">source bigdata_env</strong></p>
<p id="mrs_01_2339__en-us_topic_0000001173949734_p862012329916"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b1941911763312">kinit </strong><em id="mrs_01_2339__en-us_topic_0000001173949734_i242421783315">HetuEngine user</em></p>
<p id="mrs_01_2339__en-us_topic_0000001173949734_p69951917121014"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b16290175491013">hetu-cli --catalog hive --schema default</strong></p>
</p></li><li id="mrs_01_2339__en-us_topic_0000001173949734_li43281340295"><span>Verify the function plugins.</span><p><ol type="a" id="mrs_01_2339__en-us_topic_0000001173949734_ol62163762013"><li id="mrs_01_2339__en-us_topic_0000001173949734_li16619153817205">Query a table. This example assumes that a table named test1 containing the following data already exists.<p id="mrs_01_2339__en-us_topic_0000001173949734_p9202134252019"><a name="mrs_01_2339__en-us_topic_0000001173949734_li16619153817205"></a><a name="en-us_topic_0000001173949734_li16619153817205"></a><strong id="mrs_01_2339__en-us_topic_0000001173949734_b12829145191318">select * from test1;</strong></p>
<pre class="screen" id="mrs_01_2339__en-us_topic_0000001173949734_screen537115486135">select * from test1;
  name  | price
--------|-------
 apple  | 17.8
 orange | 25.0
(2 rows)</pre>
</li><li id="mrs_01_2339__en-us_topic_0000001173949734_li68718132115">Return the average value.<p id="mrs_01_2339__en-us_topic_0000001173949734_p128634517147"><a name="mrs_01_2339__en-us_topic_0000001173949734_li68718132115"></a><a name="en-us_topic_0000001173949734_li68718132115"></a><strong id="mrs_01_2339__en-us_topic_0000001173949734_b16861457142">select avg_double(price) from test1;</strong></p>
<pre class="screen" id="mrs_01_2339__en-us_topic_0000001173949734_screen13150124391514">select avg_double(price) from test1;
 _col0
-------
 21.4
(1 row)</pre>
</li><li id="mrs_01_2339__en-us_topic_0000001173949734_li19823122132117">Return the value of the input integer plus 2.<p id="mrs_01_2339__en-us_topic_0000001173949734_p09516256213"><a name="mrs_01_2339__en-us_topic_0000001173949734_li19823122132117"></a><a name="en-us_topic_0000001173949734_li19823122132117"></a><strong id="mrs_01_2339__en-us_topic_0000001173949734_b5565145181717">select add_two(4);</strong></p>
<pre class="screen" id="mrs_01_2339__en-us_topic_0000001173949734_screen184834474223">select add_two(4);
 _col0
-------
 6
(1 row)</pre>
</li></ol>
</p></li></ol>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_2338.html">Function &amp; UDF Development and Application</a></div>
</div>
</div>