Yang, Tong 3f5759eed2 MRS comp-lts 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2023-01-19 17:08:45 +00:00


<a name="mrs_01_2339"></a>
<h1 class="topictitle1">HetuEngine Function Plugin Development and Application</h1>
<div id="body32001227"><p id="mrs_01_2339__en-us_topic_0000001173949734_p124035816151">You can write custom functions that extend SQL to meet site-specific requirements. These functions are called user-defined functions (UDFs).</p>
<p id="mrs_01_2339__en-us_topic_0000001173949734_p08731157164015">This section describes how to develop and apply <span id="mrs_01_2339__en-us_topic_0000001173949734_text1782132315156">HetuEngine</span> function plugins.</p>
<div class="section" id="mrs_01_2339__en-us_topic_0000001173949734_section15835232192311"><h4 class="sectiontitle">Developing Function Plugins</h4><p id="mrs_01_2339__en-us_topic_0000001173949734_p1634031652410">This sample implements two function plugins described in the following table.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_2339__en-us_topic_0000001173949734_table48921452171810" frame="border" border="1" rules="all"><caption><b>Table 1 </b><span id="mrs_01_2339__en-us_topic_0000001173949734_text1024917591187">HetuEngine</span> function plugins</caption><thead align="left"><tr id="mrs_01_2339__en-us_topic_0000001173949734_row78931252151817"><th align="left" class="cellrowborder" valign="top" width="21.922192219221923%" id="mcps1.3.3.3.2.4.1.1"><p id="mrs_01_2339__en-us_topic_0000001173949734_p1117104181911"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b470572271010">Name</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="44.74447444744475%" id="mcps1.3.3.3.2.4.1.2"><p id="mrs_01_2339__en-us_topic_0000001173949734_p1311712419194"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b1488536112418">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.3.3.2.4.1.3"><p id="mrs_01_2339__en-us_topic_0000001173949734_p11117743192"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b22751638172413">Type</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_2339__en-us_topic_0000001173949734_row4893952161815"><td class="cellrowborder" valign="top" width="21.922192219221923%" headers="mcps1.3.3.3.2.4.1.1 "><p id="mrs_01_2339__en-us_topic_0000001173949734_p14230151021915">add_two</p>
</td>
<td class="cellrowborder" valign="top" width="44.74447444744475%" headers="mcps1.3.3.3.2.4.1.2 "><p id="mrs_01_2339__en-us_topic_0000001173949734_p82301710151914">Adds <strong id="mrs_01_2339__en-us_topic_0000001173949734_b9767161955017">2</strong> to the input integer and returns the result.</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.3.3.2.4.1.3 "><p id="mrs_01_2339__en-us_topic_0000001173949734_p223071061914">ScalarFunction</p>
</td>
</tr>
<tr id="mrs_01_2339__en-us_topic_0000001173949734_row20893165218180"><td class="cellrowborder" valign="top" width="21.922192219221923%" headers="mcps1.3.3.3.2.4.1.1 "><p id="mrs_01_2339__en-us_topic_0000001173949734_p14230131071910">avg_double</p>
</td>
<td class="cellrowborder" valign="top" width="44.74447444744475%" headers="mcps1.3.3.3.2.4.1.2 "><p id="mrs_01_2339__en-us_topic_0000001173949734_p1323011081917">Calculates the average value of a specified column. The column must be of the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b738616396017">double</strong> type.</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.3.3.2.4.1.3 "><p id="mrs_01_2339__en-us_topic_0000001173949734_p122301710181913">AggregationFunction</p>
</td>
</tr>
</tbody>
</table>
</div>
<ol id="mrs_01_2339__en-us_topic_0000001173949734_ol17107145732412"><li id="mrs_01_2339__en-us_topic_0000001173949734_li1910735752419"><span>Create a Maven project. Set <strong id="mrs_01_2339__en-us_topic_0000001173949734_b1575083692613">groupId</strong> to <strong id="mrs_01_2339__en-us_topic_0000001173949734_b3529143812265">com.test.udf</strong> and <strong id="mrs_01_2339__en-us_topic_0000001173949734_b13461341202620">artifactId</strong> to <strong id="mrs_01_2339__en-us_topic_0000001173949734_b116581843122613">udf-test</strong>. The two values can be customized based on the site requirements.</span></li><li id="mrs_01_2339__en-us_topic_0000001173949734_li216614715258"><span>Modify the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b116405472614">pom.xml</strong> file as follows:</span><p><pre class="screen" id="mrs_01_2339__en-us_topic_0000001173949734_screen8575112172616">&lt;project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"&gt;
    &lt;modelVersion&gt;4.0.0&lt;/modelVersion&gt;
    &lt;groupId&gt;com.test.udf&lt;/groupId&gt;
    &lt;artifactId&gt;udf-test&lt;/artifactId&gt;
    &lt;version&gt;0.0.1-SNAPSHOT&lt;/version&gt;
    &lt;packaging&gt;hetu-plugin&lt;/packaging&gt;
    &lt;dependencies&gt;
        &lt;dependency&gt;
            &lt;groupId&gt;com.google.guava&lt;/groupId&gt;
            &lt;artifactId&gt;guava&lt;/artifactId&gt;
            &lt;version&gt;26.0-jre&lt;/version&gt;
        &lt;/dependency&gt;
        &lt;dependency&gt;
            &lt;groupId&gt;io.hetu.core&lt;/groupId&gt;
            &lt;artifactId&gt;presto-spi&lt;/artifactId&gt;
            &lt;version&gt;1.2.0&lt;/version&gt;
            &lt;scope&gt;provided&lt;/scope&gt;
        &lt;/dependency&gt;
    &lt;/dependencies&gt;
    &lt;build&gt;
        &lt;plugins&gt;
            &lt;plugin&gt;
                &lt;groupId&gt;org.apache.maven.plugins&lt;/groupId&gt;
                &lt;artifactId&gt;maven-assembly-plugin&lt;/artifactId&gt;
                &lt;version&gt;2.4.1&lt;/version&gt;
                &lt;configuration&gt;
                    &lt;encoding&gt;UTF-8&lt;/encoding&gt;
                &lt;/configuration&gt;
            &lt;/plugin&gt;
            &lt;plugin&gt;
                &lt;groupId&gt;io.hetu&lt;/groupId&gt;
                &lt;artifactId&gt;presto-maven-plugin&lt;/artifactId&gt;
                &lt;version&gt;9&lt;/version&gt;
                &lt;extensions&gt;true&lt;/extensions&gt;
            &lt;/plugin&gt;
        &lt;/plugins&gt;
    &lt;/build&gt;
&lt;/project&gt;</pre>
</p></li><li id="mrs_01_2339__en-us_topic_0000001173949734_li12937184123518"><span>Create the implementation classes of the function plugins.</span><p><div class="p" id="mrs_01_2339__en-us_topic_0000001173949734_p3146151153518">1. Create the function plugin implementation class <strong id="mrs_01_2339__en-us_topic_0000001173949734_b5930624182712">com.hadoop.other.TestUDF4</strong>. The code is as follows:<pre class="screen" id="mrs_01_2339__en-us_topic_0000001173949734_screen153821947172615">public class TestUDF4 {
    @ScalarFunction("add_two")
    @SqlNullable
    @SqlType(StandardTypes.INTEGER)
    public static Long add2(@SqlNullable @SqlType(StandardTypes.INTEGER) Long i) {
        // The argument is nullable, so return NULL for NULL input instead of
        // throwing a NullPointerException when unboxing.
        return i == null ? null : i + 2;
    }
}</pre>
</div>
<div class="p" id="mrs_01_2339__en-us_topic_0000001173949734_p2010714491343">2. Create the function plugin implementation class <strong id="mrs_01_2339__en-us_topic_0000001173949734_b41411132112716">com.hadoop.other.AverageAggregation</strong>. The code is as follows:<pre class="screen" id="mrs_01_2339__en-us_topic_0000001173949734_screen19288053113312">@AggregationFunction("avg_double")
public class AverageAggregation
{
    @InputFunction
    public static void input(
            LongAndDoubleState state,
            @SqlType(StandardTypes.DOUBLE) double value)
    {
        state.setLong(state.getLong() + 1);
        state.setDouble(state.getDouble() + value);
    }

    @CombineFunction
    public static void combine(
            LongAndDoubleState state,
            LongAndDoubleState otherState)
    {
        state.setLong(state.getLong() + otherState.getLong());
        state.setDouble(state.getDouble() + otherState.getDouble());
    }

    @OutputFunction(StandardTypes.DOUBLE)
    public static void output(LongAndDoubleState state, BlockBuilder out)
    {
        long count = state.getLong();
        if (count == 0) {
            out.appendNull();
        }
        else {
            double value = state.getDouble();
            DOUBLE.writeDouble(out, value / count);
        }
    }
}</pre>
</div>
</p></li><li id="mrs_01_2339__en-us_topic_0000001173949734_li18502125612394"><span>Create the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b16251045182714">com.hadoop.other.LongAndDoubleState</strong> interface on which <strong id="mrs_01_2339__en-us_topic_0000001173949734_b598525122718">AverageAggregation</strong> depends.</span><p><pre class="screen" id="mrs_01_2339__en-us_topic_0000001173949734_screen62681047194315">public interface LongAndDoubleState extends AccumulatorState {
    long getLong();
    void setLong(long value);
    double getDouble();
    void setDouble(double value);
}</pre>
</p></li><li id="mrs_01_2339__en-us_topic_0000001173949734_li10874171674216"><span>Create the function plugin registration class <strong id="mrs_01_2339__en-us_topic_0000001173949734_b189886587272">com.hadoop.other.RegisterFunctionTestPlugin</strong>. The code is as follows:</span><p><pre class="screen" id="mrs_01_2339__en-us_topic_0000001173949734_screen106173814413">public class RegisterFunctionTestPlugin implements Plugin {
    @Override
    public Set&lt;Class&lt;?&gt;&gt; getFunctions() {
        return ImmutableSet.&lt;Class&lt;?&gt;&gt;builder()
                .add(TestUDF4.class)
                .add(AverageAggregation.class)
                .build();
    }
}</pre>
</p></li><li id="mrs_01_2339__en-us_topic_0000001173949734_li1181875812426"><span>Package the Maven project and obtain the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b8649201212281">udf-test-0.0.1-SNAPSHOT</strong> directory from the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b124658191284">target</strong> directory. The following figure shows the overall structure of the project.</span><p><p id="mrs_01_2339__en-us_topic_0000001173949734_p13239145782"><span><img id="mrs_01_2339__en-us_topic_0000001173949734_image6771650983" src="en-us_image_0000001295740088.png"></span></p>
</p></li></ol>
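<p>The three annotated methods of <strong>AverageAggregation</strong> can be read as a small lifecycle: <strong>input</strong> accumulates rows into a state, <strong>combine</strong> merges partial states, and <strong>output</strong> produces the final value. The following is a minimal plain-Java sketch of that lifecycle, written without any HetuEngine classes so that it can be run on its own. The class <strong>AvgDoubleSketch</strong>, its nested <strong>State</strong> class, and the two "worker" variables are hypothetical stand-ins introduced for illustration; they are not part of the plugin or of the HetuEngine API, and the way the engine actually distributes rows is an assumption here.</p>

```java
// Plain-Java sketch of the input/combine/output lifecycle; no HetuEngine classes are used.
final class AvgDoubleSketch {
    /** Hypothetical stand-in for LongAndDoubleState: a row count and a running sum. */
    static final class State {
        long count;
        double sum;
    }

    /** Mirrors input(): called once per row to update the state. */
    static void input(State state, double value) {
        state.count += 1;
        state.sum += value;
    }

    /** Mirrors combine(): merges another partial state into this one. */
    static void combine(State state, State other) {
        state.count += other.count;
        state.sum += other.sum;
    }

    /** Mirrors output(): returns null (SQL NULL) when no rows were seen. */
    static Double output(State state) {
        return state.count == 0 ? null : state.sum / state.count;
    }

    public static void main(String[] args) {
        // Assumption for illustration: rows are split across two partial states.
        State workerA = new State();
        input(workerA, 1.0);
        input(workerA, 2.0);

        State workerB = new State();
        input(workerB, 6.0);

        combine(workerA, workerB);
        System.out.println(output(workerA));      // prints 3.0
        System.out.println(output(new State()));  // prints null
    }
}
```

<p>Keeping the count and the sum separately, rather than a running average, is what allows partial states to be merged by simple addition.</p>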
</div>
<div class="section" id="mrs_01_2339__en-us_topic_0000001173949734_section1129611717017"><h4 class="sectiontitle">Deploying Function Plugins</h4><p id="mrs_01_2339__en-us_topic_0000001173949734_p3289175316570">Before the deployment, ensure that:</p>
<ul id="mrs_01_2339__en-us_topic_0000001173949734_ul638885665718"><li id="mrs_01_2339__en-us_topic_0000001173949734_li059814445815">The <span id="mrs_01_2339__en-us_topic_0000001173949734_text32081445131214">HetuEngine</span> service is running properly.</li><li id="mrs_01_2339__en-us_topic_0000001173949734_li17388125614578">The HDFS and <span id="mrs_01_2339__en-us_topic_0000001173949734_text3726184820180">HetuEngine</span> clients have been installed on a cluster node, for example, in the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b934820314307">/opt/</strong><strong id="mrs_01_2339__en-us_topic_0000001173949734_b1334843163020"></strong><strong id="mrs_01_2339__en-us_topic_0000001173949734_b183481131153013">client</strong> directory.</li><li id="mrs_01_2339__en-us_topic_0000001173949734_li2248685582">A <span id="mrs_01_2339__en-us_topic_0000001173949734_text10882818115810">HetuEngine</span> user has been created. For details about how to create a user, see <a href="mrs_01_1714.html">Creating a HetuEngine User</a>.</li></ul>
<ol id="mrs_01_2339__en-us_topic_0000001173949734_ol5571198938"><li id="mrs_01_2339__en-us_topic_0000001173949734_li134514253306"><span>Upload the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b206671219216">udf-test-0.0.1-SNAPSHOT</strong> directory generated when the Maven project was packaged to any directory on the node where the client is installed.</span></li><li id="mrs_01_2339__en-us_topic_0000001173949734_li557298839"><span>Upload the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b123951859441">udf-test-0.0.1-SNAPSHOT</strong> directory to HDFS.</span><p><ol type="a" id="mrs_01_2339__en-us_topic_0000001173949734_ol572705215720"><li id="mrs_01_2339__en-us_topic_0000001173949734_li6351510685">Log in to the node where the client is installed and perform security authentication.<p id="mrs_01_2339__en-us_topic_0000001173949734_p125142315265"><a name="mrs_01_2339__en-us_topic_0000001173949734_li6351510685"></a><a name="en-us_topic_0000001173949734_li6351510685"></a><strong id="mrs_01_2339__en-us_topic_0000001173949734_b1224415186298">cd /opt/client</strong></p>
<p id="mrs_01_2339__en-us_topic_0000001173949734_p1919513397238"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b1819453992315">source bigdata_env</strong></p>
<p id="mrs_01_2339__en-us_topic_0000001173949734_p318635032614"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b161471123172910">kinit </strong><em id="mrs_01_2339__en-us_topic_0000001173949734_i195221345153218">HetuEngine user</em></p>
<p id="mrs_01_2339__en-us_topic_0000001173949734_p197091523133319">Enter the password as prompted. If this is the first authentication, change the password when prompted.</p>
</li><li id="mrs_01_2339__en-us_topic_0000001173949734_li49877581186">Create the following path in HDFS. If the path already exists, skip this step.<p id="mrs_01_2339__en-us_topic_0000001173949734_p89975516297"><a name="mrs_01_2339__en-us_topic_0000001173949734_li49877581186"></a><a name="en-us_topic_0000001173949734_li49877581186"></a><strong id="mrs_01_2339__en-us_topic_0000001173949734_b87573568297">hdfs dfs -mkdir -p /user/hetuserver/udf/data/externalFunctionsPlugin</strong></p>
</li><li id="mrs_01_2339__en-us_topic_0000001173949734_li75982017691">Upload the <strong id="mrs_01_2339__en-us_topic_0000001173949734_b164662011143218">udf-test-0.0.1-SNAPSHOT</strong> directory to HDFS.<p id="mrs_01_2339__en-us_topic_0000001173949734_p10763318133019"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b436412563010">hdfs dfs -put udf-test-0.0.1-SNAPSHOT /user/hetuserver/udf/data/externalFunctionsPlugin</strong></p>
</li><li id="mrs_01_2339__en-us_topic_0000001173949734_li1519719311992">Change the directory owner and owner group.<p id="mrs_01_2339__en-us_topic_0000001173949734_p1488622615315"><a name="mrs_01_2339__en-us_topic_0000001173949734_li1519719311992"></a><a name="en-us_topic_0000001173949734_li1519719311992"></a><strong id="mrs_01_2339__en-us_topic_0000001173949734_b0295123313312">hdfs dfs -chown -R hetuserver:hadoop /user/hetuserver/udf/data</strong></p>
</li></ol>
</p></li><li id="mrs_01_2339__en-us_topic_0000001173949734_li13185112234"><span>Restart the <span id="mrs_01_2339__en-us_topic_0000001173949734_text580414234169">HetuEngine</span> compute instance.</span></li></ol>
</div>
<div class="section" id="mrs_01_2339__en-us_topic_0000001173949734_section1480282811211"><h4 class="sectiontitle">Verifying Function Plugins</h4><ol id="mrs_01_2339__en-us_topic_0000001173949734_ol17173511919"><li id="mrs_01_2339__en-us_topic_0000001173949734_li1071335592"><span>Log in to the node where the client is installed, perform security authentication, and start the HetuEngine client.</span><p><p id="mrs_01_2339__en-us_topic_0000001173949734_p176194323918"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b561983210913">cd /opt/client</strong></p>
<p id="mrs_01_2339__en-us_topic_0000001173949734_p1620832399"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b862018321399">source bigdata_env</strong></p>
<p id="mrs_01_2339__en-us_topic_0000001173949734_p862012329916"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b1941911763312">kinit </strong><em id="mrs_01_2339__en-us_topic_0000001173949734_i242421783315">HetuEngine user</em></p>
<p id="mrs_01_2339__en-us_topic_0000001173949734_p69951917121014"><strong id="mrs_01_2339__en-us_topic_0000001173949734_b16290175491013">hetu-cli --catalog hive --schema default</strong></p>
</p></li><li id="mrs_01_2339__en-us_topic_0000001173949734_li43281340295"><span>Verify function plugins.</span><p><ol type="a" id="mrs_01_2339__en-us_topic_0000001173949734_ol62163762013"><li id="mrs_01_2339__en-us_topic_0000001173949734_li16619153817205">Query a table.<p id="mrs_01_2339__en-us_topic_0000001173949734_p9202134252019"><a name="mrs_01_2339__en-us_topic_0000001173949734_li16619153817205"></a><a name="en-us_topic_0000001173949734_li16619153817205"></a><strong id="mrs_01_2339__en-us_topic_0000001173949734_b12829145191318">select * from test1;</strong></p>
<pre class="screen" id="mrs_01_2339__en-us_topic_0000001173949734_screen537115486135">select * from test1;
name | price
--------|-------
apple | 17.8
orange | 25.0
(2 rows)</pre>
</li><li id="mrs_01_2339__en-us_topic_0000001173949734_li68718132115">Return the average value.<p id="mrs_01_2339__en-us_topic_0000001173949734_p128634517147"><a name="mrs_01_2339__en-us_topic_0000001173949734_li68718132115"></a><a name="en-us_topic_0000001173949734_li68718132115"></a><strong id="mrs_01_2339__en-us_topic_0000001173949734_b16861457142">select avg_double(price) from test1;</strong></p>
<pre class="screen" id="mrs_01_2339__en-us_topic_0000001173949734_screen13150124391514">select avg_double(price) from test1;
_col0
-------
21.4
(1 row)</pre>
</li><li id="mrs_01_2339__en-us_topic_0000001173949734_li19823122132117">Return the value of the input integer plus 2.<p id="mrs_01_2339__en-us_topic_0000001173949734_p09516256213"><a name="mrs_01_2339__en-us_topic_0000001173949734_li19823122132117"></a><a name="en-us_topic_0000001173949734_li19823122132117"></a><strong id="mrs_01_2339__en-us_topic_0000001173949734_b5565145181717">select add_two(4);</strong></p>
<pre class="screen" id="mrs_01_2339__en-us_topic_0000001173949734_screen184834474223">select add_two(4);
_col0
-------
6
(1 row)</pre>
</li></ol>
</p></li></ol>
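<p>The expected values in the outputs above can be cross-checked by hand. The following plain-Java sketch repeats the same arithmetic outside HetuEngine: 4 + 2 for <strong>add_two</strong>, and the mean of the two prices in <strong>test1</strong> for <strong>avg_double</strong>. The class and method names are hypothetical and exist only for this illustration; the average is compared with a small tolerance because it is computed in floating point.</p>

```java
// Plain-Java cross-check of the expected query results; not a HetuEngine API.
final class ExpectedResultsSketch {
    /** Same arithmetic as add_two: the input plus 2. */
    static long addTwo(long i) {
        return i + 2;
    }

    /** Same arithmetic as avg_double: sum divided by row count (assumes at least one value). */
    static double avgDouble(double... values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        System.out.println(addTwo(4)); // prints 6

        // Prices of apple and orange from the test1 query output.
        double avg = avgDouble(17.8, 25.0);
        System.out.println(Math.abs(avg - 21.4) < 1e-9); // prints true
    }
}
```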
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_2338.html">Function &amp; UDF Development and Application</a></div>
</div>
</div>