You can define custom functions to extend SQL statements when built-in functions do not meet your requirements. These functions are called user-defined functions (UDFs).
This section describes how to develop and apply Hive UDFs.
This sample implements one Hive UDF described in the following table.
Name | Description |
---|---|
AutoAddOne | Adds 1 to the input value and returns the result. |
UDFs, UDAFs, and UDTFs currently do not support complex data types other than the preceding ones.
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.test.udf</groupId>
    <artifactId>udf-test</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <dependencies>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-exec</artifactId>
            <version>3.1.1</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <artifactId>maven-shade-plugin</artifactId>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <artifactId>maven-resources-plugin</artifactId>
                <executions>
                    <execution>
                        <id>copy-resources</id>
                        <phase>package</phase>
                        <goals>
                            <goal>copy-resources</goal>
                        </goals>
                        <configuration>
                            <outputDirectory>${project.build.directory}/</outputDirectory>
                            <resources>
                                <resource>
                                    <directory>src/main/resources/</directory>
                                    <filtering>false</filtering>
                                </resource>
                            </resources>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
import org.apache.hadoop.hive.ql.exec.UDF;

/**
 * AutoAddOne
 *
 * @since 2020-08-24
 */
public class AutoAddOne extends UDF {
    public int evaluate(int data) {
        return data + 1;
    }
}
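Before packaging, the logic of the evaluate method can be checked locally without a cluster. The following is a minimal sketch, not part of the sample project: the class name AutoAddOneLocalCheck is hypothetical, and the class deliberately omits "extends UDF" so that it compiles without the hive-exec dependency. It mirrors the body of AutoAddOne.evaluate and also shows that an int input at the upper bound wraps around, because Java int arithmetic overflows silently.

```java
// Hypothetical local check; mirrors AutoAddOne.evaluate without depending on hive-exec.
public class AutoAddOneLocalCheck {

    // Same body as AutoAddOne.evaluate in the sample above.
    static int evaluate(int data) {
        return data + 1;
    }

    public static void main(String[] args) {
        // Ordinary input: 1 + 1
        System.out.println(evaluate(1)); // prints 2

        // Boundary input: int addition wraps around on overflow
        System.out.println(evaluate(Integer.MAX_VALUE) == Integer.MIN_VALUE); // prints true
    }
}
```

If wrap-around is not acceptable for your data, one option is to replace the addition with Math.addExact, which throws an ArithmeticException on overflow instead of returning a wrapped value.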
In the udf.properties configuration file, register each UDF on its own line in the "Function_name Class_path" format.
The following example registers four Hive UDFs in the udf.properties configuration file:
booleanudf io.hetu.core.hive.dynamicfunctions.examples.udf.BooleanUDF
shortudf io.hetu.core.hive.dynamicfunctions.examples.udf.ShortUDF
byteudf io.hetu.core.hive.dynamicfunctions.examples.udf.ByteUDF
intudf io.hetu.core.hive.dynamicfunctions.examples.udf.IntUDF
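A malformed line in udf.properties is easier to catch before the file is uploaded. The sketch below is one possible pre-upload check and does not reflect how HetuEngine itself reads the file. The class UdfPropertiesCheck and its parse method are hypothetical names, and two behaviors are assumptions made for this sketch: blank lines are skipped, and a function name that appears twice is rejected.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pre-upload check for the "Function_name Class_path" line format.
public class UdfPropertiesCheck {

    // Returns function name -> class path, preserving the order of the lines.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> registry = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // assumption: blank lines are ignored
            }
            // Exactly two whitespace-separated fields are expected per line.
            String[] parts = line.split("\\s+");
            if (parts.length != 2) {
                throw new IllegalArgumentException(
                        "Expected \"Function_name Class_path\": " + line);
            }
            // assumption: registering the same function name twice is an error
            if (registry.put(parts[0], parts[1]) != null) {
                throw new IllegalArgumentException("Duplicate function name: " + parts[0]);
            }
        }
        return registry;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "booleanudf io.hetu.core.hive.dynamicfunctions.examples.udf.BooleanUDF",
                "intudf io.hetu.core.hive.dynamicfunctions.examples.udf.IntUDF");
        // Print each registration as "name -> class"
        parse(lines).forEach((name, cls) -> System.out.println(name + " -> " + cls));
    }
}
```

To check a real file, the lines could be read with java.nio.file.Files.readAllLines and passed to parse.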
To use an existing Hive UDF in HetuEngine, upload the UDF function package, the udf.properties file, and the configuration files on which the UDF depends to the specified HDFS directory (for example, /user/hetuserver/udf/), and then restart the HetuEngine compute instance.
cd /opt/client
Enter the password as prompted.
hdfs dfs -mkdir -p /user/hetuserver/udf/data/externalFunctions
hdfs dfs -put ./<Configuration files on which the UDF depends> /user/hetuserver/udf/data
hdfs dfs -put ./udf.properties /user/hetuserver/udf
hdfs dfs -put ./<UDF function package> /user/hetuserver/udf/data/externalFunctions
Use a client to access a Hive UDF:
select AutoAddOne(1);
 _col0
-------
     2
(1 row)