[HIVE] The Hive big-data framework: user-defined functions (UDF)

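This article introduces the three types of user-defined functions in Hive: UDF (one-to-one), UDAF (many-to-one), and UDTF (one-to-many). It then explains how to implement a custom UDF by extending the UDF class and following the rules for the evaluate method, and walks through the workflow: packaging, adding the Jar to the ClassPath, creating the function, and testing.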

User-defined function types in Hive

UDF (User-Defined Function)

One-to-one:
Takes the values of one row and returns one value for that row, for example: substring

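UDAF (User-Defined Aggregate Function)

Many-to-one:
Takes values from multiple rows and returns a single value, for example the aggregate function max
Usually used together with GROUP BY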

UDTF (User-Defined Table-Generating Function)

One-to-many:
Takes one value and returns multiple values, for example:
ip: province, city, district
2293@qq.com: QQ number, mailbox type
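To make the three shapes concrete, here is a minimal sketch in plain Java. It does not use any Hive API; the class name FunctionShapesSketch and its three methods are hypothetical and only mimic the input/output cardinality of each function type.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class FunctionShapesSketch {

    // UDF shape (one-to-one): one input row yields one output value.
    public static String substringLike(String value, int begin, int end) {
        return value.substring(begin, end);
    }

    // UDAF shape (many-to-one): many input rows collapse into one value.
    public static int maxLike(List<Integer> rows) {
        return Collections.max(rows);
    }

    // UDTF shape (one-to-many): one input value expands into several values.
    // Assumes the input looks like "account@provider.tld".
    public static List<String> splitEmail(String email) {
        int at = email.indexOf('@');
        String account = email.substring(0, at);
        String provider = email.substring(at + 1, email.indexOf('.', at));
        return Arrays.asList(account, provider);
    }

    public static void main(String[] args) {
        System.out.println(substringLike("hive_udf", 0, 4));   // hive
        System.out.println(maxLike(Arrays.asList(3, 9, 4)));   // 9
        System.out.println(splitEmail("2293@qq.com"));         // [2293, qq]
    }
}
```

In Hive itself these shapes map to different base classes and lifecycles; the sketch only shows how many values go in and come out.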

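Implementing a user-defined function: UDF

(1) Extend the UDF class
(2) Method rules (taken from the official Hive documentation)
a. Implement one or more methods named evaluate: the class must provide at least one method, and it must be named evaluate. Overloads with different parameter lists are allowed.
b. evaluate should never be a void method: the method must return a value (it may return null).
c. Parameter and return types: Java types or Hadoop types.
Hadoop types are recommended, because Hive then does not need to convert data types when it runs the function,
which can make execution faster.

Examples: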
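Rules a and b can be illustrated without a Hive installation. The sketch below assumes only that an evaluate method is looked up by name and argument types at run time; the classes EvaluateDispatchSketch and StripUdf and the helper call are hypothetical, use String instead of Text so that the JDK alone is enough, and are not Hive's own resolver.

```java
import java.lang.reflect.Method;

public class EvaluateDispatchSketch {

    // Hypothetical stand-in for a UDF class. It does not extend Hive's UDF,
    // so the sketch runs with the JDK alone.
    public static class StripUdf {
        // One argument: remove double quotes.
        public String evaluate(String value) {
            return evaluate(value, "\"");
        }

        // Two arguments: remove every occurrence of an arbitrary substring.
        public String evaluate(String value, String toRemove) {
            if (value == null || toRemove == null) {
                return null;
            }
            return value.replace(toRemove, "");
        }
    }

    // Pick the public method named "evaluate" whose parameter types match the
    // (non-null) arguments exactly, reject void methods, then invoke it.
    public static Object call(Object udf, Object... args) {
        Class<?>[] types = new Class<?>[args.length];
        for (int i = 0; i < args.length; i++) {
            types[i] = args[i].getClass();
        }
        try {
            Method m = udf.getClass().getMethod("evaluate", types);
            if (m.getReturnType() == void.class) {
                throw new IllegalStateException("evaluate must not return void");
            }
            return m.invoke(udf, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no usable evaluate method", e);
        }
    }

    public static void main(String[] args) {
        StripUdf udf = new StripUdf();
        System.out.println(call(udf, "\"27.38.5.159\""));   // 27.38.5.159
        System.out.println(call(udf, "27.38.5.159", "."));  // 27385159
    }
}
```

The same idea carries over to a real UDF: to accept several arguments, declare an evaluate overload with that many parameters.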

package com.huadian.hive_UDF;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/**
 * @author 飞
 * @create 
 * Removes double quotes from a string
 */

public class quoteUDF extends UDF{
    public Text evaluate (Text ip){
        // A SQL NULL arrives as a Java null, so check before calling toString()
        String value = (ip == null) ? null : ip.toString();
        // NULL or blank input returns the placeholder "无ip" ("no IP")
        if (StringUtils.isBlank(value)){
            return new Text("无ip");
        }
        String ip1 = value.replace("\"","");
        return new Text(ip1);
    }
    // Local test
    public static void main(String[] args) {
        Text text = new Text( "\"27.38.5.159\"" );
        System.out.println(new quoteUDF().evaluate(text));
    }
}

package com.huadian.hive_UDF;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

/**
 * @author 飞
 * @create 
 * Converts an English-locale date-time string into a custom format
 */

public class TransUDF extends UDF {
    // Input format with English month abbreviations, e.g. 31/Aug/2015:00:04:37 +0800
    SimpleDateFormat in = new SimpleDateFormat("dd/MMM/yyyy:HH:mm:ss ZZZZZ", Locale.US);
    // Custom output format (rendered in the JVM's default time zone)
    SimpleDateFormat out = new SimpleDateFormat("yyyyMMddHHmmss");

    public Text evaluate(Text datetime_str) throws ParseException {
        // A SQL NULL arrives as a Java null, so check before calling toString()
        String value = (datetime_str == null) ? null : datetime_str.toString();
        // NULL or blank input returns the placeholder "无时间" ("no time")
        if (StringUtils.isBlank(value)) {
            return new Text("无时间");
        }
        String s = value.replace("\"","");
        Date date = in.parse(s);
        String format = out.format(date);
        return new Text(format);
    }
    // Local test
    public static void main(String[] args) throws ParseException {
        Text text = new Text("\"31/Aug/2015:00:04:37 +0800\"");
        System.out.println(new TransUDF().evaluate(text));
    }
}
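
The conversion logic of TransUDF can also be tried without Hadoop on the classpath. The following is a minimal sketch using java.time instead of SimpleDateFormat; the class TransLogicSketch and the method convert are hypothetical names, not part of the original code. One difference to be aware of: this version keeps the offset found in the input (+0800 here), whereas the SimpleDateFormat version above renders the result in the JVM's default time zone.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class TransLogicSketch {
    // Same input layout as TransUDF: day/month-abbreviation/year:time offset
    private static final DateTimeFormatter IN =
            DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.US);
    // Same output layout as TransUDF
    private static final DateTimeFormatter OUT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss", Locale.US);

    public static String convert(String raw) {
        // Mirror the UDF's placeholder for missing input ("无时间" means "no time")
        if (raw == null || raw.trim().isEmpty()) {
            return "无时间";
        }
        String cleaned = raw.replace("\"", "");
        // Throws DateTimeParseException if the text does not match IN
        OffsetDateTime parsed = OffsetDateTime.parse(cleaned, IN);
        return parsed.format(OUT);
    }

    public static void main(String[] args) {
        System.out.println(convert("\"31/Aug/2015:00:04:37 +0800\"")); // 20150831000437
    }
}
```

Keeping the string logic in a plain static method like this makes it easy to unit-test; the evaluate method of a UDF would then only wrap and unwrap the Text values.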

Workflow:

a) Package the code as a Jar: mvn package
b) Add the Jar to the ClassPath (the path below is a placeholder; use the actual location of the packaged Jar)

add jar /path/to/your-udf.jar;

c) Create the functions

create function db_evaluate.quote_udf as 'com.huadian.hive_UDF.quoteUDF';
create function db_evaluate.trans_udf as 'com.huadian.hive_UDF.TransUDF';

d) Test (the unqualified function names resolve when db_evaluate is the current database)

SELECT
    quote_udf(ip) AS ip,
    trans_udf(datetime_str) AS date_str
FROM
    tb_udf
LIMIT 5;

e) Drop the user-defined function

DROP FUNCTION db_evaluate.quote_udf;