Hive函数大全

[TOC]

Relational Operators：关系运算符

=

Operator	OperandTypes	Description
A = B	所有基本类型	如果表达式 a 等于表达式 b，则为 TRUE，否则为 FALSE。

0: jdbc:hive2://192.168.150.150:10000> select (1=3);
+--------+
|  _c0   |
+--------+
| false  |
+--------+
1 row selected (0.484 seconds)

Mathematical Functions：数学函数

在 Hive 中支持一下内置的数学函数；当参数为 NULL 时，大多数返回 NULL 。

round()

ReturnType	Name(Signature)	Description
Double	round(Double a)	返回 a 的四舍五入后 BigInt 的值
Double	round(Double a, Int d)	返回 a 四舍五入到小数点后 d 位的值。

0: jdbc:hive2://192.168.150.150:10000> select round(pi());
+------+
| _c0  |
+------+
| 3.0  |
+------+
1 row selected (0.085 seconds)
0: jdbc:hive2://192.168.150.150:10000> select round(pi(),2);
+-------+
|  _c0  |
+-------+
| 3.14  |
+-------+
1 row selected (0.478 seconds)

bround()

ReturnType	Name(Signature)	Description
Double	bround(Double a )	返回向最接近数字方向舍入的舍入模式，如果与两个相邻数字的距离相等，则向相邻的偶数舍入。如果舍弃部分左边的数字为奇数，则左边数字加1，如果为偶数，则左边数字不变。【也称为高斯舍入或银行家的舍入】
Double	bround(Double a , Int d)	返回 a 四舍五入到小数点后 d 位的值。

0: jdbc:hive2://192.168.150.150:10000> select bround(3.5);
+------+
| _c0  |
+------+
| 4    |
+------+
1 row selected (0.086 seconds)
0: jdbc:hive2://192.168.150.150:10000> select bround(2.5);
+------+
| _c0  |
+------+
| 2    |
+------+
1 row selected (0.076 seconds)
0: jdbc:hive2://192.168.150.150:10000> select bround(pi(), 5);
+----------+
|   _c0    |
+----------+
| 3.14159  |
+----------+
1 row selected (0.07 seconds)

floor()/ceil()

ReturnType	Name(Signature)	Description
BigInt	floor(Double a)	返回 a 向下取整的数值
BigInt	ceil(Double a)	返回 a 向上取整的数值

-----------------------------------------floor(Double)---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select floor(pi());
+------+
| _c0  |
+------+
| 3    |
+------+
1 row selected (0.091 seconds)
-----------------------------------------ceil(Double)---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select ceil(pi());
+------+
| _c0  |
+------+
| 4    |
+------+
1 row selected (0.078 seconds)

rand()

ReturnType	Name(Signature)	Description
Double	rand()	返回一个从0到1均匀分布的随机数(从一行到另一行不断变化)
Double	rand(Int seed)	指定 seed 将确保生成的随机数序列是确定的。

0: jdbc:hive2://192.168.150.150:10000> select rand();
+----------------------+
|         _c0          |
+----------------------+
| 0.14502120180326072  |
+----------------------+
1 row selected (0.087 seconds)
0: jdbc:hive2://192.168.150.150:10000> select rand(13);
+---------------------+
|         _c0         |
+---------------------+
| 0.7298032243379924  |
+---------------------+
1 row selected (0.072 seconds)
0: jdbc:hive2://192.168.150.150:10000> select rand(13);
+---------------------+
|         _c0         |
+---------------------+
| 0.7298032243379924  |
+---------------------+
1 row selected (0.489 seconds)

exp()

ReturnType	Name(Signature)	Description
Double	exp(Double a), exp(Decimal a)	返回e为底，a为指数，幂运算后的数值【其中 e 是自然对数的基数，在 Hive 0.13.0中添加了十进制版本。】

0: jdbc:hive2://192.168.150.150:10000> select exp(2);
+-------------------+
|        _c0        |
+-------------------+
| 7.38905609893065  |
+-------------------+
1 row selected (0.074 seconds)
0: jdbc:hive2://192.168.150.150:10000> select e() * e();
+---------------------+
|         _c0         |
+---------------------+
| 7.3890560989306495  |
+---------------------+
1 row selected (0.473 seconds)

log()

ReturnType	Name(Signature)	Description
Double	ln(Double a), ln(Decimal a)	返回参数的自然对数值 a
Double	log10(Double a), log2(Decimal a)	返回参数 a 的以10为底的对数
Double	log2(Double a), log2(Decimal a)	返回参数的以2为底的对数
Double	log(Double/Decimal base, Double/Decimal a)	返回参数的以为底的对数

pow()

ReturnType	Name(Signature)	Description
Double	pow(Double a , Double p)	返回 a 为底数，p为指数的幂运算数值

0: jdbc:hive2://192.168.150.150:10000> select pow(2,3);
+------+
| _c0  |
+------+
| 8.0  |
+------+
1 row selected (0.574 seconds)
0: jdbc:hive2://192.168.150.150:10000> select pow(3,3);
+-------+
|  _c0  |
+-------+
| 27.0  |
+-------+
1 row selected (0.071 seconds)

sqrt()

ReturnType	Name(Signature)	Description
Double	sqrt(Double/Decimal a )	返回 a.Decimal 的平方根数值

0: jdbc:hive2://192.168.150.150:10000> select sqrt(3);
+---------------------+
|         _c0         |
+---------------------+
| 1.7320508075688772  |
+---------------------+
1 row selected (0.073 seconds)
0: jdbc:hive2://192.168.150.150:10000> select sqrt(4);
+------+
| _c0  |
+------+
| 2.0  |
+------+
1 row selected (0.077 seconds)

bin()/hex()/unhex()/conv()

ReturnType	Name(Signature)	Description
String	bin(BigInt a)	返回二进制格式的数字
String	hex( BigInt / String / Binary a)	如果参数为Int或者为二进制(Binary),返回数字 a 的16进制格式 \| 如果参数为String，则它将每个字符转换为其十六进制表示形式，并返回结果 STRING
Binary	unhex( String a)	反转十六进制。将每对字符解释为十六进制数，并将其转换为该数字的字节表示形式。
String	conv(BigInt/String num, Int from_base, Int to_base)	在不同的数字基数之间转换数字。返回数字 n 的字符串表示形式.【最小基数为2，最大基数为36。如果 from_ base 是负数，则 n 被视为有符号数。否则，n 被视为无符号，conv()以64位精度转换。】

----------------------------------------bin----------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select bin(4);
+------+
| _c0  |
+------+
| 100  |
+------+
1 row selected (0.483 seconds)
0: jdbc:hive2://192.168.150.150:10000> select bin(9);
+-------+
|  _c0  |
+-------+
| 1001  |
+-------+
1 row selected (0.071 seconds)
-----------------------------------------hex(Int)---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select hex(14);
+------+
| _c0  |
+------+
| E    |
+------+
1 row selected (0.468 seconds)
0: jdbc:hive2://192.168.150.150:10000> select hex(19);
+------+
| _c0  |
+------+
| 13   |
+------+
1 row selected (0.075 seconds)
-----------------------------------------hex(String)---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select hex('LOVE');
+-----------+
|    _c0    |
+-----------+
| 4C4F5645  |
+-----------+
1 row selected (0.087 seconds)
0: jdbc:hive2://192.168.150.150:10000> select hex('L');
+------+
| _c0  |
+------+
| 4C   |
+------+
1 row selected (0.47 seconds)
-----------------------------------------unhex(String)---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select unhex('4C4F5645');
+-------+
|  _c0  |
+-------+
| LOVE  |
+-------+
1 row selected (0.076 seconds)
0: jdbc:hive2://192.168.150.150:10000> select unhex('4C');
+------+
| _c0  |
+------+
| L    |
+------+
1 row selected (0.144 seconds)
-----------------------------------------conv(String ...)---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select conv('a', 16,2);
+-------+
|  _c0  |
+-------+
| 1010  |
+-------+
1 row selected (0.143 seconds)
0: jdbc:hive2://192.168.150.150:10000> select conv('6E', 18,8);
+------+
| _c0  |
+------+
| 172  |
+------+
1 row selected (0.092 seconds)
-----------------------------------------conv(Int ...)---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select conv(-17, 18,-8);
+------+
| _c0  |
+------+
| -31  |
+------+
1 row selected (0.08 seconds)
-----------------------------------------conv(Int|String ...)---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select conv(10+'10'+'10',10,10);
+------+
| _c0  |
+------+
| 30   |
+------+
1 row selected (0.076 seconds)

abs()

ReturnType	Name(Signature)	Description
Double	abs(Double a)	返回 a 的绝对值

0: jdbc:hive2://192.168.150.150:10000> select abs(-10);
+------+
| _c0  |
+------+
| 10   |
+------+
1 row selected (0.101 seconds)
0: jdbc:hive2://192.168.150.150:10000> select abs(10);
+------+
| _c0  |
+------+
| 10   |
+------+
1 row selected (0.085 seconds)

pmod()

ReturnType	Name(Signature)	Description
Int/Double	pmod(Int/Double a, Int/Double b)	返回 a 余 b 的结果

1 row selected (0.092 seconds)
0: jdbc:hive2://192.168.150.150:10000> select pmod(10, 5);
+------+
| _c0  |
+------+
| 0    |
+------+
1 row selected (0.503 seconds)
0: jdbc:hive2://192.168.150.150:10000> select pmod(-10, 5);
+------+
| _c0  |
+------+
| 0    |
+------+
1 row selected (0.084 seconds)
0: jdbc:hive2://192.168.150.150:10000> select pmod(-8, 5);
+------+
| _c0  |
+------+
| 2    |
+------+
1 row selected (0.086 seconds)

—–[反]三角函数：sin() / asin() / cos() / acos() / tan() / atan()

ReturnType	Name(Signature)	Description
Double	sin(Double / Decimal a)	返回 a 的正弦值【a 以弧度表示】
Double	asin(Double / Decimal a)
Double	cos(Double / Decimal a)
Double	acos(Double / Decimal a)
Double	tan(Double / Decimal a)
Double	atan(Double / Decimal a)

degrees()/radians()

ReturnType	Name(Signature)	Description
Double	degrees( Double / Decimal a)	返回将 a 的值从弧度转换为度数【弧度>>>度数】
Double	radians( Double / Decimal a)	返回将 a 的值从度数转换为弧度【度数>>>弧度】

-----------------------------------------degrees()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select degrees(60);
+---------------------+
|         _c0         |
+---------------------+
| 3437.7467707849396  |
+---------------------+
1 row selected (0.506 seconds)
0: jdbc:hive2://192.168.150.150:10000> select degrees(10);
+--------------------+
|        _c0         |
+--------------------+
| 572.9577951308232  |
+--------------------+
1 row selected (0.107 seconds)
-----------------------------------------radians()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select radians(10);
+----------------------+
|         _c0          |
+----------------------+
| 0.17453292519943295  |
+----------------------+
1 row selected (0.079 seconds)
0: jdbc:hive2://192.168.150.150:10000> select radians(60);
+---------------------+
|         _c0         |
+---------------------+
| 1.0471975511965976  |
+---------------------+
1 row selected (0.063 seconds)

positive()

ReturnType	Name(Signature)	Description
Int / Double	positive( Int / Double a)	返回数值 a

0: jdbc:hive2://192.168.150.150:10000> select positive(60);
+------+
| _c0  |
+------+
| 60   |
+------+
1 row selected (0.083 seconds)
0: jdbc:hive2://192.168.150.150:10000> select positive(10);
+------+
| _c0  |
+------+
| 10   |
+------+
1 row selected (0.072 seconds)

negative()

ReturnType	Name(Signature)	Description
Int / Double	negative(Int / Double a)	返回数值 a 的相反数

0: jdbc:hive2://192.168.150.150:10000> select negative(10);
+------+
| _c0  |
+------+
| -10  |
+------+
1 row selected (0.077 seconds)
0: jdbc:hive2://192.168.150.150:10000> select negative(-10);
+------+
| _c0  |
+------+
| 10   |
+------+
1 row selected (0.498 seconds)

sign()

ReturnType	Name(Signature)	Description
Double / Int	sign(Double / Decimal a)	返回1.0，则表示 a 为正数\| 返回-1.0，则表示 a 为负数

0: jdbc:hive2://192.168.150.150:10000> select sign(-10);
+-------+
|  _c0  |
+-------+
| -1.0  |
+-------+
1 row selected (0.526 seconds)
0: jdbc:hive2://192.168.150.150:10000> select sign(10);
+------+
| _c0  |
+------+
| 1.0  |
+------+
1 row selected (0.077 seconds)

e()

ReturnType	Name(Signature)	Description
Double	e()	返回自然数 e 的值

0: jdbc:hive2://192.168.150.150:10000> select e();
+--------------------+
|        _c0         |
+--------------------+
| 2.718281828459045  |
+--------------------+
1 row selected (0.097 seconds)

pi()

Operator	OperandTypes	Description
Double	pi()	返回圆周率 π 的值

0: jdbc:hive2://192.168.150.150:10000> select pi();
+--------------------+
|        _c0         |
+--------------------+
| 3.141592653589793  |
+--------------------+
1 row selected (0.487 seconds)

factorial()

Operator	OperandTypes	Description
BigInt	factorial( Int a)	返回 a 的阶乘【a 值的有效返回为[0…20]】

0: jdbc:hive2://192.168.150.150:10000> select factorial(5);
+------+
| _c0  |
+------+
| 120  |
+------+
1 row selected (0.117 seconds)
0: jdbc:hive2://192.168.150.150:10000> select factorial(20);
+----------------------+
|         _c0          |
+----------------------+
| 2432902008176640000  |
+----------------------+
1 row selected (0.104 seconds)

cbrt()

Operator	OperandTypes	Description
Double	cbrt( Double a)	返回 a 的立方根

0: jdbc:hive2://192.168.150.150:10000> select cbrt(8);
+------+
| _c0  |
+------+
| 2.0  |
+------+
1 row selected (0.485 seconds)
0: jdbc:hive2://192.168.150.150:10000> select cbrt(27);
+------+
| _c0  |
+------+
| 3.0  |
+------+
1 row selected (0.087 seconds)

shiftleft() / shiftright() / shiftrightunsugned()

Operator	OperandTypes	Description
Int \| BigInt	shiftleft( TinyInt \| SmallInt \| Int a, Int b) \| shiftleft( BigInt a, Int b)	返回 a 向左位移 b 次的数值
Int \| BigInt	shiftright( TinyInt \| SmallInt \| Int a, Int b) \| shiftright(TinyInt \| SmallInt \| Int a, Int b)	返回 a 向右位移 b 次的数值
Int \| BigInt	shiftrightunsigned( TinyInt \| SmallInt \| Int a, Int b) \| shiftrightunsigned(TinyInt \| SmallInt \| Int a, Int b)	返回 a 无符号向右位移 b 次的数值

-----------------------------------------shiftleft()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select shiftleft(2,1);
+------+
| _c0  |
+------+
| 4    |
+------+
1 row selected (0.11 seconds)
0: jdbc:hive2://192.168.150.150:10000> select shiftleft(2,3);
+------+
| _c0  |
+------+
| 16   |
+------+
1 row selected (0.519 seconds)
-----------------------------------------shiftright()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select shiftright(16,3);
+------+
| _c0  |
+------+
| 2    |
+------+
1 row selected (0.089 seconds)
0: jdbc:hive2://192.168.150.150:10000> select shiftright(4,1);
+------+
| _c0  |
+------+
| 2    |
+------+
1 row selected (0.075 seconds)

-----------------------------------------shiftrightunsigned()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select shiftrightunsigned(16,3);
+------+
| _c0  |
+------+
| 2    |
+------+
1 row selected (0.479 seconds)
0: jdbc:hive2://192.168.150.150:10000> select shiftrightunsigned(4,1);
+------+
| _c0  |
+------+
| 2    |
+------+
1 row selected (0.105 seconds)

greatest() / least()

Operator	OperandTypes	Description
T	greatest(T v1,T v2, …)	返回 T 中的最大值【若 T 中存在null，则返回null】
T	least(T v1,T v2, …)	返回 T 中的最小值【若 T 中存在null，则返回null】

-----------------------------------------greatest()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select greatest(3,null,4,1);
+-------+
|  _c0  |
+-------+
| NULL  |
+-------+
1 row selected (0.088 seconds)
0: jdbc:hive2://192.168.150.150:10000> select greatest(3,4,1);
+------+
| _c0  |
+------+
| 4    |
+------+
1 row selected (0.076 seconds)
-----------------------------------------least()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select least(3,null,4,1);
+-------+
|  _c0  |
+-------+
| NULL  |
+-------+
1 row selected (0.492 seconds)
0: jdbc:hive2://192.168.150.150:10000> select least(3,4,1);
+------+
| _c0  |
+------+
| 1    |
+------+
1 row selected (0.078 seconds)

—–☆width_bucket()☆

Operator	OperandTypes	Description
Int	width_bucket(Numeric expr, Numeric min_value, Numeric max_value, Int num_buckets)	对于给定的表达式，返回该表达式的值在计算后将落入的桶数【expr ：为其创建直方图的表达式。此表达式必须计算为数值或日期时间值或可隐式转换为数值或日期时间值的值。如果 expr 计算结果为 null，则表达式返回 null。\| min_value 和 max_value 是解析到 expr 可接受范围的端点的表达式。这两个表达式还必须计算为数值或日期时间值，两者都不能计算为空。\| num_buckets ：是解析为指示存储桶数的常量的表达式。此表达式必须计算为正整数。】

Collection Functions：集合函数

Hive 支持以下内置的集合函数：

—–size()

Operator	OperandTypes	Description
Int	size( Map<K,V> )	返回Map中键值对类型的元素数量
Int	size( Array )	返回Array中数组类型的元素数量

-----------------------------------------size(Map)---------------------------------------
select map_keys(Map("name":"Tony"));   "name":"Tony", "age":18
-----------------------------------------size(Array)---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select size(Array(1,2,3,4,5,6,8,9));
+------+
| _c0  |
+------+
| 8    |
+------+
1 row selected (0.483 seconds)
0: jdbc:hive2://192.168.150.150:10000> select size(Array(1,2,3,4));
+------+
| _c0  |
+------+
| 4    |
+------+
1 row selected (0.081 seconds)

—–map_keys() / map_values()

Operator	OperandTypes	Description
array	map_keys(Map<K.V>)	返回一个映射键的无须数组
array	map_values(Map<K.V>)	返回一个映射值的无须数组

1
2
3

-----------------------------------------map_keys()---------------------------------------

-----------------------------------------map_values()---------------------------------------

array_contains()

Operator	OperandTypes	Description
Boolean	array_contains(Array, value)	如果 Array 中包含 value，则返回 Ture，否则 Flase

0: jdbc:hive2://192.168.150.150:10000> select array_contains(Array(1,2,3,4),1);
+-------+
|  _c0  |
+-------+
| true  |
+-------+
1 row selected (0.086 seconds)
0: jdbc:hive2://192.168.150.150:10000> select array_contains(Array(1,2,3,4),5);
+--------+
|  _c0   |
+--------+
| false  |
+--------+
1 row selected (0.11 seconds)

sort_array(Array)

Operator	OperandTypes	Description
Array	sort_array(Array)	返回排序好后的Array

0: jdbc:hive2://192.168.150.150:10000> select sort_array(Array(1,2,5,3,4));
+--------------+
|     _c0      |
+--------------+
| [1,2,3,4,5]  |
+--------------+
1 row selected (0.132 seconds)
0: jdbc:hive2://192.168.150.150:10000> select sort_array(Array(14,5,89,32,48));
+------------------+
|       _c0        |
+------------------+
| [5,14,32,48,89]  |
+------------------+
1 row selected (0.092 seconds)

Type Conversion Functions：类型转换函数

在 Hive 中支持以下类型转换函数:

—–binary()

Operator	OperandTypes	Description
Binary	binary(string\|binary a)	返回将 a 强制转换为二进制的数据

1 2	`select binary('我爱你'); select bin(111);`

cast()

Operator	OperandTypes	Description
Expected “=” to follow “type”	cast(expr as )	返回将表达式 expr 的结果转换为 type 的结果

0: jdbc:hive2://192.168.150.150:10000> select cast('12345' as int);
+--------+
|  _c0   |
+--------+
| 12345  |
+--------+
1 row selected (0.08 seconds)
0: jdbc:hive2://192.168.150.150:10000> select cast('12345' as float);
+----------+
|   _c0    |
+----------+
| 12345.0  |
+----------+
1 row selected (0.534 seconds)

Date Functions：日期函数

Hive 支持以下内置的日期函数:

from_unixtime()

Operator	OperandTypes	Description
String	from_unixtime(BigInt unixtime[, String format])	返回时间戳(unixtime[单位：秒]) 默认转换为日期(精确到秒)，若指定format格式，则依据格式转换

0: jdbc:hive2://192.168.150.150:10000> select from_unixtime(1646875942);
+----------------------+
|         _c0          |
+----------------------+
| 2022-03-10 01:32:22  |
+----------------------+
1 row selected (0.075 seconds)
0: jdbc:hive2://192.168.150.150:10000> select from_unixtime(1646875942,'yyyy-MM-dd');
+-------------+
|     _c0     |
+-------------+
| 2022-03-10  |
+-------------+
1 row selected (0.081 seconds)
0: jdbc:hive2://192.168.150.150:10000> select from_unixtime(1646875942,'yyyy/MM');
+----------+
|   _c0    |
+----------+
| 2022/03  |
+----------+
1 row selected (0.076 seconds)

unix_timestamp()

Operator	OperandTypes	Description
BigInt	unix_timestamp()	默认返回获取当前 Unix 时间戳【单位：秒】;这个函数不是确定的，它的值对于查询执行的范围也不固定，因此妨碍了查询的适当优化——自从2.0以来，这个函数一直被弃用，而采用 current_ timestamp 常量。
BigInt	unix_timestamp(String date)	默认返回获取 date 的 Unix 时间戳【单位：秒】；将格式为 yyyy-MM-dd HH: mm: ss 的时间字符串 date 转换为 Unix 时间戳
BigInt	unix_timestamp(String date,String format)	默认返回获取指定 format 格式的 date的 Unix 时间戳【单位：秒】；

-----------------------------------------unix_timestamp()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select unix_timestamp();
+-------------+
|     _c0     |
+-------------+
| 1623942105  |
+-------------+
1 row selected (0.493 seconds)
-----------------------------------------unix_timestamp(String date))---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select unix_timestamp('2022-03-10 01:32:22');
+-------------+
|     _c0     |
+-------------+
| 1646875942  |
+-------------+
1 row selected (0.087 seconds)
-----------------------------------------unix_timestamp(String  date,String format)-------------------------
0: jdbc:hive2://192.168.150.150:10000> select unix_timestamp('2022-03-10 01:32:22','yy-MM-dd');
+-------------+
|     _c0     |
+-------------+
| 1646870400  |
+-------------+
1 row selected (0.113 seconds)
0: jdbc:hive2://192.168.150.150:10000> select unix_timestamp('2022-03-10 01:32','yy-MM-dd HH:MM');
+-------------+
|     _c0     |
+-------------+
| 1723251600  |
+-------------+
1 row selected (0.078 seconds)

to_date()

Operator	OperandTypes	Description
String(Hive_version:Pre 2.1.0) \| Date(Hive_version:On 2.1.0)	to_date(String date)	返回时间字符串的日期部分

0: jdbc:hive2://192.168.150.150:10000> select to_date ('2022-03-10 01:32:22');
+-------------+
|     _c0     |
+-------------+
| 2022-03-10  |
+-------------+
1 row selected (0.479 seconds)
0: jdbc:hive2://192.168.150.150:10000> select to_date ('2022-03-10 01:32');
+-------------+
|     _c0     |
+-------------+
| 2022-03-10  |
+-------------+
1 row selected (0.476 seconds)

year()

Operator	OperandTypes	Description
Int	year(String date)	返回时间字符串的年份部分

0: jdbc:hive2://192.168.150.150:10000> select year('2022-03-10 01:32:22');
+-------+
|  _c0  |
+-------+
| 2022  |
+-------+
1 row selected (0.485 seconds)

quarter()

Operator	OperandTypes	Description
Int	quarter(String date)	返回时间字符串的季度部分

0: jdbc:hive2://192.168.150.150:10000> select quarter('2022-03-10 01:32:22');
+------+
| _c0  |
+------+
| 1    |
+------+
1 row selected (0.106 seconds)
0: jdbc:hive2://192.168.150.150:10000> select quarter('2022-06-10');
+------+
| _c0  |
+------+
| 2    |
+------+
1 row selected (0.076 seconds)

month()

Operator	OperandTypes	Description
Int	month(String date)	返回时间字符串的月份部分

0: jdbc:hive2://192.168.150.150:10000> select month('2022-03-10 01:32:22');
+------+
| _c0  |
+------+
| 3    |
+------+
1 row selected (0.077 seconds)
0: jdbc:hive2://192.168.150.150:10000> select month('2022-06-10');
+------+
| _c0  |
+------+
| 6    |
+------+
1 row selected (0.482 seconds)

day() / dayofmonth()/dayofweek()

Operator	OperandTypes	Description
Int	day(String date)	返回时间字符串的日部分
Int	dayofweek(Date date \|String date)	返回时间字符串的当周的第几天【因为返回的天数为是外国的时间习惯，故我们会自定义函数重写该函数，改成符合 China 的时间习惯】
Int	dayofmonth(String date)	返回时间字符串的在当月的第几天

-----------------------------------------day()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select day('2022-03-10 01:32:22');
+------+
| _c0  |
+------+
| 10   |
+------+
1 row selected (0.089 seconds)
-----------------------------------------dayofweek()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select dayofweek(current_date()), current_date();
+------+-------------+
| _c0  |     _c1     |
+------+-------------+
| 7    | 2021-06-19  |
+------+-------------+
1 row selected (0.081 seconds)
-- 2021-06-19 为周六！(故需要重写...)
-----------------------------------------dayofmonth()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select dayofmonth('2022-06-12');
+------+
| _c0  |
+------+
| 12   |
+------+
1 row selected (0.104 seconds)

hour()

Operator	OperandTypes	Description
Int	hour(String date)	返回时间字符串的小时部分

0: jdbc:hive2://192.168.150.150:10000> select hour('2022-03-10 01:32:22');
+------+
| _c0  |
+------+
| 1    |
+------+
1 row selected (0.499 seconds)
0: jdbc:hive2://192.168.150.150:10000> select hour('2022-06-12');
+------+
| _c0  |
+------+
| 0    |
+------+
1 row selected (0.484 seconds)

minute()

Operator	OperandTypes	Description
Int	minute(String date)	返回时间字符串的分钟部分

0: jdbc:hive2://192.168.150.150:10000> select minute('2022-03-10 01:32:22');
+------+
| _c0  |
+------+
| 32   |
+------+
1 row selected (0.08 seconds)
0: jdbc:hive2://192.168.150.150:10000> select minute('2022-06-12');
+------+
| _c0  |
+------+
| 0    |
+------+
1 row selected (0.492 seconds)

second()

Operator	OperandTypes	Description
Int	second(String date)	返回时间字符串的秒部分

0: jdbc:hive2://192.168.150.150:10000> select second('2022-03-10 01:32:22');
+------+
| _c0  |
+------+
| 22   |
+------+
1 row selected (0.499 seconds)
0: jdbc:hive2://192.168.150.150:10000> select second('2022-06-12');
+------+
| _c0  |
+------+
| 0    |
+------+
1 row selected (0.074 seconds)

weekofyear()

Operator	OperandTypes	Description
Int	weekofyear(String date)	返回时间字符串的周号【该年中的第几周】

0: jdbc:hive2://192.168.150.150:10000> select weekofyear('2021-01-02');
+------+
| _c0  |
+------+
| 53   |
+------+
1 row selected (0.093 seconds)
-- 时间明明是年初啊，为什么周数这么多呢？
-- 因为这个时间的周数还未到周末，还是属于去年的周数，故该时间的周数计算的是去年的周数。
0: jdbc:hive2://192.168.150.150:10000> select weekofyear('2021-01-10');
+------+
| _c0  |
+------+
| 1    |
+------+
1 row selected (0.063 seconds)
0: jdbc:hive2://192.168.150.150:10000> select weekofyear('2022-01-10 01:32:22');
+------+
| _c0  |
+------+
| 2    |
+------+
1 row selected (0.075 seconds)
0: jdbc:hive2://192.168.150.150:10000> select weekofyear('2022-01-10');
+------+
| _c0  |
+------+
| 2    |
+------+
1 row selected (0.49 seconds)

interval()

Operator	OperandTypes	Description
String	+\|- interval String num field	该函数让在时间的加减上更为简单；【field：year、month、day、 hour、 minute、 second ；field 的顺序需要按照时间的格式依次从 year 到second】
String	+\|- interval String nums field to field	在 interval 函数的进行多个字段同时做同样的运算（当然执行不同的运算需要依次调用interval函数进行运算）【field to field ：field只能为相邻的两个字段，不能越级且大单位在前【目前测只能year to month 可用；多个字段连续运算，目前测得只有day to second 可用】;年月日的分隔符为’-‘,时分秒的分隔符为’:’，且时间的运算基于当前时间，不是预定义的时间字符串】

0: jdbc:hive2://192.168.150.150:10000> select current_date() - interval '2' year;
+-------------+
|     _c0     |
+-------------+
| 2019-06-19  |
+-------------+
1 row selected (0.515 seconds)
0: jdbc:hive2://192.168.150.150:10000> select current_date() - interval '2' month;
+-------------+
|     _c0     |
+-------------+
| 2021-04-19  |
+-------------+
1 row selected (0.476 seconds)
0: jdbc:hive2://192.168.150.150:10000> select current_date() - interval '2' year + interval '3' month;
+-------------+
|     _c0     |
+-------------+
| 2019-09-19  |
+-------------+
1 row selected (0.087 seconds)
0: jdbc:hive2://192.168.150.150:10000> select current_date() - interval '2-3' year to month;
+-------------+
|     _c0     |
+-------------+
| 2019-03-19  |
+-------------+
0: jdbc:hive2://192.168.150.150:10000> select current_timestamp - interval '3 12:20:30' day to second;
+--------------------------+
|           _c0            |
+--------------------------+
| 2021-06-15 22:28:38.168  |
+--------------------------+
1 row selected (0.075 seconds)
0: jdbc:hive2://192.168.150.150:10000> select current_timestamp - interval '3 12:20' day to minute;
Error: Error while compiling statement: FAILED: ParseException line 1:50 cannot recognize input near 'to' 'minute' '<EOF>' in selection target (state=42000,code=40000)
-- 该问题不知什么情况，希望了解的小伙伴评论一起解决吧...

extract()

Operator	OperandTypes	Description
Int	extract(field From source)	从源检索字段；【源：时间字符串；字段：day、 dayofweek、 hour、 minute、 month、 quarter、 second、 week 和 year】

0: jdbc:hive2://192.168.150.150:10000> select extract(year from '2022-01-10 01:32:22');
+-------+
|  _c0  |
+-------+
| 2022  |
+-------+
1 row selected (0.085 seconds)
0: jdbc:hive2://192.168.150.150:10000> select extract(quarter from '2022-01-10 01:32:22');
+------+
| _c0  |
+------+
| 1    |
+------+
1 row selected (0.112 seconds)
0: jdbc:hive2://192.168.150.150:10000> select extract(month from '2022-01-10 01:32:22');
+------+
| _c0  |
+------+
| 1    |
+------+
1 row selected (0.082 seconds)
0: jdbc:hive2://192.168.150.150:10000> select extract(week from '2022-01-10 01:32:22');
+------+
| _c0  |
+------+
| 2    |
+------+
1 row selected (0.104 seconds)
0: jdbc:hive2://192.168.150.150:10000> select extract(month from interval '1-3' year to month);
+------+
| _c0  |
+------+
| 3    |
+------+
1 row selected (0.064 seconds)
0: jdbc:hive2://192.168.150.150:10000> select extract(minute from interval '3 12:20:30' day to second) ;
+------+
| _c0  |
+------+
| 20   |
+------+
1 row selected (0.064 seconds)

datediff()

Operator	OperandTypes	Description
Int	datediff( String enddate, String startdate)	返回开始时间到结束日期的天数

0: jdbc:hive2://192.168.150.150:10000> select datediff(current_date(), '2008-08-08');
+-------+
|  _c0  |
+-------+
| 4698  |
+-------+
1 row selected (0.059 seconds)

date_add()

Operator	OperandTypes	Description
String(Hive_version:Pre 2.1.0) \| Date(Hive_version:On 2.1.0)	date_add(date/timestamp/string startdate, tinyint/smallint/int days)	返回向开始时期加上一个天数的日期

0: jdbc:hive2://192.168.150.150:10000> select date_add(current_date(), 10);
+-------------+
|     _c0     |
+-------------+
| 2021-06-29  |
+-------------+
1 row selected (0.085 seconds)
0: jdbc:hive2://192.168.150.150:10000> select date_add(current_date(), -10);
+-------------+
|     _c0     |
+-------------+
| 2021-06-09  |
+-------------+
1 row selected (0.072 seconds)

date_sub()

Operator	OperandTypes	Description
String(Hive_version:Pre 2.1.0) \| Date(Hive_version:On 2.1.0)	date_sub(date/timestamp/string startdate, tinyint/smallint/int days)	返回向开始时期减去一个天数的日期

0: jdbc:hive2://192.168.150.150:10000> select date_sub(current_date(), -10);
+-------------+
|     _c0     |
+-------------+
| 2021-06-29  |
+-------------+
1 row selected (0.469 seconds)
0: jdbc:hive2://192.168.150.150:10000> select date_sub(current_date(), 10);
+-------------+
|     _c0     |
+-------------+
| 2021-06-09  |
+-------------+
1 row selected (0.071 seconds)

from_utc_timestamp() / to_utc_timestamp()

Operator	OperandTypes	Description
Timestamp	from_utc_timestamp(Timestamp ts\| String date , String timezone)	返回将UTC格式的date，转换为指定时区的时间
Timestamp	to_utc_timestamp(Timestamp ts\| String date , String timezone)	返回将date按照指定时区的时间转换为UTC格式的时间

-----------------------------------------from_utc_timestamp()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select from_utc_timestamp('1970-01-01 00:1654:00','PRC');
+------------------------+
|          _c0           |
+------------------------+
| 1970-01-01 08:00:00.0  |
+------------------------+
1 row selected (0.066 seconds)
-- PRC为北京时间(没记错是东八区)
0: jdbc:hive2://192.168.150.150:10000> select from_utc_timestamp(0,'PRC');
+------------------------+
|          _c0           |
+------------------------+
| 1970-01-01 08:00:00.0  |
+------------------------+
1 row selected (0.073 seconds)
-----------------------------------------to_utc_timestamp()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select to_utc_timestamp(0,'PRC');
+------------------------+
|          _c0           |
+------------------------+
| 1969-12-31 16:00:00.0  |
+------------------------+
1 row selected (0.068 seconds)

current_date()

Operator	OperandTypes	Description
Date	current_date()	返回当前系统的时间

0: jdbc:hive2://192.168.150.150:10000> select current_date();
+-------------+
|     _c0     |
+-------------+
| 2021-06-19  |
+-------------+
1 row selected (0.104 seconds)

current_timestamp()

Operator	OperandTypes	Description
Timestamp	current_timestamp()	返回当前系统的时间戳【单位：毫秒】

0: jdbc:hive2://192.168.150.150:10000> select current_timestamp();
+--------------------------+
|           _c0            |
+--------------------------+
| 2021-06-19 11:24:40.936  |
+--------------------------+
1 row selected (0.07 seconds)

add_months()

Operator	OperandTypes	Description
String	add_months( String start_date, Int num_months,[,output_date_format])	返回给定时间增加的月份数，默认返回格式：’yyyy-MM-dd’,如果给定输出时间格式，则按指定格式返回时间格式

0: jdbc:hive2://192.168.150.150:10000> select add_months(current_date(),3);
+-------------+
|     _c0     |
+-------------+
| 2021-09-19  |
+-------------+
1 row selected (0.06 seconds)
0: jdbc:hive2://192.168.150.150:10000> select add_months(current_date(),3,'yyyy-MM-dd HH:MM:SS');
+----------------------+
|         _c0          |
+----------------------+
| 2021-09-19 00:09:00  |
+----------------------+
1 row selected (0.061 seconds)

last_day()

Operator	OperandTypes	Description
String	last_day(String date)	返回该日期所属月的最后一天

0: jdbc:hive2://192.168.150.150:10000> select last_day(current_date());
+-------------+
|     _c0     |
+-------------+
| 2021-06-30  |
+-------------+
1 row selected (0.063 seconds)

next_day()

Operator	OperandTypes	Description
String	next_day( String start_date, String day_of_week)	返回start_date 的最近的下一个day_of_week【day_of_week:mo(2);mon(3);monday(全称)】

0: jdbc:hive2://192.168.150.150:10000> select next_day(current_date(),'mo');
+-------------+
|     _c0     |
+-------------+
| 2021-06-21  |
+-------------+
1 row selected (0.066 seconds)
0: jdbc:hive2://192.168.150.150:10000> select next_day(current_date(),'mon');
+-------------+
|     _c0     |
+-------------+
| 2021-06-21  |
+-------------+
1 row selected (0.06 seconds)
0: jdbc:hive2://192.168.150.150:10000> select next_day(current_date(),'monday');
+-------------+
|     _c0     |
+-------------+
| 2021-06-21  |
+-------------+
1 row selected (0.061 seconds)

trunc()

Operator	OperandTypes	Description
String	trunc(String date, String format )	返回 date 对应 format 单位的第一天日期【supported formats:MONTH/MON/MM,YEAR/YYYY/YY】

0: jdbc:hive2://192.168.150.150:10000> select trunc(current_date(),'YYYY');
+-------------+
|     _c0     |
+-------------+
| 2021-01-01  |
+-------------+
1 row selected (0.46 seconds)
0: jdbc:hive2://192.168.150.150:10000> select trunc(current_date(),'MONTH');
+-------------+
|     _c0     |
+-------------+
| 2021-06-01  |
+-------------+
1 row selected (0.084 seconds)

months_between()

Operator	OperandTypes	Description
Double	months_between(String date1,String date2)	返回两个日期的天数【如果date1晚于date2，那么结果是正的。如果date1早于date2，则结果为负值。如果date1和date2是一个月的同一天或两个月的最后一天，那么结果总是一个整数。否则，UDF根据31天的月份计算结果的小数部分，并考虑date1和date2时间成分的差异。】

0: jdbc:hive2://192.168.150.150:10000> select months_between('2021-5-19','2021-7-18');
+--------------+
|     _c0      |
+--------------+
| -1.96774194  |
+--------------+
1 row selected (0.063 seconds)
0: jdbc:hive2://192.168.150.150:10000> select months_between('2021-5-30','2021-7-30');
+-------+
|  _c0  |
+-------+
| -2.0  |
+-------+
1 row selected (0.075 seconds)

date_format()

Operator	OperandTypes	Description
String	date_format(Date/String ts, String format)	返回将 ts 对应的 format 格式时间

0: jdbc:hive2://192.168.150.150:10000> select date_format(current_date(), 'yyyy-MM-dd');
+-------------+
|     _c0     |
+-------------+
| 2021-06-19  |
+-------------+
1 row selected (0.071 seconds)
0: jdbc:hive2://192.168.150.150:10000> select date_format(current_date(), 'yyyy-MM-dd HH:MM:SS');
+----------------------+
|         _c0          |
+----------------------+
| 2021-06-19 00:06:00  |
+----------------------+
1 row selected (0.479 seconds)

Conditional Functions：条件函数

if()

Operator	OperandTypes	Description
T	if(Boolean testCondition,T valuetrue, T valueFalseOrNull)	当 testCondition 为 true 时返回 valueTrue，否则返回 valueFalseOrNull

0: jdbc:hive2://192.168.150.150:10000> select if(1>2, 1111,2222);
+-------+
|  _c0  |
+-------+
| 2222  |
+-------+
1 row selected (0.139 seconds)
-- 与Java、Python等中的三元表达式相似

isnull() / isnotnull()

Operator	OperandTypes	Description
Boolean	isnull(a)	如果 a 为 NULL，则返回 true，否则返回 false
Boolean	isnotnull(a)	如果 a 不为 NULL，则返回 true，否则返回 false

-----------------------------------------isnull()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select isnull(111);
+--------+
|  _c0   |
+--------+
| false  |
+--------+
1 row selected (0.106 seconds)
-----------------------------------------isnotnull()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select isnotnull(null);
+--------+
|  _c0   |
+--------+
| false  |
+--------+
1 row selected (0.076 seconds)

nvl()

Operator	OperandTypes	Description
T	nvl(T value, T default_value)	如果 value 不为空，则返回value，否则返回default_value

0: jdbc:hive2://192.168.150.150:10000> select nvl(null,1111);
+-------+
|  _c0  |
+-------+
| 1111  |
+-------+
1 row selected (0.066 seconds)
0: jdbc:hive2://192.168.150.150:10000> select nvl(2222,1111);
+-------+
|  _c0  |
+-------+
| 2222  |
+-------+
1 row selected (0.07 seconds)

coalesce()

Operator	OperandTypes	Description
T	coalesce( T v1,T v2,…)	返回第一个非NULL的值，如果全是NULL，则返回NULL

0: jdbc:hive2://192.168.150.150:10000> select coalesce(null,1111);
+-------+
|  _c0  |
+-------+
| 1111  |
+-------+
1 row selected (0.059 seconds)
0: jdbc:hive2://192.168.150.150:10000> select coalesce(null,null);
+-------+
|  _c0  |
+-------+
| NULL  |
+-------+
1 row selected (0.071 seconds)

—–case()

Operator	OperandTypes	Description
T	case a when b then c [when d then e] * [else f] end	当 a = b 时，返回 c; 当 a = d 时，返回 e; else 返回 f
T	case when a then b [when c then d] * [else e] end	当 a = true 时，返回 b; 当 c = true 时，返回 d; else 返回 e

nullif()

Operator	OperandTypes	Description
T	nullif(a,b)	如果 a = b，则返回NULL，否则返回 a；【case语句：case when a = b then NULL else a】

0: jdbc:hive2://192.168.150.150:10000> select nullif(1,2);
+------+
| _c0  |
+------+
| 1    |
+------+
1 row selected (0.057 seconds)
0: jdbc:hive2://192.168.150.150:10000> select nullif(1,1);
+-------+
|  _c0  |
+-------+
| NULL  |
+-------+
1 row selected (0.068 seconds)

assert_true()

Operator	OperandTypes	Description
void	assert_true( Boolean condition)	如果 condition 不为 true，则抛出异常，否则返回 null

0: jdbc:hive2://192.168.150.150:10000> select assert_true(1>2);
Error: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: ASSERT_TRUE(): assertion failed. (state=,code=0)
0: jdbc:hive2://192.168.150.150:10000> select assert_true(1<2);
+-------+
|  _c0  |
+-------+
| NULL  |
+-------+
1 row selected (0.069 seconds)

String Functions:字符串函数

在 Hive 中支持一下内置的字符串函数：

ascii()

Operator	OperandTypes	Description
Int	ascii( String str)	返回 str 的第一个字符的阿斯克码

0: jdbc:hive2://192.168.150.150:10000> select ascii('LOVE');
+------+
| _c0  |
+------+
| 76   |
+------+
1 row selected (0.149 seconds)
0: jdbc:hive2://192.168.150.150:10000> select ascii('L');
+------+
| _c0  |
+------+
| 76   |
+------+
1 row selected (0.066 seconds)

base64() / unbase64()

Operator	OperandTypes	Description
String	base64( Binary bin)	返回将参数从二进制转换为base64后的字符串
Binary	unbase64(String str)	返回将字符串 str 转换为base64后的Binary

-----------------------------------------base64()---------------------------------------: jdbc:hive2://192.168.150.150:10000> select base64(binary('LOVE'));
+-----------+
|    _c0    |
+-----------+
| TE9WRQ==  |
+-----------+
1 row selected (0.487 seconds)
-----------------------------------------unbase64()---------------------------------------0: jdbc:hive2://192.168.150.150:10000> select unbase64('TE9WRQ==');
+-------+
|  _c0  |
+-------+
| LOVE  |
+-------+
1 row selected (0.468 seconds)

character_length()

Operator	OperandTypes	Description
Int	character_length(string str)	返回str中包含的 UTF-8 的字符数

0: jdbc:hive2://192.168.150.150:10000> select character_length('LOVE');
+------+
| _c0  |
+------+
| 4    |
+------+
1 row selected (0.068 seconds)

chr()

Operator	OperandTypes	Description
String	char( BigInt \| Double a)	返回与 a 等价的二进制字符的ASCII字符

0: jdbc:hive2://192.168.150.150:10000> select ascii('L');
+------+
| _c0  |
+------+
| 76   |
+------+
1 row selected (0.065 seconds)
0: jdbc:hive2://192.168.150.150:10000> select chr(76);
+------+
| _c0  |
+------+
| L    |
+------+
1 row selected (0.063 seconds)

concat()

Operator	OperandTypes	Description
String	concat(String\|Binary A, String\|Binary B…)	按顺序返回串联传入的字符串或字节所产生的字符串或字节

0: jdbc:hive2://192.168.150.150:10000> select concat('LOVE','YOU'),concat('foot','ball');
+----------+-----------+
|   _c0    |    _c1    |
+----------+-----------+
| LOVEYOU  | football  |
+----------+-----------+
1 row selected (0.467 seconds)

concat_ws()

Operator	OperandTypes	Description
String	concat_ws(String sep, String a, String b,…)	类似于上面的concat(), 但是有定制的 sep 分隔符
String	concat_ws(String sep, Array)	与上面的 concat _ ws ()类似,用定制的sep分割符将array元素连接

0: jdbc:hive2://192.168.150.150:10000> select concat_ws('_','LOVE','YOU');
+-----------+
|    _c0    |
+-----------+
| LOVE_YOU  |
+-----------+
1 row selected (0.064 seconds)
0: jdbc:hive2://192.168.150.150:10000> select concat_ws('_',Array('foot','ball','basket','ball'));
+------------------------+
|          _c0           |
+------------------------+
| foot_ball_basket_ball  |
+------------------------+
1 row selected (0.076 seconds)

decode()/encode()

Operator	OperandTypes	Description
String	decode(Binary bin, String charset)	提供charset 将 bin 解码为字符串【charset：”US-ASCII”、”ISO-8859-1”、”UTF-8”、”UTF-16BE”、”UTF-16LE”、”UTF-16”】
String	encode(String src, String charset)	提供charset 将 src解码为字符串【charset：”US-ASCII”、”ISO-8859-1”、”UTF-8”、”UTF-16BE”、”UTF-16LE”、”UTF-16”】

-----------------------------------------decode()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select decode(binary('LOVE'),'UTF-16LE'),decode(binary('ME'),'UTF-16LE');
+------+------+
| _c0  | _c1  |
+------+------+
| 佌䕖   | 䕍    |
+------+------+
1 row selected (0.46 seconds)
-----------------------------------------encode()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000>  select encode('佌䕖','UTF-16LE'),encode('䕍','UTF-16LE');
+-------+------+
|  _c0  | _c1  |
+-------+------+
| LOVE  | ME   |
+-------+------+
1 row selected (0.085 seconds)

elt()

Operator	OperandTypes	Description
String	elt(N Int,String str1, Stringstr2 ,String str3,…)	返回索引号处的字符串【如果 n 小于1或大于参数数目，则返回 NULL。】

0: jdbc:hive2://192.168.150.150:10000> select elt(2, 'I','LOVE','YOU');
+-------+
|  _c0  |
+-------+
| LOVE  |
+-------+
1 row selected (0.597 seconds)
0: jdbc:hive2://192.168.150.150:10000> select elt(4, 'I','LOVE','YOU');
+-------+
|  _c0  |
+-------+
| NULL  |
+-------+
1 row selected (0.096 seconds)

field()

Operator	OperandTypes	Description
Int	field(val T,val1 T,val2 T,val3 T,…)	返回 val1、 val2、 val3、 … 列表中 val首次出现的索引，如果没有找到则返回0【支持所有基本类型，使用 str.equals (x)比较参数。如果 val 为 NULL，则返回值为0】

0: jdbc:hive2://192.168.150.150:10000> select field( 'Love','I','LOVE','YOU','ME','TOO'),field(1,1,1,1);
+------+------+
| _c0  | _c1  |
+------+------+
| 0    | 1    |
+------+------+
1 row selected (0.106 seconds)

find_in_set()

Operator	OperandTypes	Description
Int	find_in_set(String str, String strList)	返回 strList 中 str 首次出现的下标，其中 strList 是逗号分隔的字符串【如果任一参数为 null，则返回 null；如果第一个参数包含逗号，则返回0】

0: jdbc:hive2://192.168.150.150:10000> select find_in_set( 'LOVE','I,LOVE,YOU,ME,TOO'),find_in_set('NULL','say,hello,world');
+------+------+
| _c0  | _c1  |
+------+------+
| 2    | 0    |
+------+------+
1 row selected (0.09 seconds)

format_number()

Operator	OperandTypes	Description
String	format_number(Number x, Int d)	将数字 x 格式化为’# ,#### ,#### .##’，四舍五入到小数点后的 d 位，并以字符串形式返回结果【如果 d 是0，则结果没有小数点或小数部分】

0: jdbc:hive2://192.168.150.150:10000> select format_number(123456789.12345,3),format_number(123456789.12345,0);
+------------------+--------------+
|       _c0        |     _c1      |
+------------------+--------------+
| 123,456,789.123  | 123,456,789  |
+------------------+--------------+
1 row selected (0.103 seconds)

get_json_object()

Operator	OperandTypes	Description
String	get_json_object(String json_string, String path)	基于指定的 json 路径从 json 字符串中提取 json 对象，并返回提取的 json 对象的 json 字符串【一次解析一个，但可以多层,如果输入 json 字符串无效，则返回 null。注意: json 路径只能有字符[0-9\|a-z] ，即没有大写字母或特殊字符。此外，键 * 不能以数字开头。* 这是由于对 Hive 列名的限制。】

0: jdbc:hive2://192.168.150.150:10000> select get_json_object('{"sentence":"I LOVE YOU","age":18,"date":"2016-12-10"}','$.date');
+-------------+
|     _c0     |
+-------------+
| 2016-12-10  |
+-------------+
1 row selected (0.548 seconds)
0: jdbc:hive2://192.168.150.150:10000> select get_json_object('{"sentence":"I LOVE YOU","age":18,"date":"2016-12-10"}','$.name');
+-------+
|  _c0  |
+-------+
| NULL  |
+-------+
1 row selected (0.156 seconds)
0: jdbc:hive2://192.168.150.150:10000> select get_json_object('{"sentence":"I LOVE YOU","age":18,"date":"2016-12-10","person":{"person1":"demo"}}','$.person.person1');
+-------+
|  _c0  |
+-------+
| demo  |
+-------+
1 row selected (0.149 seconds)

json_tuple()

Operator	OperandTypes	Description
String1,…,Stringn	json_tuple(String jsonStr,String k1,…,String kn)	基于指定的 json 路径从 json 字符串中提取 json 对象，并返回提取的 json 对象的 json 字符串【一次解析多个，但只能一层,如果输入 json 字符串无效，则返回 null。注意: json 路径只能有字符[0-9\|a-z] ，即没有大写字母或特殊字符。此外，键 * 不能以数字开头。* 这是由于对 Hive 列名的限制。】

0: jdbc:hive2://192.168.150.150:10000> select json_tuple('{"sentence":"I LOVE YOU","age":18,"date":"2016-12-10"}','date','sentence','name');
+-------------+-------------+-------+
|     c0      |     c1      |  c2   |
+-------------+-------------+-------+
| 2016-12-10  | I LOVE YOU  | NULL  |
+-------------+-------------+-------+

in_file()

Operator	OperandTypes	Description
Boolean	in_file(String str, String filename)	如果字符串 str 在文件名中显示为整行，则返回 true。

0: jdbc:hive2://192.168.150.150:10000> select in_file('aaa', '/scripts/student.log'); 
+-------+
|  _c0  |
+-------+
| true  |
+-------+
1 row selected (0.152 seconds)
0: jdbc:hive2://192.168.150.150:10000> select in_file('bbb', '/scripts/student.log'); 
+--------+
|  _c0   |
+--------+
| false  |
+--------+
1 row selected (0.142 seconds)

instr()

Operator	OperandTypes	Description
Int	instr(String str, String substr)	返回 str 中 substr 第一次出现的位置【如果两个参数中有一个为 null，返回 null; 如果在 str 中找不到 substr，返回0。请注意，这不是从零开始的。Str 中的第一个字符的索引为1】

0: jdbc:hive2://192.168.150.150:10000> select instr('I,LOVE,YOU,ME,TOO','LOVE'),instr('I,LOVE,YOU,ME,TOO','null');
+------+------+
| _c0  | _c1  |
+------+------+
| 3    | 0    |
+------+------+
1 row selected (0.168 seconds)

length()

Operator	OperandTypes	Description
Int	length(String A)	返回字符串的长度【空格也算一个字符】

0: jdbc:hive2://192.168.150.150:10000> select length('I LOVE YOU'),length('LOVE');
+------+------+
| _c0  | _c1  |
+------+------+
| 10   | 4    |
+------+------+
1 row selected (0.136 seconds)

locate()

Operator	OperandTypes	Description
Int	locate(String substr, String str[, Int pos])	返回位置 pos 后 str 中第一个出现substr 的位置【找不到substr则返回0】

0: jdbc:hive2://192.168.150.150:10000> select locate('LOVE','I LOVE YOU'),locate('L','I LOVE YOU',5),locate('L','I LOVE YOU L',5);
+------+------+------+
| _c0  | _c1  | _c2  |
+------+------+------+
| 3    | 0    | 12   |
+------+------+------+
1 row selected (0.565 seconds)

lower() / lcase()

Operator	OperandTypes	Description
String	lower / lcase(String A)	返回将 A 的所有字符转换为小写所得到的字符串

0: jdbc:hive2://192.168.150.150:10000> select lower('LOVE'),lcase('LOVE');
+-------+-------+
|  _c0  |  _c1  |
+-------+-------+
| love  | love  |
+-------+-------+
1 row selected (0.129 seconds)

lpad()

Operator	OperandTypes	Description
String	lpad(String str, Int len, String pad)	返回在 str左边填充了一个长度为 len 的填充区域【如果 str 长于 len，则返回值缩短为 len 字符。对于空填充字符串，返回值为空】

0: jdbc:hive2://192.168.150.150:10000> select lpad('LOVE',2,'I'), lpad('LOVE',6,'I'), lpad('',0,'I');
+------+---------+------+
| _c0  |   _c1   | _c2  |
+------+---------+------+
| LO   | IILOVE  |      |
+------+---------+------+
1 row selected (0.129 seconds)

ltrim() / rtrim()

Operator	OperandTypes	Description
String	ltrim(String A)	返回从 A 的开头(左侧)去空得到的字符串
String	rtrim(string A)	返回从 A 的尾部(右侧)去空得到的字符串

-----------------------------------------ltrim()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select ltrim('  LOVE'),ltrim('  I LOVE');
+-------+---------+
|  _c0  |   _c1   |
+-------+---------+
| LOVE  | I LOVE  |
+-------+---------+
1 row selected (0.536 seconds)
-----------------------------------------rtrim()---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select rtrim('LOVE  '),rtrim('I LOVE  ');
+-------+---------+
|  _c0  |   _c1   |
+-------+---------+
| LOVE  | I LOVE  |
+-------+---------+
1 row selected (0.508 seconds)

octet_length()

Operator	OperandTypes	Description
Int	octet_length(string str)	返回在 UTF-8编码中保存字符串 str 所需的八位字节数【中文的UTF-8的字节数为一个字三个字节】

0: jdbc:hive2://192.168.150.150:10000> select octet_length('LOVE'),octet_length('我爱你');
+------+------+
| _c0  | _c1  |
+------+------+
| 4    | 9    |
+------+------+
1 row selected (0.155 seconds)

—–parse_url()

Operator	OperandTypes	Description
String	parse_url(String urlString, String partToExtract [, String keyToExtract])	从 URL 返回指定的部分，一次只能提取一个part【partToExtract 的有效值包括 HOST:域名,PROTOCOL:协议，PORT：端口，REF：引用来源，PATH：路径，QUERY：解析—可指定QUERY解析的part、 AUTHORITY:、 FILE :和 USERINFO:】

-----------------------------------------HOST---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select parse_url('https://yangyangmm.cn/2021/01/14/Python函数/?k1=v1&k2=v2','HOST'); 
+----------------+
|      _c0       |
+----------------+
| yangyangmm.cn  |
+----------------+
1 row selected (0.131 seconds)

-----------------------------------------PATH---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select parse_url('https://yangyangmm.cn/2021/01/06/Python函数/?k1=v1&k2=v2','PATH'); 
+------------------------+
|          _c0           |
+------------------------+
| /2021/01/06/Python函数/  |
+------------------------+
1 row selected (0.167 seconds)

-----------------------------------------QUERY---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select parse_url('https://yangyangmm.cn/2021/01/06/Python函数/?k1=v1&k2=v2','QUERY;) 
+--------------+
|     _c0      |
+--------------+
| k1=v1&k2=v2  |
+--------------+
1 row selected (0.52 seconds)

0: jdbc:hive2://192.168.150.150:10000> select parse_url('https://yangyangmm.cn/2021/01/06/Python函数/?k1=v1&k2=v2','QUERY'k1');
+------+
| _c0  |
+------+
| v1   |
+------+
1 row selected (0.57 seconds)

-----------------------------------------REF---------------------------------------
select parse_url('https://yangyangmm.cn/2021/01/06/Python函数/?k1=v1&k2=v2','REF');
-----------------------------------------PROTOCOL---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select parse_url('https://yangyangmm.cn/2021/01/06/Python函数/?k1=v1&k2=v2','PROTOL');
+--------+
|  _c0   |
+--------+
| https  |
+--------+
1 row selected (0.123 seconds)

-----------------------------------------AUTHORITY---------------------------------------
0: jdbc:hive2://192.168.150.150:10000> select parse_url('https://yangyangmm.cn/2021/01/06/Python函数/?k1=v1&k2=v2','AUTHOTY');
+----------------+
|      _c0       |
+----------------+
| yangyangmm.cn  |
+----------------+
1 row selected (0.142 seconds)

-----------------------------------------FILE---------------------------------------
select parse_url('https://yangyangmm.cn/2021/01/06/Python函数/?k1=v1&k2=v2','FILE ');
-----------------------------------------USERINFO---------------------------------------
select parse_url('https://yangyangmm.cn/2021/01/06/Python函数/?k1=v1&k2=v2','USERINFO');

parse_url_tuple()

Operator	OperandTypes	Description
String1,…,StirngN	parse_url_tuple(String urlStr,String p1,…,String pn)	从 URL 返回指定的部分，一次能提取多个part【pn 的有效值包括 HOST:域名,PROTOCOL:协议，PORT：端口，REF：引用来源，PATH：路径，QUERY：解析—可指定QUERY解析的part、 AUTHORITY:、 FILE :和 USERINFO:】

0: jdbc:hive2://192.168.150.150:10000> select parse_url_tuple('https://yangyangmm.cn/2021/01/14/Python函数/?k1=v1&k2=','HOST','PATH','QUERY:k1');
+----------------+------------------------+-----+
|       c0       |           c1           | c2  |
+----------------+------------------------+-----+
| yangyangmm.cn  | /2021/01/14/Python函数/  | v1  |
+----------------+------------------------+-----+
1 row selected (0.119 seconds)

printf()

Operator	OperandTypes	Description
String	printf(String format, Obj… args)	返回将参数 args 按照 format 格式化

0: jdbc:hive2://192.168.150.150:10000> select printf('name:%s,age:%d,salary:%.2f','LOVE',18,34234.4534);
+-----------------------------------+
|                _c0                |
+-----------------------------------+
| name:LOVE,age:18,salary:34234.45  |
+-----------------------------------+

quote()

Operator	OperandTypes	Description
String	quote(String text)	返回带引号的字符串【该函数在hive4.0版本中应用】

regexp_extract()

Operator	OperandTypes	Description
String	regexp_extract(String subject, String pattern,Int index)	将字符串subject按照pattern正则表达式的规则拆分，返回index指定的字符。【^ 表示开头；$ 表示结尾；. 表示任意字符；* 表示任意多个】

0: jdbc:hive2://192.168.150.150:10000> select regexp_extract('3,300,1,27.01,"Soup - Campbells, Minestrone",2018/7/17,11:53','(.*?),(.*?),(.*?),(.*?),"?(.*?)"?,(\\d{4}/\\d{1,2}/\\d{1,2}),(.*?)',4) price;
+--------+
| price  |
+--------+
| 27.01  |
+--------+
1 row selected (0.116 seconds)
0: jdbc:hive2://192.168.150.150:10000> 
0: jdbc:hive2://192.168.150.150:10000> select regexp_extract('3,300,1,27.01,"Soup - Campbells, Minestrone",2018/7/17,11:53','(.*?),(.*?),(.*?),(.*?),"?(.*?)"?,(\\d{4}/\\d{1,2}/\\d{1,2}),(.*?)',5) product;
+-------------------------------+
|            product            |
+-------------------------------+
| Soup - Campbells, Minestrone  |
+-------------------------------+
1 row selected (0.116 seconds)
0: jdbc:hive2://192.168.150.150:10000> 
0: jdbc:hive2://192.168.150.150:10000> select regexp_extract('3,300,1,27.01,"Soup - Campbells, Minestrone",2018/7/17,11:53','(.*?),(.*?),(.*?),(.*?),"?(.*?)"?,(\\d{4}/\\d{1,2}/\\d{1,2}),(.*?)',6) tran_date;
+------------+
| tran_date  |
+------------+
| 2018/7/17  |
+------------+
1 row selected (0.108 seconds)

regexp_replace()

Operator	OperandTypes	Description
String	regexp_replace(String initial_string, String pattern, String replacement)	将字符串 initial_string 中的符合正则表达式 pattern 的部分替换为 replacement

0: jdbc:hive2://192.168.150.150:10000> select regexp_replace('L234VE', '\\d+', 'O'),regexp_replace('123abc456def789','[a-z]+',' ');
+-------+--------------+
|  _c0  |     _c1      |
+-------+--------------+
| LOVE  | 123 456 789  |
+-------+--------------+
1 row selected (0.125 seconds)

replace(string A, string OLD, string NEW)

Operator	OperandTypes	Description
String	replace(String A, String OLD, String NEW)	返回将 A 中的 OLD 替换为 NEW

0: jdbc:hive2://192.168.150.150:10000> select replace('L234VE', '234', 'O'),replace('123abc456abc789','abc',' ');
+-------+--------------+
|  _c0  |     _c1      |
+-------+--------------+
| LOVE  | 123 456 789  |
+-------+--------------+
1 row selected (0.109 seconds)

repeat()

Operator	OperandTypes	Description
String	repeat(String str, Int n)	将 str 重复 n 次

0: jdbc:hive2://192.168.150.150:10000> select repeat('LOVE ', 4);
+-----------------------+
|          _c0          |
+-----------------------+
| LOVE LOVE LOVE LOVE   |
+-----------------------+
1 row selected (0.128 seconds)

reverse()

Operator	OperandTypes	Description
String	reverse(string A)	返回 A 的反向字符串

0: jdbc:hive2://192.168.150.150:10000> select reverse('LOVE'),reverse('EVOL');
+-------+-------+
|  _c0  |  _c1  |
+-------+-------+
| EVOL  | LOVE  |
+-------+-------+
1 row selected (0.5 seconds)

rpad(string str, int len, string pad)

Operator	OperandTypes	Description
String	rpad(String str, Int len, String pad)	返回在 str 的右边填充了一个长度为 len 的 pad【如果 str 长于 len，则返回值缩短为 len 字符。对于空填充字符串，返回值为空】

0: jdbc:hive2://192.168.150.150:10000> select rpad('LOVE', 3, ' '),rpad('LOVE', 10, ' YOU');
+------+-------------+
| _c0  |     _c1     |
+------+-------------+
| LOV  | LOVE YOU Y  |
+------+-------------+
1 row selected (0.105 seconds)

sentences()

Operator	OperandTypes	Description
array<array>	sentences(String str, [String lang, String locale])	返回参数 str 分词后的单词二维数组【将str的标点符号和空格作为第1、2个维度为分割符,lang 和 locale 为可选参数】

0: jdbc:hive2://192.168.150.150:10000> select sentences('hello word!Hello Hive,Hello,I LOVE YOU');
+----------------------------------------------------+
|                        _c0                         |
+----------------------------------------------------+
| [["hello","word"],["Hello","Hive","Hello","I","LOVE","YOU"]] |
+----------------------------------------------------+
1 row selected (0.133 seconds)

——ngrams()

Operator	OperandTypes	Description
array<struct<string,double>>	ngrams(array<array>, int N, int K, int pf)	与 sentences()函数一起使用,分词后,返回arr中连续的n个单词的词频统计结果【针对arr中连续n个单词做词频统计并倒序排列，将topk个统计结果返回】

1	`select ngrams(sentences('hello word!Hello Hive,Hello,I LOVE YOU'),1,1);`

—–context_ngrams()

Operator	OperandTypes	Description
Array<struct<String,Double>>	context_ngrams(Array<Array> arr, Array cnt, Int K, Int pf)	返回arr中连续的size(cnt)个单词组合以cnt中非null的内容匹配统计，按数量到处排列，将K个统计结果返回返回arr中连续的size(cnt)个单词组合以cnt中非null内容匹配统计，按数量倒序排列，将topk个统计结果返回

space()

Operator	OperandTypes	Description
String	space( Int n )	返回 n 个空格的字符串

0: jdbc:hive2://192.168.150.150:10000> select space(0), space(4),length(space(0)), length(space(4));
+------+-------+------+------+
| _c0  |  _c1  | _c2  | _c3  |
+------+-------+------+------+
|      |       | 0    | 4    |
+------+-------+------+------+
1 row selected (0.165 seconds)

split()

Operator	OperandTypes	Description
Array	split(String str, String pat\|parten)	返回按 pat【正则也可】内容分割的 str 返回的数组

0: jdbc:hive2://192.168.150.150:10000> select split('ILOVEYOU','LOVE');
+--------------+
|     _c0      |
+--------------+
| ["I","YOU"]  |
+--------------+
1 row selected (0.178 seconds)
0: jdbc:hive2://192.168.150.150:10000> select split('1234556sdfasd1231234','[a-z]+');
+------------------------+
|          _c0           |
+------------------------+
| ["1234556","1231234"]  |
+------------------------+
1 row selected (0.561 seconds)

str_to_map()

Operator	OperandTypes	Description
Map<String，String>	str_to_map(Text[, Delimiter1, Delimiter2])	返回使用两个分隔符将文本拆分的键值对【Delimiter1 将 Text 分隔为 K-V 对，Delimiter2 将每个 K-V 对分隔。默认分隔符是’ , ‘表示分隔符1，’ : ‘表示分隔符2】

0: jdbc:hive2://192.168.150.150:10000> select str_to_map('name:洋群满满,age:18');
+-----------------------------+
|             _c0             |
+-----------------------------+
| {"name":"洋群满满","age":"18"}  |
+-----------------------------+
1 row selected (0.252 seconds)

substr/substring()

Operator	OperandTypes	Description
String	substr/substring(string\|binary A, int start)	返回字符串 A 的字节数组的子字符串或片段，从开始位置开始直到字符串 A 的结束,
String	substr/substring(string\|binary A, int start, int len)	返回 a 的字节数组的子字符串或片段，从起始位置开始，长度为 len。

0: jdbc:hive2://192.168.150.150:10000> select substr('洋群满满',2);
+------+
| _c0  |
+------+
| 群满满  |
+------+
1 row selected (0.133 seconds)
0: jdbc:hive2://192.168.150.150:10000> select substring('洋群满满',2);
+------+
| _c0  |
+------+
| 群满满  |
+------+
1 row selected (0.498 seconds)
0: jdbc:hive2://192.168.150.150:10000> select substr('洋群满满',2,1);
+------+
| _c0  |
+------+
| 群    |
+------+
1 row selected (0.101 seconds)
0: jdbc:hive2://192.168.150.150:10000> select substring('洋群满满',2,1);
+------+
| _c0  |
+------+
| 群    |
+------+
1 row selected (0.122 seconds)

substring_index()

Operator	OperandTypes	Description
String	substring_index(string A, string delim, int count)	返回字符串 A 中分隔符 delim 在计数count之前出现的子字符串【count越界则返回A】

1 row selected (0.494 seconds)
0: jdbc:hive2://192.168.150.150:10000> select substring_index('洋群,满,满,',',',3);
+---------+
|   _c0   |
+---------+
| 洋群,满,满  |
+---------+
1 row selected (0.095 seconds)
0: jdbc:hive2://192.168.150.150:10000> select substring_index('洋群,满,满,',',',2);
+-------+
|  _c0  |
+-------+
| 洋群,满  |
+-------+
1 row selected (0.468 seconds)
0: jdbc:hive2://192.168.150.150:10000> select substring_index('洋群,满,满',',',50);
+---------+
|   _c0   |
+---------+
| 洋群,满,满  |
+---------+
1 row selected (0.081 seconds)

translate()

Operator	OperandTypes	Description
String	translate(string\|char\|varchar Input, string\|char\|varchar From, string\|char\|varchar To)	将 From 字符串中的字符替换为 To 字符串中的相应字符，从而转换输入字符串。这类似于 PostgreSQL 中的 translate 函数。如果该 UDF 的任何参数都是 NULL，那么结果也是 NULL。

0: jdbc:hive2://192.168.150.150:10000> select translate('洋群满满','洋','羊');
+-------+
|  _c0  |
+-------+
| 羊群满满  |
+-------+
1 row selected (0.092 seconds)
# 一个字符的From,只会替换一个字符的To
0: jdbc:hive2://192.168.150.150:10000> select translate('洋群满满','洋','yang');
+-------+
|  _c0  |
+-------+
| y群满满  |
+-------+
1 row selected (0.09 seconds)
0: jdbc:hive2://192.168.150.150:10000> select translate('洋群满满','洋群','yang');
+-------+
|  _c0  |
+-------+
| ya满满  |
+-------+
1 row selected (0.491 seconds)

trim()

Operator	OperandTypes	Description
String	trim(string A)	去除A两端的空格

0: jdbc:hive2://192.168.150.150:10000> select length(' 洋群满满 ');
+------+
| _c0  |
+------+
| 6    |
+------+
1 row selected (0.087 seconds)
0: jdbc:hive2://192.168.150.150:10000> select length(trim(' 洋群满满 '));
+------+
| _c0  |
+------+
| 4    |
+------+
1 row selected (0.096 seconds)

upper()/ucase()

Operator	OperandTypes	Description
String	upper(String A)	返回将 A 的所有字符转换为大写形式所得到的字符串
String	ucase(String A)	返回将 A 的所有字符转换为大写形式所得到的字符串

0: jdbc:hive2://192.168.150.150:10000> select upper('yqmm');
+-------+
|  _c0  |
+-------+
| YQMM  |
+-------+
1 row selected (0.485 seconds)
0: jdbc:hive2://192.168.150.150:10000> select ucase('yqmm');
+-------+
|  _c0  |
+-------+
| YQMM  |
+-------+
1 row selected (0.073 seconds)

initcap()

Operator	OperandTypes	Description
String	Initcap(string A)	返回字符串，每个单词的第一个大写字母为大写，其他所有字母为小写。【单词由空格分隔】

0: jdbc:hive2://192.168.150.150:10000> select initcap('yqmm');
+-------+
|  _c0  |
+-------+
| Yqmm  |
+-------+
1 row selected (0.096 seconds)
0: jdbc:hive2://192.168.150.150:10000> select initcap('yqmm i love you');
+------------------+
|       _c0        |
+------------------+
| Yqmm I Love You  |
+------------------+
1 row selected (0.065 seconds)

levenshtein()

Operator	OperandTypes	Description
Int	levenshtein(string A, string B)	返回两个字符串之间的莱文斯坦距离【Levenshtein 距离，又称编辑距离，指的是两个字符串之间，由一个转换成另一个所需的最少编辑操作次数。许可的编辑操作包括将一个字符替换成另一个字符，插入一个字符，删除一个字符。编辑距离的算法是首先由俄国科学家Levenshtein提出的，故又叫Levenshtein Distance。】【用途：模糊查询】

0: jdbc:hive2://192.168.150.150:10000> select levenshtein('yq','mm');
+------+
| _c0  |
+------+
| 2    |
+------+
1 row selected (0.08 seconds)
0: jdbc:hive2://192.168.150.150:10000> select levenshtein('y','qmm');
+------+
| _c0  |
+------+
| 3    |
+------+
1 row selected (0.11 seconds)

soundex()

Operator	OperandTypes	Description
String	soundex(string A)	返回字符串的 soundex 代码【soundex是一个将任何文本串转换为描述其语音表示的字母数字模式的算法。soundex考虑了类似的发音字符和音节，使得对字符串进行发音比较而不是字母比较。虽然，soundex不是SQL概念，但是多数DBMS都提供对soundex的支持。】

0: jdbc:hive2://192.168.150.150:10000> select soundex('洋群满满');
+-------+
|  _c0  |
+-------+
| NULL  |
+-------+
1 row selected (0.484 seconds)
0: jdbc:hive2://192.168.150.150:10000> select soundex('yangqunmanman');
+-------+
|  _c0  |
+-------+
| Y525  |
+-------+
1 row selected (0.096 seconds)
0: jdbc:hive2://192.168.150.150:10000> select soundex('hive');
+-------+
|  _c0  |
+-------+
| H100  |
+-------+
1 row selected (0.482 seconds)

Data Masking Functions：数据屏蔽函数/脱敏函数

mask()

Operator	OperandTypes	Description
String	mask(string Str[, string upper[, string lower[, string number]]])	返回 str 的掩码版本。默认情况下，大写字母转换为“ X”，小写字母转换为“ x”，数字转换为“ n”。通过提供附加参数，可以覆盖掩码中使用的字符: 第二个参数控制大写字母的掩码字符，第三个参数控制小写字母，第四个参数控制数字。

0: jdbc:hive2://192.168.150.150:10000> select mask("abcd-EFGH-8765-4321");
+----------------------+
|         _c0          |
+----------------------+
| xxxx-XXXX-nnnn-nnnn  |
+----------------------+
1 row selected (0.527 seconds)
0: jdbc:hive2://192.168.150.150:10000> select mask("abcd-EFGH-8765-4321", "U", "l", "#");
+----------------------+
|         _c0          |
+----------------------+
| llll-UUUU-####-####  |
+----------------------+
1 row selected (0.068 seconds)

mask_first_n()

Operator	OperandTypes	Description
String	mask_first_n(string Str[, int N])	返回带有前 N 个掩码值的掩码 Str 的掩码版本【大写字母转换为“X”，小写字母转换为“ x”，数字转换为“ n”】

0: jdbc:hive2://192.168.150.150:10000> select mask_first_n("abcd-EFGH-8765-4321", 4);
+----------------------+
|         _c0          |
+----------------------+
| xxxx-EFGH-8765-4321  |
+----------------------+
1 row selected (0.083 seconds)
0: jdbc:hive2://192.168.150.150:10000> select mask_first_n("abcd-EFGH-8765-4321", 6);
+----------------------+
|         _c0          |
+----------------------+
| xxxx-XFGH-8765-4321  |
+----------------------+
1 row selected (0.498 seconds)
0: jdbc:hive2://192.168.150.150:10000> select mask_first_n("abcd-EFGH-8765-4321", 8);
+----------------------+
|         _c0          |
+----------------------+
| xxxx-XXXH-8765-4321  |
+----------------------+
1 row selected (0.077 seconds)

mask_last_n()

Operator	OperandTypes	Description
String	mask_last_n(string Str[, int N])	返回带有最后 N 个掩码值的掩码 Str 的掩码版本【大写字母转换为“X”，小写字母转换为“ x”，数字转换为“ n”】

0: jdbc:hive2://192.168.150.150:10000> select mask_last_n("abcd-EFGH-8765-4321", 4);
+----------------------+
|         _c0          |
+----------------------+
| abcd-EFGH-8765-nnnn  |
+----------------------+
1 row selected (0.082 seconds)
0: jdbc:hive2://192.168.150.150:10000> select mask_last_n("abcd-EFGH-8765-4321", 6);
+----------------------+
|         _c0          |
+----------------------+
| abcd-EFGH-876n-nnnn  |
+----------------------+
1 row selected (0.085 seconds)
0: jdbc:hive2://192.168.150.150:10000> select mask_last_n("abcd-EFGH-8765-4321", 8);
+----------------------+
|         _c0          |
+----------------------+
| abcd-EFGH-8nnn-nnnn  |
+----------------------+
1 row selected (0.479 seconds)

mask_show_first_n()

Operator	OperandTypes	Description
String	mask_show_first_n(string Str[, int N])	返回 Str 的屏蔽版本，显示前 N 个未屏蔽字符【大写字母转换为“X”，小写字母转换为“ x”，数字转换为“ n”】

0: jdbc:hive2://192.168.150.150:10000> select mask_show_first_n("1234-5678-8765-4321", 4);
+----------------------+
|         _c0          |
+----------------------+
| 1234-nnnn-nnnn-nnnn  |
+----------------------+
1 row selected (0.08 seconds)
0: jdbc:hive2://192.168.150.150:10000> select mask_show_first_n("1234-5678-8765-4321", 6);
+----------------------+
|         _c0          |
+----------------------+
| 1234-5nnn-nnnn-nnnn  |
+----------------------+
1 row selected (0.484 seconds)

mask_show_last_n()

Operator	OperandTypes	Description
String	mask_show_first_n(string Str[, int N])	返回 Str 的屏蔽版本，显示最后 N个未屏蔽字符【大写字母转换为“X”，小写字母转换为“ x”，数字转换为“n”】

0: jdbc:hive2://192.168.150.150:10000> select mask_show_last_n("1234-5678-8765-4321", 4);
+----------------------+
|         _c0          |
+----------------------+
| nnnn-nnnn-nnnn-4321  |
+----------------------+
1 row selected (0.497 seconds)
0: jdbc:hive2://192.168.150.150:10000> select mask_show_last_n("1234-5678-8765-4321", 6);
+----------------------+
|         _c0          |
+----------------------+
| nnnn-nnnn-nnn5-4321  |
+----------------------+
1 row selected (0.066 seconds)

mask_hash()

Operator	OperandTypes	Description
String	mask_hash(string\|char\|varchar Str)	对str进行hash操作并返回（从Hive 2.1.0开始）。散列是一致的，可用于跨表将masked值连接在一起。对于非string类型的值，此函数返回null。

0: jdbc:hive2://192.168.150.150:10000> select mask_hash('洋群满满');
+----------------------------------------------------+
|                        _c0                         |
+----------------------------------------------------+
| 7425da882f430537e1fa310dee2663b6dceac12ee9dba061ec9c61d4ddb89b0a |
+----------------------------------------------------+
1 row selected (0.084 seconds)
# 散列是一致的，可用于跨表将masked值连接在一起
0: jdbc:hive2://192.168.150.150:10000> select mask_hash('洋群满满');
+----------------------------------------------------+
|                        _c0                         |
+----------------------------------------------------+
| 7425da882f430537e1fa310dee2663b6dceac12ee9dba061ec9c61d4ddb89b0a |
+----------------------------------------------------+
1 row selected (0.495 seconds)
0: jdbc:hive2://192.168.150.150:10000> select mask_hash(1234);
+-------+
|  _c0  |
+-------+
| NULL  |
+-------+
1 row selected (0.733 seconds)

Misc Functions：加/解密函数

java_method()/reflect()

Operator	OperandTypes	Description
varies	java_method(Class, Method[, arg1[, arg2..]])	使用反射机制调用java的方法【与reflect作用一样】
varies	reflect(Class, Method[, arg1[, arg2..]])	使用反射机制调用java的方法【与java_method作用一样】

0: jdbc:hive2://192.168.150.150:10000> SELECT reflect("java.lang.Math", "max", 1,9);
+------+
| _c0  |
+------+
| 9    |
+------+
1 row selected (0.511 seconds)
# 当然了，把它归于加/解密函数中，调用的当然不是这么简单的java.lang.math方法，而是更加容易去加/解密的方法
0: jdbc:hive2://192.168.150.150:10000> SELECT java_method("java.lang.Math", "max", 1,9);
+------+
| _c0  |
+------+
| 9    |
+------+
1 row selected (0.483 seconds

hash()

Operator	OperandTypes	Description
Int	hash(a1[, a2…])	返回参数的hash散列值

0: jdbc:hive2://192.168.150.150:10000> select hash('洋群满满');
+-------------+
|     _c0     |
+-------------+
| -391132048  |
+-------------+
1 row selected (0.079 seconds)
0: jdbc:hive2://192.168.150.150:10000> select hash(11,22);
+------+
| _c0  |
+------+
| 363  |
+------+
1 row selected (0.079 seconds)

current_user()

Operator	OperandTypes	Description
String	current_user()	从配置的身份验证器管理器返回当前用户名【.可能与连接时提供的用户相同，但是对于某些身份验证管理器(例如 :HadoopDefaultAuthenticatorr)它可能不同】

0: jdbc:hive2://192.168.150.150:10000> select current_user();
+------------+
|    _c0     |
+------------+
| anonymous  |    # 匿名
+------------+
1 row selected (0.489 seconds)

logged_in_user()

Operator	OperandTypes	Description
String	logged_in_user()	从会话状态返回当前用户名,就是连接到 Hive 时提供的用户名

0: jdbc:hive2://192.168.150.150:10000> select logged_in_user();
+------------+
|    _c0     |
+------------+
| anonymous  |	 # 匿名
+------------+
1 row selected (0.092 seconds)

current_database()

Operator	OperandTypes	Description
String	current_database()	返回当前数据库名称

0: jdbc:hive2://192.168.150.150:10000> select current_database();
+----------+
|   _c0    |
+----------+
| default  |
+----------+
1 row selected (0.079 seconds)

md5()

Operator	OperandTypes	Description
String	md5( String / binary str)	返回将参数转换为MD5加密后的字符串【结果不可逆】,计算字符串或二进制的 MD5 128位校验,该值作为一个包含32个十六进制数字的字符串返回，如果参数为 NULL，则返回 NULL.

0: jdbc:hive2://192.168.150.150:10000> select md5('LOVE');
+-----------------------------------+
|                _c0                |
+-----------------------------------+
| c9122fd7bae0681b62a39ddfc1c7fb19  |
+-----------------------------------+
1 row selected (0.061 seconds)
0: jdbc:hive2://192.168.150.150:10000> select md5('洋群满满');
+-----------------------------------+
|                _c0                |
+-----------------------------------+
| e92c7e61076ad578ee89930f63d9b325  |
+-----------------------------------+
1 row selected (0.083 seconds)

sha1()/sha()

Operator	OperandTypes	Description
String	sha1(string/binary)	为字符串或二进制文件计算 SHA-1摘要，并以十六进制字符串
String	sha(string/binary)	为字符串或二进制文件计算 SHA摘要，并以十六进制字符串

0: jdbc:hive2://192.168.150.150:10000> select sha1('洋群满满');
+-------------------------------------------+
|                    _c0                    |
+-------------------------------------------+
| a1e638c93d7286770a5c79134f0db146ee77dabc  |
+-------------------------------------------+
1 row selected (0.082 seconds)
0: jdbc:hive2://192.168.150.150:10000> select sha('洋群满满');
+-------------------------------------------+
|                    _c0                    |
+-------------------------------------------+
| a1e638c93d7286770a5c79134f0db146ee77dabc  |
+-------------------------------------------+
1 row selected (0.469 seconds)

crc32()

Operator	OperandTypes	Description
BigInt	crc32(string/binary)	计算字符串或二进制参数的循环冗余校验值并返回 bigint 值

0: jdbc:hive2://192.168.150.150:10000> select crc32('洋群满满');
+-------------+
|     _c0     |
+-------------+
| 1515698111  |
+-------------+
1 row selected (0.089 seconds)
0: jdbc:hive2://192.168.150.150:10000> select crc32(010111);
+------------+
|    _c0     |
+------------+
| 409081509  |
+------------+
1 row selected (0.066 seconds)

sha2()

Operator	OperandTypes	Description
String	sha2(string/binary, int)	计算SHA-2系列哈希函数(SHA-224, SHA-256, SHA-384, and SHA-512)（从Hive1.3.0开始）。第一个参数是要hash的字符串或二进制。第二个参数表示结果所需的位长度，其值必须为224、256、384、512或0（相当于256）。从Java 8开始支持SHA-224。如果参数为空或哈希长度不是允许的值之一，则返回值为NULL

0: jdbc:hive2://192.168.150.150:10000> select sha2('洋群满满',0);
+----------------------------------------------------+
|                        _c0                         |
+----------------------------------------------------+
| 7425da882f430537e1fa310dee2663b6dceac12ee9dba061ec9c61d4ddb89b0a |
+----------------------------------------------------+
1 row selected (0.084 seconds)
0: jdbc:hive2://192.168.150.150:10000> select sha2('洋群满满',256);
+----------------------------------------------------+
|                        _c0                         |
+----------------------------------------------------+
| 7425da882f430537e1fa310dee2663b6dceac12ee9dba061ec9c61d4ddb89b0a |
+----------------------------------------------------+
1 row selected (0.119 seconds)
0: jdbc:hive2://192.168.150.150:10000> select sha2('洋群满满',512);
+----------------------------------------------------+
|                        _c0                         |
+----------------------------------------------------+
| f7fd48e88a6dae6f39f1b622d8d82eecb6ead9a354bff2364cc790d46e04159704ddceae09b6afde86b53dc6cfed1a9cf2d17ee825dff4c6f09648703f746207 |
+----------------------------------------------------+
1 row selected (0.491 seconds)

aes_encrypt()

Operator	OperandTypes	Description
Binary	aes_encrypt(Input string/binary, Key string/binary)	使用AES对Input加密，可以使用128、192或256位的密钥长度。如果安装了 Java 加密扩展(JCE)无限强度管辖权策略文件，则可以使用192和256位密钥。如果任一参数为 NULL 或Key长度不是允许的值之一，则返回值为 NULL。

0: jdbc:hive2://192.168.150.150:10000> select aes_encrypt('洋群满满','520');
+-------+
|  _c0  |
+-------+
| NULL  |
+-------+
1 row selected (0.128 seconds)
0: jdbc:hive2://192.168.150.150:10000> select aes_encrypt('洋群满满','1234567890123456');
+-------------------+
|        _c0        |
+-------------------+
| ��@��gE3,��$�D�  |
+-------------------+
1 row selected (0.08 seconds)
0: jdbc:hive2://192.168.150.150:10000> select base64(aes_encrypt('洋群满满','1234567890123456'));
+---------------------------+
|            _c0            |
+---------------------------+
| xstA29dnRTMCLJTeJKlEzw==  |
+---------------------------+
1 row selected (0.073 seconds)

aes_decrypt()

Operator	OperandTypes	Description
Binary	aes_decrypt(Input binary, Key string/binary)	使用AES对Input进行解密，可以使用128、192或256位的密钥长度。如果安装了 Java 加密扩展(JCE)无限强度管辖权策略文件，则可以使用192和256位密钥。如果任一参数为 NULL 或Key长度不是允许的值之一，则返回值为 NULL。

0: jdbc:hive2://192.168.150.150:10000> select aes_decrypt(unbase64('xstA29dnRTMCLJTeJKlEzw=='), '1234567890123456'); 
+-------+
|  _c0  |
+-------+
| 洋群满满  |
+-------+
1 row selected (0.093 seconds)

version()

Operator	OperandTypes	Description
String	version()	返回 Hive 版本,该字符串包含两个字段，第一个字段是生成编号，第二个字段是生成的hash散列。例如: “ select version () ;实际的结果将取决于你的构建

0: jdbc:hive2://192.168.150.150:10000> select version();
+--------------------------------------------------+
|                       _c0                        |
+--------------------------------------------------+
| 3.1.2 r8190d2be7b7165effa62bd21b7d60ef81fb0e4af  |
+--------------------------------------------------+
1 row selected (0.089 seconds)

surrogate_key()

Operator	OperandTypes	Description
BigInt	surrogate_key([write_id_bits, task_id_bits])	在向表中输入数据时，自动为行生成数字 id。只能用作 acid 或只插入表的默认值

Built-in Aggregate Functions ：内置聚合函数(UDAF)—多对一

count()

Operator	OperandTypes	Description
BigInt	count(*)	返回检索到的行的总数，包括包含 NULL 值的行。
BigInt	count(expr)	返回提供的表达式为非 null 的行数。
BigInt	count(DISTINCT expr[, expr…])	返回提供的表达式唯一且非 null 的行数。可以使用 `hive.optimize.distinct.rewrite` 来优化这个操作。

sum()

Operator	OperandTypes	Description
Double	sum(col)	对组内某列求和（包含重复值）
Double	sum(DISTINCT col)	对组内某列求和（不包含重复值）

avg()

Operator	OperandTypes	Description
Double	avg(col)	对组内某列求平均值（包含重复值）
Double	avg(DISTINCT col)	对组内某列求平均值（不包含重复值）

min()/max()

Operator	OperandTypes	Description
Double	min(col)	返回组中列的最小值
Double	max(col)	返回组中列的最大值

variance()/var_pop()

Operator	OperandTypes	Description
Double	variance(col)	返回组中数值列的方差
Double	var_pop(col)	返回组中数值列的方差

var_samp()

Operator	OperandTypes	Description
Double	var_samp(col)	返回组中一个数值列的无偏样本方差

stddev_pop()

Operator	OperandTypes	Description
Double	stddev_pop(col)	返回组中一个数值列的标准差

stddev_samp()

Operator	OperandTypes	Description
Double	stddev_samp(col)	返回组中一个数值列的无偏样本标准差

covar_pop()

Operator	OperandTypes	Description
Double	covar_pop(col1, col2)	返回组中一对数值列的总体协方差

covar_samp()

Operator	OperandTypes	Description
Double	covar_samp(col1, col2)	返回组中一对数值列的样本协方差

corr()

Operator	OperandTypes	Description
Double	corr(col1, col2)	返回组中一对数字列的皮尔逊相关系数

percentile()

Operator	OperandTypes	Description
Double	percentile(BIGINT col, p)	返回组中某一列的精确位数【第p位百分数】(不适用于浮点类型)。p 一定在0和1之间。注意: 只能为整数值计算真正的百分位数。如果您的输入是非整数的，请使用 percentile_approx
Array<Double>	percentile(BIGINT col, array(p1 [, p2]…))	返回组中某列的精确百分位数 p1、 p2、 … (不适用于浮点类型)。p一定在0和1之间。注意: 只能为整数值计算真正的百分位数。如果您的输入是非整数的，请使用 PERCENTILE _ approx。

percentile_approx()

Operator	OperandTypes	Description
Double	percentile_approx(DOUBLE col, p [, B])	返回组中数值列(包括浮点类型)的近似百分位数。B 参数以内存为代价来控制近似精度。更高的值产生更好的近似值，默认值是10,000。当 col 中不同值的数目小于 B 时，这就给出了一个精确的百分比值。
Array<Double>	percentile_approx(DOUBLE col, array(p1 [, p2]…) [, B])	与上面相同，但是接受并返回百分位数值数组而不是单个百分位数值数组。

regr_avgx()

Operator	OperandTypes	Description
Double	regr_avgx(independent, dependent)	相当于`avg(dependent)`:计算自变量的平均值。该函数将任意一对数字类型作为参数，并返回一个double。任何具有null的对都将被忽略。如果应用于空集：返回null。

regr_avgy()

Operator	OperandTypes	Description
Double	regr_avgy(independent, dependent)	相当于`avg(independent)`:计算因变量的平均值。该函数将任意一对数字类型作为参数，并返回一个double。任何具有null的对都将被忽略。如果应用于空集：返回null

regr_count()

Operator	OperandTypes	Description
Double	regr_count(independent, dependent)	返回用于匹配线性回归线的非空对的数量。【返回independent和dependent都非空的对数】

regr_intercept()

Operator	OperandTypes	Description
Double	regr_intercept(independent, dependent)	返回线性回归线的 y 轴截距，也就是方程`a * independent+ b` 中 b 的值【返回线性回归的截距项】

regr_r2()

Operator	OperandTypes	Description
Double	regr_r2(independent, dependent)	返回线性回归的确定系数

regr_slope()

Operator	OperandTypes	Description
Double	regr_slope(independent, dependent)	返回线性回归的斜率，也就是依赖于方程 `a * independent+ b` 中的 a 的值

regr_sxx()

Operator	OperandTypes	Description
Double	regr_sxx(independent, dependent)	等价于regr_count(independent, dependent) * var_pop(dependent)。【用如下公式计算： ∑ i = 1 n x i x i − ∑ i = 1 n x i ∗ ∑ i = 1 n x i n \frac{\sum_{i=1}^nx_ix_i-\sum_{i=1}^nx_i\ast\sum_{i=1}^nx_i}n n∑i=1nxixi−∑i=1nxi∗∑i=1nxi，并返回结果。】

regr_sxy()

Operator	OperandTypes	Description
Double	regr_sxy(independent, dependent)	等价于regr_count(independent, dependent) * covar_pop(independent, dependent)。【用如下公式计算： ∑ i = 1 n y i x i − ∑ i = 1 n y i ∗ ∑ i = 1 n x i n \frac{\sum_{i=1}^ny_ix_i-\sum_{i=1}^ny_i\ast\sum_{i=1}^nx_i}n n∑i=1nyixi−∑i=1nyi∗∑i=1nxi，并返回结果。】

regr_syy()

Operator	OperandTypes	Description
Double	regr_syy(independent, dependent)	等价于 regr_count(independent, dependent) * var_pop(independent)。【用如下公式计算： ∑ i = 1 n y i y i − ∑ i = 1 n y i ∗ ∑ i = 1 n y i n \frac{\sum_{i=1}^ny_iy_i-\sum_{i=1}^ny_i\ast\sum_{i=1}^ny_i}n n∑i=1nyiyi−∑i=1nyi∗∑i=1nyi，并返回结果。】

histogram_numeric()

Operator	OperandTypes	Description
Array<struct {‘x’,’y’}>	histogram_numeric(col, b)	用于画直方图。返回一个长度为b的数组，数组中元素为(x,y)形式的键值对，x代表了直方图中该柱形的中心，y代表可其高度。

collect_set()

Operator	OperandTypes	Description
Array	collect_set(col)	返回查询列col去重后的集合，与distinct不同，distinct查询结果为一列数据，collect_set查询后结果为一个集合形式的元素

collect_listl)

Operator	OperandTypes	Description
Array	collect_list(col)	返回查询列col的列表

ntile()

Operator	OperandTypes	Description
INTEGER	ntile(INTEGER x)	将有序分区划分为x个称为存储桶的组，并为该分区中的每一行分配存储桶编号。（此方式存储可以快速计算分位数）

Built-in Table-Generating Functions (UDTF)：内置的表生成函数(UDTF)—一对多

一般的用户定义函数，如 concat () ，接受单个输入行并输出单个输出行。相反，表生成函数将单个输入行转换为多个输出行。

explode()

Operator	OperandTypes	Description
T	explode(ARRAY a)	将数组爆炸为多行。返回一个带有单列(col)的行集，该数组中的每个元素对应一行。
Tkey,Tvalue	explode(MAP<Tkey,Tvalue> m)	将map爆炸为多行。返回一个包含两列(键、值)的行集，输入映射中的每个键值对都有一行

0: jdbc:hive2://192.168.150.150:10000> select explode(array('A','B','C'));
+------+
| col  |
+------+
| A    |
| B    |
| C    |
+------+
3 rows selected (0.252 seconds)
0: jdbc:hive2://192.168.150.150:10000> select explode(array('A','B','C')) as col;
+------+
| col  |
+------+
| A    |
| B    |
| C    |
+------+
3 rows selected (0.314 seconds)
0: jdbc:hive2://192.168.150.150:10000> select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf;
+---------+
| tf.col  |
+---------+
| A       |
| B       |
| C       |
+---------+
3 rows selected (0.046 seconds)
0: jdbc:hive2://192.168.150.150:10000> select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf as col;
+---------+
| tf.col  |
+---------+
| A       |
| B       |
| C       |
+---------+
3 rows selected (0.043 seconds)
0: jdbc:hive2://192.168.150.150:10000> select explode(map('A',10,'B',20,'C',30));
+------+--------+
| key  | value  |
+------+--------+
| A    | 10     |
| B    | 20     |
| C    | 30     |
+------+--------+
3 rows selected (0.309 seconds)
0: jdbc:hive2://192.168.150.150:10000> select explode(map('A',10,'B',20,'C',30)) as (key,value);
+------+--------+
| key  | value  |
+------+--------+
| A    | 10     |
| B    | 20     |
| C    | 30     |
+------+--------+
3 rows selected (0.639 seconds)
0: jdbc:hive2://192.168.150.150:10000> select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf;
+---------+-----------+
| tf.key  | tf.value  |
+---------+-----------+
| A       | 10        |
| B       | 20        |
| C       | 30        |
+---------+-----------+
3 rows selected (0.048 seconds)
0: jdbc:hive2://192.168.150.150:10000> select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf as key,value;
+---------+-----------+
| tf.key  | tf.value  |
+---------+-----------+
| A       | 10        |
| B       | 20        |
| C       | 30        |
+---------+-----------+
3 rows selected (0.299 seconds)

posexplode()

Operator	OperandTypes	Description
Int,T	posexplode(ARRAY a)	使用 int 类型的附加位置列（原始数组中项目的位置，从 0 开始）将数组分解为多行。返回一个包含两列 (pos,val) 的行集，数组中的每个元素占一行。

0: jdbc:hive2://192.168.150.150:10000> select posexplode(array('A','B','C'));
+------+------+
| pos  | val  |
+------+------+
| 0    | A    |
| 1    | B    |
| 2    | C    |
+------+------+
3 rows selected (0.674 seconds)
0: jdbc:hive2://192.168.150.150:10000> select posexplode(array('A','B','C')) as (pos,val);
+------+------+
| pos  | val  |
+------+------+
| 0    | A    |
| 1    | B    |
| 2    | C    |
+------+------+
3 rows selected (0.711 seconds)
0: jdbc:hive2://192.168.150.150:10000> select tf.* from (select 0) t lateral view posexplode(array('A','B','C')) tf;
+---------+---------+
| tf.pos  | tf.val  |
+---------+---------+
| 0       | A       |
| 1       | B       |
| 2       | C       |
+---------+---------+
3 rows selected (0.042 seconds)
0: jdbc:hive2://192.168.150.150:10000> select tf.* from (select 0) t lateral view posexplode(array('A','B','C')) tf as pos,val;
+---------+---------+
| tf.pos  | tf.val  |
+---------+---------+
| 0       | A       |
| 1       | B       |
| 2       | C       |
+---------+---------+
3 rows selected (0.253 seconds)

inline()

Operator	OperandTypes	Description
T1,…,Tn	inline(ARRAY<STRUCT<f1:T1,…,fn:Tn>> a)	将结构数组分解为多行。返回一个包含 N 列的行集（N = 结构中顶级元素的数量），数组中每个结构一行。

0: jdbc:hive2://192.168.150.150:10000> select inline(array(struct('A',10,date '2015-01-01'),struct('B',20,date '2016-02-02')));
+-------+-------+-------------+
| col1  | col2  |    col3     |
+-------+-------+-------------+
| A     | 10    | 2015-01-01  |
| B     | 20    | 2016-02-02  |
+-------+-------+-------------+
2 rows selected (0.356 seconds)
0: jdbc:hive2://192.168.150.150:10000> select inline(array(struct('A',10,date '2015-01-01'),struct('B',20,date '2016-02-02'))) as (col1,col2,col3);
+-------+-------+-------------+
| col1  | col2  |    col3     |
+-------+-------+-------------+
| A     | 10    | 2015-01-01  |
| B     | 20    | 2016-02-02  |
+-------+-------+-------------+
2 rows selected (0.232 seconds)
0: jdbc:hive2://192.168.150.150:10000> select tf.* from (select 0) t lateral view inline(array(struct('A',10,date '2015-01-01'),struct('B',20,date '2016-02-02'))) tf;
+----------+----------+-------------+
| tf.col1  | tf.col2  |   tf.col3   |
+----------+----------+-------------+
| A        | 10       | 2015-01-01  |
| B        | 20       | 2016-02-02  |
+----------+----------+-------------+
2 rows selected (0.468 seconds)
0: jdbc:hive2://192.168.150.150:10000> select tf.* from (select 0) t lateral view inline(array(struct('A',10,date '2015-01-01'),struct('B',20,date '2016-02-02'))) tf as col1,col2,col3;
+----------+----------+-------------+
| tf.col1  | tf.col2  |   tf.col3   |
+----------+----------+-------------+
| A        | 10       | 2015-01-01  |
| B        | 20       | 2016-02-02  |
+----------+----------+-------------+
2 rows selected (0.379 seconds)

stack()

Operator	OperandTypes	Description
T1,…,Tn/r	stack(int r,T1 V1,…,Tn/r Vn)	将 n 个值 V1,…,Vn 分解为 r 行。每行将有 n/r 列。 r 必须是常数。

0: jdbc:hive2://192.168.150.150:10000> select stack(2,'A',10,date '2015-01-01','B',20,date '2016-01-01');
+-------+-------+-------------+
| col0  | col1  |    col2     |
+-------+-------+-------------+
| A     | 10    | 2015-01-01  |
| B     | 20    | 2016-01-01  |
+-------+-------+-------------+
2 rows selected (0.3 seconds)
0: jdbc:hive2://192.168.150.150:10000> select stack(2,'A',10,date '2015-01-01','B',20,date '2016-01-01') as (col0,col1,col2);
+-------+-------+-------------+
| col0  | col1  |    col2     |
+-------+-------+-------------+
| A     | 10    | 2015-01-01  |
| B     | 20    | 2016-01-01  |
+-------+-------+-------------+
2 rows selected (0.252 seconds)
0: jdbc:hive2://192.168.150.150:10000> select tf.* from (select 0) t lateral view stack(2,'A',10,date '2015-01-01','B',20,date '2016-01-01') tf;
+----------+----------+-------------+
| tf.col0  | tf.col1  |   tf.col2   |
+----------+----------+-------------+
| A        | 10       | 2015-01-01  |
| B        | 20       | 2016-01-01  |
+----------+----------+-------------+
2 rows selected (0.453 seconds)
0: jdbc:hive2://192.168.150.150:10000> select tf.* from (select 0) t lateral view stack(2,'A',10,date '2015-01-01','B',20,date '2016-01-01') tf as col0,col1,col2;
+----------+----------+-------------+
| tf.col0  | tf.col1  |   tf.col2   |
+----------+----------+-------------+
| A        | 10       | 2015-01-01  |
| B        | 20       | 2016-01-01  |
+----------+----------+-------------+
2 rows selected (0.189 seconds)

+——————————+
| tab_name |
+——————————+
| ! |
| != |
| $sum0 |
| % |
| & |
| * |

+
/
<
<=
<=>
<>
=
==
>
>=
^
abs
acos
add_months
aes_decrypt
aes_encrypt
and
array
array_contains
ascii
asin
assert_true
assert_true_oom
atan
avg
base64
between
bin
bloom_filter
bround
cardinality_violation
case
cbrt
ceil
ceiling
char_length
character_length
chr
coalesce
collect_list
collect_set
compute_stats
concat
concat_ws
context_ngrams
conv
corr
cos
count
covar_pop
covar_samp
crc32
create_union
cume_dist
current_authorizer
current_database
current_date
current_groups
current_timestamp
current_user
date_add
date_format
date_sub
datediff
day
dayofmonth
dayofweek
decode
degrees
dense_rank
div
e
elt
encode
enforce_constraint
exp
explode
extract_union
factorial
field
find_in_set
first_value
floor
floor_day
floor_hour
floor_minute
floor_month
floor_quarter
floor_second
floor_week
floor_year
format_number
from_unixtime

Hive

本博客所有文章除特别声明外，均采用 CC BY-SA 4.0 协议，转载请注明出处！

HBase热点问题及解决办法上一篇

Hive、HBase、Mysql的区别下一篇