COLLECT_SET
更新时间:2025-10-16
描述
聚合函数,聚合指定列的所有唯一值,去除重复的元素,并返回一个集合类型的结果。
别名
- GROUP_UNIQ_ARRAY
语法
SQL
1COLLECT_SET(<expr> [,<max_size>])
参数
| 参数 | 说明 |
|---|---|
<expr> |
要聚合的列或表达式 |
<max_size> |
可选参数,通过设置该参数能够将结果数组的大小限制为 max_size 个元素 |
返回值
返回类型是 ARRAY,该数组包含去重后的所有值,特殊情况:
- 如果值为 NULL,则会过滤
举例
SQL
1select k1,k2,k3 from collect_set_test order by k1;
Text
1+------+------------+-------+
2| k1 | k2 | k3 |
3+------+------------+-------+
4| 1 | 2023-01-01 | hello |
5| 2 | 2023-01-01 | NULL |
6| 2 | 2023-01-02 | hello |
7| 3 | NULL | world |
8| 3 | 2023-01-02 | hello |
9| 4 | 2023-01-02 | doris |
10| 4 | 2023-01-03 | sql |
11+------+------------+-------+
SQL
1select collect_set(k1),collect_set(k1,2) from collect_set_test;
Text
1+-------------------------+--------------------------+
2| collect_set(`k1`) | collect_set(`k1`,2) |
3+-------------------------+--------------------------+
4| [4,3,2,1] | [1,2] |
5+----------------------------------------------------+
SQL
1select k1,collect_set(k2),collect_set(k3,1) from collect_set_test group by k1 order by k1;
Text
1+------+-------------------------+--------------------------+
2| k1 | collect_set(`k2`) | collect_set(`k3`,1) |
3+------+-------------------------+--------------------------+
4| 1 | [2023-01-01] | [hello] |
5| 2 | [2023-01-01,2023-01-02] | [hello] |
6| 3 | [2023-01-02] | [world] |
7| 4 | [2023-01-02,2023-01-03] | [sql] |
8+------+-------------------------+--------------------------+
