SHOW DATA SKEW
更新时间:2025-10-15
描述
SHOW DATA SKEW 语句用于查看表或分区的数据倾斜情况。该语句具有以下功能:
- 可以查看整个表的数据分布情况
- 可以查看指定分区的数据分布情况
- 展示各个分桶的数据行数、数据量及其占比
- 支持分区表和非分区表
语法
SQL
1SHOW DATA SKEW FROM [<db_name>.]<table_name> [ PARTITION (<partition_name> [, ...]) ];
必选参数
1. FROM [<db_name>.]<table_name>
指定要查看的表名。可以包含数据库名称。
表名在其所在的数据库中必须唯一。
可选参数
1. PARTITION (<partition_name> [, ...])
指定要查看的分区名称列表。
如果不指定此参数,则展示表中所有分区的数据分布情况。
对于非分区表,分区名称同表名。
返回值
| 列名 | 说明 |
|---|---|
| PartitionName | 分区名称 |
| BucketIdx | 分桶索引号 |
| AvgRowCount | 平均行数 |
| AvgDataSize | 平均数据大小(字节) |
| Graph | 数据分布可视化图表 |
| Percent | 该分桶数据量占总数据量的百分比 |
权限控制
执行此 SQL 命令的用户必须至少具有以下权限:
| 权限(Privilege) | 对象(Object) | 说明(Notes) |
|---|---|---|
| SELECT | 表(Table) | 需要对查看的表有 SELECT 权限 |
注意事项
- 数据分布情况按照分区和分桶两个维度展示
- Graph 列使用字符
>直观展示数据分布比例 - 百分比精确到小数点后两位
- 对于非分区表,查询结果中分区名称同表名
示例
-
创建分区表并查看数据分布:
SQL1CREATE TABLE test_show_data_skew 2( 3 id int, 4 name string, 5 pdate date 6) 7PARTITION BY RANGE(pdate) 8( 9 FROM ("2023-04-16") TO ("2023-04-20") INTERVAL 1 DAY 10) 11DISTRIBUTED BY HASH(id) BUCKETS 5 12PROPERTIES ( 13 "replication_num" = "1" 14);查看整表数据分布:
SQL1SHOW DATA SKEW FROM test_show_data_skew;Text1+---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ 2| PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent | 3+---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ 4| p_20230416 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 49.77 % | 5| p_20230416 | 1 | 2 | 654 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 50.23 % | 6| p_20230416 | 2 | 0 | 0 | | 00.00 % | 7| p_20230416 | 3 | 0 | 0 | | 00.00 % | 8| p_20230416 | 4 | 0 | 0 | | 00.00 % | 9| p_20230417 | 0 | 0 | 0 | | 00.00 % | 10| p_20230417 | 1 | 0 | 0 | | 00.00 % | 11| p_20230417 | 2 | 0 | 0 | | 00.00 % | 12| p_20230417 | 3 | 0 | 0 | | 00.00 % | 13| p_20230417 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 100.00% | 14| p_20230418 | 0 | 0 | 0 | | 00.00 % | 15| p_20230418 | 1 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 100.00% | 16| p_20230418 | 2 | 0 | 0 | | 00.00 % | 17| p_20230418 | 3 | 0 | 0 | | 00.00 % | 18| p_20230418 | 4 | 0 | 0 | | 00.00 % | 19| p_20230419 | 0 | 0 | 0 | | 00.00 % | 20| p_20230419 | 1 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 49.96 % | 21| p_20230419 | 2 | 0 | 0 | | 00.00 % | 22| p_20230419 | 3 | 1 | 649 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 50.04 % | 23| p_20230419 | 4 | 0 | 0 | | 00.00 % | 24+---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ -
查看指定分区的数据分布:
SQL1SHOW DATA SKEW FROM test_show_data_skew PARTITION(p_20230416, p_20230418);Text1+---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ 2| PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent | 3+---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ 4| p_20230416 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 49.77 % | 5| p_20230416 | 1 | 2 | 654 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 50.23 % | 6| p_20230416 | 2 | 0 | 0 | | 00.00 % | 7| p_20230416 | 3 | 0 | 0 | | 00.00 % | 8| p_20230416 | 4 | 0 | 0 | | 00.00 % | 9| p_20230418 | 0 | 0 | 0 | | 00.00 % | 10| p_20230418 | 1 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 100.00% | 11| p_20230418 | 2 | 0 | 0 | | 00.00 % | 12| p_20230418 | 3 | 0 | 0 | | 00.00 % | 13| p_20230418 | 4 | 0 | 0 | | 00.00 % | 14+---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+ -
查看非分区表的数据分布:
SQL1CREATE TABLE test_show_data_skew2 2( 3 id int, 4 name string, 5 pdate date 6) 7DISTRIBUTED BY HASH(id) BUCKETS 5 8PROPERTIES ( 9 "replication_num" = "1" 10);SQL1SHOW DATA SKEW FROM test_show_data_skew2;Text1+----------------------+-----------+-------------+-------------+---------------------------+---------+ 2| PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent | 3+----------------------+-----------+-------------+-------------+---------------------------+---------+ 4| test_show_data_skew2 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.73 % | 5| test_show_data_skew2 | 1 | 4 | 667 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.46 % | 6| test_show_data_skew2 | 2 | 0 | 0 | | 00.00 % | 7| test_show_data_skew2 | 3 | 1 | 649 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.77 % | 8| test_show_data_skew2 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.04 % | 9+----------------------+-----------+-------------+-------------+---------------------------+---------+
