SHOW-LOAD
SHOW LOAD
Description
This statement is used to show the execution of specified load task.
SHOW LOAD
[FROM db_name]
[
WHERE
[LABEL [ = "your_label" | LIKE "label_matcher"]]
[STATE = ["PENDING"|"LOADING"|"FINISHED"|"CANCELLED"|]]
]
[ORDER BY ...]
[LIMIT limit][OFFSET offset];-
db_nameView load jobs under specified database. If not specified, use the current database.
-
LABELLLabel can be accurately matched by equivalent value, or fuzzy matched by Like.
-
STATEView load tasks in specified status.
- PENDING: load has been submitted, but has not been executed yet.
- LOADING: load is in progress.
- FINISHED: load is finished successfully.
- CANCELLED: load failed.
-
ORDER BY ... LIMIT [OFFSET]It supports ordering result sets by any column.
ORDER BY createtime DESC LIMIT 10,20
The return result of the SHOW LOAD command is as follows:
mysql> show load order by createtime desc limit 1\G
*************************** 1. row ***************************
JobId: 76391
Label: label1
State: FINISHED
Progress: ETL:N/A; LOAD:100%
Type: BROKER
EtlInfo: unselected.rows=4; dpp.abnorm.ALL=15; dpp.norm.ALL=28133376
TaskInfo: cluster:N/A; timeout(s):10800; max_filter_ratio:5.0E-5
ErrorMsg: N/A
CreateTime: 2019-07-27 11:46:42
EtlStartTime: 2019-07-27 11:46:44
EtlFinishTime: 2019-07-27 11:46:44
LoadStartTime: 2019-07-27 11:46:44
LoadFinishTime: 2019-07-27 11:50:16
URL: http://192.168.1.1:8040/api/_load_error_log?file=__shard_4/error_log_insert_stmt_4bb00753932c491a-a6da6e2725415317_4bb00753932c491a_a6da6e2725415317
JobDetails: {"Unfinished backends":{"9c3441027ff948a0-8287923329a2b6a7":[10002]},"ScannedRows":2390016,"TaskNumber":1,"All backends":{"9c3441027ff948a0-8287923329a2b6a7":[10002]},"FileNumber":1,"FileSize":1073741824}The follow mainly introduces that meaning of parameters in the result set return by view load command:
-
JobIdunique ID of load task, JobId in different load tasks is different, which is automatically generated by the system. Unlike Label, JobId will never be the same, and Label can be duplicated after the load task fails.
-
LabelIdentification of load task.
-
StateCurrent stage of the load task. There will be PENDING and LOADING status appearing in the loading process of Broker. If Broker load is in PENDING status, it means that the current load task is waiting to be executed. If in LOADING status, it indicates that it is being executed.
There are two final stages of the load task: CANCELLED and FINISHED, when the Load job is in these two stages, the load is completed. In which CANCELLED means load failed and FINISHED means load succeeded.
-
ProgressDescription of load task progress. There are two kinds of progress: ETL and LOAD, corresponding to the two stages of the load process: ETL and LOADING. At present, Broker load has only LOADING stage, so ETL will always be displayed as
N/AThe progress range of LOAD is: 0~100%。
LOAD progress = the number of tables currently loaded/the total number of tables designed for this load task * 100%If all load tables are loaded, then the progress of LOAD will show 99% and the load will enter the final effective stage. After the whole load is completed, the progress of LOAD will be changed to 100%.
The progress of loading is not linear. Therefore, if the progress does not change within a period of time, it does not mean that the load is not being executed.
-
TypeType of task loaded. Broker load has only the type BROKER.
-
EtlInfoIt mainly displays the loaded data quantity indicators
unselected.rows,dpp.norm.ALLanddpp.abnorm.ALL. Users can judge how many rows are filtered by the where condition according to the first numerical value, and verify whether the error rate of the current load task exceedsmax_filter_ratioby the latter two indicators.The sum of three indicators is the total row number of the original data.
-
TaskInfoIt mainly displays the current load task parameters, that is, the load task parameters specified by the user when creating the Broker load task, including:
cluster,timeoutandmax_filter_ratio. -
ErrorMsgWhen the load task status is CANCELLED, the reason of failure will be displayed in two parts: type and msg, when the load task is successful, display
N/A.The value meaning of type:
USER_CANCEL:task cancelled by user ETL_RUN_FAIL:load task failed in ETL phase ETL_QUALITY_UNSATISFIED:data quality is unqualified, that is, the error data rate exceeds max_filter_ratio LOAD_RUN_FAIL: load task TIMEOUT failed in LOADING phase: TIMEOUT:load task did not complete within timeout period UNKNOWN: unknown load error -
CreateTime/EtlStartTime/EtlFinishTime/LoadStartTime/LoadFinishTimeThese values represent the load create time, ETL stage start time, ETL stage finish time, loading stage start time and the whole load task finish time.
There is no ETL stage in Broker load, so its ETL starttime, ETL finishtime and loadstarttime are set to the same value.
When the load task stays at CreateTime for a long time, while LoadStartTime is N/A, it indicates that load task is seriously piled up at present. Users can reduce the frequency of load submission.
LoadFinishTime - CreateTime = time consumed by the whole load task LoadFinishTime - LoadStartTime = Execution time of the whole Broker load task = Time consumed by the whole load task-Waiting time of the load task -
URLLoad the error data sample of the task, and access the URL address to get the error data sample loaded this time. If there is no error data in this load, the URL field is N/A.
-
JobDetailsDisplay the detailed running status of some jobs, including the number of loaded files, the total size (bytes), the number of subtasks, the number of processed original rows, the BE node Id of running subtasks, and the BE node Id of unfinished subtasks.
{"Unfinished backends":{"9c3441027ff948a0-8287923329a2b6a7":[10002]},"ScannedRows":2390016,"TaskNumber":1,"All backends":{"9c3441027ff948a0-8287923329a2b6a7":[10002]},"FileNumber":1,"FileSize":1073741824}In which the number of processed original rows is updated every 5 seconds. The number of rows is only used to show the current progress, which does not represent the final actual number of rows processed. The actual number of rows processed shall be as shown in EtlInfo.
Example
-
Show all load tasks of the default db
SHOW LOAD; -
Show the load task of specified db, the label contains the string "2014_01_02".
SHOW LOAD FROM example_db WHERE LABEL LIKE "2020_01_02" LIMIT 10; -
展示指定 db 的导入任务,指定 label 为 "load_example_db_20140102" 并按 LoadStartTime 降序排序。
SHOW LOAD FROM example_db WHERE LABEL = "load_example_db_20200102" ORDER BY LoadStartTime DESC; -
Show the load task of the specified db, specify the label as "load_example_db_20140102" and order in descending order of LoadStartTime
SHOW LOAD FROM example_db WHERE LABEL = "load_example_db_20140102" AND STATE = "loading" ORDER BY LoadStartTime DESC; -
展示指定 db 的导入任务 并按 LoadStartTime 降序排序,并从偏移量5开始显示10条查询结果
SHOW LOAD FROM example_db ORDER BY LoadStartTime DESC limit 5,10; SHOW LOAD FROM example_db ORDER BY LoadStartTime DESC limit 10 offset 5;
Keywords
SHOW, LOADBest Practices
-
Use in combination with LOAD command
It is recommended that user view the status of load tasks submitted by polling by
SHOW LOAD. For example, for an load task with a specified Label, the load status is polled at intervals of 10 seconds or longer, and the next operation is performed according to the status.For example, wait for the load task submitted in the previous batch to be completed before submitting in the next batch, or retry the failed task.
-
Query multiple load tasks with the same Label.
If the load task corresponding to a Label fails, the user can continue to use the Label to resubmit the task. In this case, querying the Label will return multiple load jobs (including newly submitted and previously failed). At this time, if to query the latest submission, you can use the following command:
SHOW LOAD WHERE LABEl="my_label" ORDER BY createtime DESC LIMIT 1; -
View data quality problem
If the load task fails and the error message shows
ETL_QUALITY_UNSATISFIED, it indicates that there is load quality problem. A link will be given inURLfield. The error data information given in this link can be viewed with the following command:SHOW LOAD WARNINGS ON "your_url";Here are some common data quality problems:
-
no partition for this tupleThis row of data has no corresponding partition in the table. It is required to check whether the data falls within the partition range of the table.
-
null is not allowed for bitmap columnLoading null values into bitmap type field is not allowed.
-
the length of input is too long than schemaThe length of string exceeds the length defined by the column. It is required to consider increasing the defined maximum length of VARCHAR column.
-
decimal value is not valid for definitionDecimal type does not match in precision.
-
Content of HLL type column is invalidThe value type corresponding to HLL column is not correct. It is required to check whether the hll_hash function is used to convert the data.
-
null value for not null columnTry to load a null value into a column that is not nullable
-
Parse json data for JsonDoc failedParsing Json data failed.
-
JSON Root not foundGiven json_root does not match.
-
JSON data is array-object, strip_outer_array must be TRUE.strip_outer_arrayis false, but the Json data is an array. -
JSON data is not an array-object, strip_outer_array must be FALSEstrip_outer_arrayis true, but the Json data is not an array. -
Json value is null, but the column xx is not nullableWhile loading Json data, try to load null into a field that is not nullable.
-
The column xx is not nullable, but it's not found in jsondata.When loading Json data, no field matched, and the coloumn is not nullable.
-
All fields is null, this is a invalid row.When loading Json data, all field names in a row cannot match the data.
-
Empty json lineEmpty Json data.
-
data is not encoded by UTF-8Source data is not UTF-8 encoded.
-
actual column number is less than schema column number.The number of columns in the source data is less than that in the table.
-
actual column number is more than schema column number.The number of columns in the source data is more than that in the table.
-
column xx value is incorrect while strict mode is trueIn Strict Mode,the source data is converted to null.
-
column xx value is null while columns is not nullableTry to load null value into the coloumn is not nullable.
-
-
View the progress of job execution
As the progress displayed in
Progressin the returned result is not linear, sometimes it is inconvenient to query progress through this field. At this time, we can also observe whether the job is running throughScannedRowsinJobDetailsfield.Under normal circumstances, the
ScannedRowsof a running job will be updated every 5 seconds, indicating the number of rows of data that have been read. But sometimes, due to the data quality problems, all data rows have been filtered, resulting in theScannedRowsnever increasing or being 0. At this time,All backendsandUnfinished backendsfields can also be viewed. The former represents all BE nodes involved in this load task, while the latter represents the BE nodes of unfinished tasks. -
About TaskNumber
The TaskNumber field in
JobDetailsdoes not represent the concurrency of load jobs. This only represents the number ofDATA INFILEclauses in the Load statement.
