Last Updated: 2025-11-14

Global model call analysis

Global model call analysis provides a consolidated view of all data related to model calls. You can switch between models to view model-specific data.

Overview data:

Panel	Description
LLM call count	Displays the number of LLM calls across all applications during the specified time period
LLM call QPS	Displays the QPS (queries per second) of LLM calls across all applications during the specified time period
LLM call error count	Displays the number of failed LLM calls across all applications during the specified time period
LLM call error rate	Displays the LLM call error rate across all applications during the specified time period
Token usage	Displays the token usage across all applications during the specified time period, with options to view input and output tokens separately
Avg Tokens per LLM call	Displays the average token usage per LLM call across all applications during the specified time period, with options to view input and output tokens separately
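
The relationship between these overview figures can be illustrated with a minimal sketch. It assumes a hypothetical `LLMCall` record and computes the error rate as errors divided by calls and QPS as calls divided by the window length in seconds. The field names and formulas are illustrative assumptions, not the documented implementation of Cloud Monitor.

```python
from dataclasses import dataclass


@dataclass
class LLMCall:
    """One hypothetical LLM call record (all field names are assumptions)."""
    app: str
    timestamp: float   # seconds since the start of the observation
    is_error: bool
    input_tokens: int
    output_tokens: int


def overview(calls, window_start, window_end):
    """Aggregate the overview figures for calls in [window_start, window_end)."""
    in_window = [c for c in calls if window_start <= c.timestamp < window_end]
    count = len(in_window)
    errors = sum(c.is_error for c in in_window)
    input_tokens = sum(c.input_tokens for c in in_window)
    output_tokens = sum(c.output_tokens for c in in_window)
    duration = window_end - window_start
    return {
        "call_count": count,
        "qps": count / duration if duration > 0 else 0.0,
        "error_count": errors,
        "error_rate": errors / count if count else 0.0,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "avg_tokens_per_call": (input_tokens + output_tokens) / count if count else 0.0,
    }


# Sample data: four calls from two applications in a 40-second window.
calls = [
    LLMCall("app-a", 0.0, False, 100, 50),
    LLMCall("app-a", 10.0, True, 80, 0),
    LLMCall("app-b", 20.0, False, 120, 60),
    LLMCall("app-b", 30.0, False, 100, 90),
]
summary = overview(calls, 0.0, 40.0)
```

Filtering `calls` by model or by application before calling `overview` would give the model-specific or application-specific variant of the same figures.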

Model call trend data:

Panel	Description
Trend Chart of LLM Call Count	Displays the LLM call count trend across all applications by default. You can switch to LLM call QPS or to Avg LLM call per request, which is the average number of LLM calls per user request
Trend Chart of LLM Call Error Count	Displays the LLM call error count trend across all applications by default. You can switch to the LLM call error rate trend
Trend Chart of LLM Call Latency	Displays the LLM call latency trend across all applications, with Avg, p90, p95, and p99 latency available
Trend Chart of LLM Call First-token Latency	Displays the LLM call time-to-first-token latency trend across all applications, with Avg, p90, p95, and p99 latency available
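
As a rough illustration of what the Avg, p90, p95, and p99 series represent, the sketch below summarizes a list of latency samples using the nearest-rank percentile method. The function names are hypothetical, and the percentile method that Cloud Monitor actually uses is not stated in this document, so treat the choice of nearest-rank as an assumption.

```python
def percentile(values, p):
    """Nearest-rank percentile for an integer p in 1..100.

    Returns the smallest sample such that at least p% of samples
    are less than or equal to it.
    """
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def latency_summary(latencies_ms):
    """Compute the four statistics shown on the latency trend charts."""
    return {
        "avg": sum(latencies_ms) / len(latencies_ms),
        "p90": percentile(latencies_ms, 90),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
    }


# Sample data: 100 calls with latencies of 1 ms through 100 ms.
latencies_ms = list(range(1, 101))
stats = latency_summary(latencies_ms)
```

A trend chart would apply the same summary to each time bucket, for example one summary per minute of samples.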

Token trend data:

Panel	Description
Token Usage Trend	Displays the token usage trend across all applications during the specified time period, with options to view input and output tokens separately
Trend of Average Tokens per LLM Call	Displays the trend of average token usage per LLM call across all applications during the specified time period

Top 5 application rankings:

Panel	Description
Top 5 applications by LLM call count	Ranks the top 5 applications by LLM call count, computed over all applications' LLM calls. You can switch to a trend chart view
Top 5 applications by LLM call errors	Ranks the top 5 applications by LLM call error count, computed over all applications' LLM calls. You can switch to a trend chart view
Top 5 applications by average LLM call latency	Ranks the top 5 applications by average LLM call latency, computed over all applications' LLM calls. You can switch to a trend chart view
Top 5 applications by average LLM call time-to-first-token latency	Ranks the top 5 applications by average LLM call time-to-first-token latency, computed over all applications' LLM calls. You can switch to a trend chart view
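
The rankings above amount to grouping calls by application and sorting by an aggregate. The following sketch shows one way this could be done for call count and average latency. The `(app, latency_ms)` record shape and the function names are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def top_n_by_call_count(calls, n=5):
    """Return the n applications with the most calls as (app, count) pairs."""
    counts = Counter(app for app, _latency_ms in calls)
    return counts.most_common(n)


def top_n_by_avg_latency(calls, n=5):
    """Return the n applications with the highest average latency."""
    totals = defaultdict(lambda: [0.0, 0])  # app -> [latency sum, call count]
    for app, latency_ms in calls:
        totals[app][0] += latency_ms
        totals[app][1] += 1
    averages = {app: total / count for app, (total, count) in totals.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:n]


# Sample data: six applications, each with a fixed latency per call.
calls = []
for app, count, latency_ms in [
    ("app-a", 6, 100.0),
    ("app-b", 5, 300.0),
    ("app-c", 4, 200.0),
    ("app-d", 3, 50.0),
    ("app-e", 2, 400.0),
    ("app-f", 1, 10.0),
]:
    calls.extend((app, latency_ms) for _ in range(count))

by_count = top_n_by_call_count(calls)
by_latency = top_n_by_avg_latency(calls)
```

The error and time-to-first-token rankings would follow the same pattern with a different field as the aggregate.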
