LLM Application Onboarding
The LLM (Large Language Model) application performance monitor tracks core metrics such as inference latency, throughput, and token usage in real time, supports collection of LLM-specific spans, and visualizes end-to-end call-chain details for precise optimization and efficient operations. The following sections describe how to onboard LLM applications.
Supported LLM components and frameworks
LLM applications can be onboarded using Traceloop's open-source OpenLLMetry project, which is fully compatible with the OpenTelemetry protocol and therefore interoperates at the trace-data level with other applications that use OpenTelemetry. OpenLLMetry supports automatic instrumentation for many LLM components and frameworks; visit the Project Home for the full list. A subset is listed below for reference.
| Category | Supported components and frameworks |
|---|---|
| LLM frameworks | Ollama, LlamaIndex, LangChain, Haystack, LiteLLM, CrewAI |
| Vector databases | Chroma, Pinecone, Qdrant, Weaviate, Milvus, Marqo, LanceDB |
| AI platforms and services supporting vector operations | VertexAI, OpenAI, MistralAI |
Supported language: Python
Protocol type: OpenTelemetry
Onboarding process
Traceloop is a widely used Python SDK that produces comprehensive, standards-compliant telemetry trace data. The following steps describe how to onboard an application with Traceloop.
Step 1: Retrieve the endpoint and token (this information varies by region and user and can be obtained on the Onboard Application page in the console; an example is shown below)
- Endpoint: apm-collector.bj.baidubce.com
- Authentication: UFSpMM***lnFrVBqtPDK
Step 2: Install Traceloop-SDK
Install the Traceloop SDK using the pip command. This installation includes the dependencies for OpenLLMetry and the OpenTelemetry SDK.
```
pip install traceloop-sdk
```
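To confirm the installation, you can read the installed package version with Python's standard library (a minimal check; importlib.metadata is available from Python 3.8 onward):

```python
# Confirm traceloop-sdk is installed and print its version
from importlib.metadata import version

print(version("traceloop-sdk"))
```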
Step 3: Initialize configuration
In the main file of your Python project, import the Traceloop SDK and initialize it with the endpoint and authentication details obtained in the previous steps.
```python
# -*- coding: utf-8 -*-
import os

from traceloop.sdk import Traceloop

os.environ['TRACELOOP_BASE_URL'] = "<endpoint>"  # Replace with the endpoint obtained in Step 1
os.environ['TRACELOOP_HEADERS'] = "Authentication=<authentication>"  # Replace with the token obtained in Step 1

Traceloop.init(app_name='<server.name>',  # Set the current application name
               resource_attributes={
                   "host.name": "<host.name>"  # Set the current host name
               })
```
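If you prefer not to set environment variables, Traceloop.init also accepts the endpoint and headers directly as arguments. A minimal sketch, assuming a recent traceloop-sdk; confirm the parameter names against the version you installed:

```python
from traceloop.sdk import Traceloop

# Equivalent to setting TRACELOOP_BASE_URL and TRACELOOP_HEADERS above
Traceloop.init(app_name='<server.name>',
               api_endpoint="<endpoint>",                       # endpoint from Step 1
               headers={"Authentication": "<authentication>"},  # token from Step 1
               resource_attributes={"host.name": "<host.name>"})
```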
Corresponding fields are described as follows:
| Parameter | Description | Example |
|---|---|---|
| <server.name> | Application name. If multiple processes use the same application name, they appear as separate instances under the same application in APM. For Spring Cloud or Dubbo applications, the application name is generally the same as the service name. | csm |
| <authentication> | Application system authentication token obtained in Step 1 | UFSpMM***lnFrVBqtPDK |
| <endpoint> | Endpoint obtained in Step 1 | http://10.169.25.203:8810 |
| <host.name> | Hostname of the instance. Serves as the unique identifier for the application instance and is usually set to the instance's IP address. | my-instance-host |
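Putting the sample values from the table together, a filled-in initialization looks like this (all values are illustrative; substitute your own endpoint, token, application name, and hostname):

```python
import os

from traceloop.sdk import Traceloop

os.environ['TRACELOOP_BASE_URL'] = "http://10.169.25.203:8810"
os.environ['TRACELOOP_HEADERS'] = "Authentication=UFSpMM***lnFrVBqtPDK"

Traceloop.init(app_name='csm',
               resource_attributes={"host.name": "my-instance-host"})
```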
Step 4: Onboarding verification
After the Python application starts and makes an LLM call, the onboarded application appears under LLM Application Performance Monitor → Application List. Because of processing latency, if the application does not appear in the console immediately after onboarding, wait approximately 30 seconds and query again.
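Note that spans are batched before being exported, so a script that exits immediately after a single LLM call may terminate before its spans are sent. For quick verification in short-lived scripts, Traceloop's documented disable_batch option exports each span as soon as it ends (a sketch; confirm the option in your installed SDK version):

```python
from traceloop.sdk import Traceloop

# disable_batch=True sends each span immediately instead of batching,
# which is slower but ensures short-lived scripts report before exiting
Traceloop.init(app_name='<server.name>', disable_batch=True)
```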
Practical onboarding examples
Example 1
Call Qianfan APIs through the OpenAI SDK and report telemetry data during the process.
```python
import os

from openai import OpenAI
from traceloop.sdk import Traceloop

ai_model = 'ernie-4.0-turbo-128k'
api_key = "bce-v3/ALTAK-mTNsaNCr4********"  # Replace with Qianfan's API key
base_url = "https://qianfan.baidubce.com/v2"  # Qianfan's endpoint

os.environ['TRACELOOP_BASE_URL'] = "<endpoint>"  # Replace with the endpoint obtained in Step 1
os.environ['TRACELOOP_HEADERS'] = "Authentication=<authentication>"  # Replace with the token obtained in Step 1

Traceloop.init(app_name='<server.name>',  # Set the current application name
               resource_attributes={
                   "host.name": "<host.name>"  # Set the current host name
               })

if __name__ == '__main__':
    llm = OpenAI(
        api_key=api_key,
        base_url=base_url
    )
    response = llm.chat.completions.create(
        model=ai_model,
        messages=[
            {
                "role": "user",
                "content": "hello"
            },
        ],
    )
    print(response.choices[0].message.content)
```
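In addition to the automatic OpenAI instrumentation, the Traceloop SDK provides workflow and task decorators that group related calls under a named parent span, which makes multi-step LLM pipelines easier to read in the trace view. A minimal sketch (the function names are illustrative; the body of ask_model would hold the chat-completion call from Example 1):

```python
from traceloop.sdk.decorators import task, workflow

@task(name="ask_model")
def ask_model(question: str):
    # place the llm.chat.completions.create(...) call from Example 1 here
    return question

@workflow(name="qa_workflow")
def qa_workflow():
    # ask_model's span nests under the "qa_workflow" span in the trace view
    return ask_model("hello")

if __name__ == '__main__':
    qa_workflow()
```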
Example 2
Report telemetry for LLM calls made through the LangChain framework; the sample below uses LangGraph to orchestrate the call.
```python
import os
from typing import Any, Dict, TypedDict

import langchain
from langgraph.graph import END, StateGraph
from openai import OpenAI
from traceloop.sdk import Traceloop

langchain.debug = True

ai_model = 'ernie-4.0-turbo-128k'
api_key = "bce-v3/ALTAK-mTNsaNCr4********"  # Replace with Qianfan's API key
base_url = "https://qianfan.baidubce.com/v2"  # Qianfan's endpoint

os.environ['TRACELOOP_BASE_URL'] = "<endpoint>"  # Replace with the endpoint obtained in Step 1
os.environ['TRACELOOP_HEADERS'] = "Authentication=<authentication>"  # Replace with the token obtained in Step 1

Traceloop.init(app_name='<server.name>',  # Set the current application name
               resource_attributes={
                   "host.name": "<host.name>"  # Set the current host name
               })


class GraphState(TypedDict):
    keys: Dict[str, Any]


def query(state):
    print("---START---")
    state_dict = state["keys"]
    question = state_dict["question"]
    llm = OpenAI(
        api_key=api_key,
        base_url=base_url
    )
    response = llm.chat.completions.create(
        model=ai_model,
        messages=[
            {
                "role": "user",
                "content": question,
            },
        ],
    )
    return {"keys": {"content": response, "question": question}}


def end(state):
    print("---END---")
    return {
        "keys": {
            "content": state["keys"]["content"],
            "question": state["keys"]["question"],
        }
    }


if __name__ == '__main__':
    workflow = StateGraph(GraphState)
    workflow.add_node("query", query)
    workflow.add_node("end", end)
    workflow.add_edge("query", "end")
    workflow.add_edge("end", END)  # route the final node to the graph's end
    workflow.set_entry_point("query")
    app = workflow.compile()
    app.invoke(input={"keys": {"question": "what is the weather today?"}}, debug=True)
```
Example 3: Customized instrumentation enhancement
If you need to trace application scenarios beyond the LLM-related frameworks and libraries, you can use the OpenTelemetry API to add custom instrumentation as shown below. This document demonstrates only the most basic custom instrumentation; the OpenTelemetry community offers more flexible approaches. For details, see the [Python custom instrumentation documentation](https://opentelemetry.io/docs/languages/python/instrumentation/) provided by OpenTelemetry.
```python
import os

from openai import OpenAI
from opentelemetry import trace  # OpenTelemetry tracing API
from traceloop.sdk import Traceloop

ai_model = 'ernie-4.0-turbo-128k'
api_key = "bce-v3/ALTAK-mTNsaNCr4********"  # Replace with Qianfan's API key
base_url = "https://qianfan.baidubce.com/v2"  # Qianfan's endpoint

os.environ['TRACELOOP_BASE_URL'] = "<endpoint>"  # Replace with the endpoint obtained in Step 1
os.environ['TRACELOOP_HEADERS'] = "Authentication=<authentication>"  # Replace with the token obtained in Step 1

Traceloop.init(app_name='<server.name>',  # Set the current application name
               resource_attributes={
                   "host.name": "<host.name>"  # Set the current host name
               })

tracer = trace.get_tracer(__name__)  # initialize the tracer


def chat():
    llm = OpenAI(
        api_key=api_key,
        base_url=base_url
    )
    response = llm.chat.completions.create(
        model=ai_model,
        messages=[
            {
                "role": "user",
                "content": "hello"
            },
        ],
    )
    return response


if __name__ == '__main__':
    with tracer.start_as_current_span("entry func") as span:  # create a custom span
        span.set_attribute("user.id", "<userId>")        # set span attribute: userId
        span.set_attribute("session.id", "<sessionId>")  # set span attribute: sessionId
        chat()
```
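Custom spans can also carry events and error status, which helps when diagnosing failed LLM calls. A minimal sketch using standard OpenTelemetry API calls around the chat() function defined above (the span and event names are illustrative):

```python
from opentelemetry import trace
from opentelemetry.trace import Status, StatusCode

tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span("chat with error handling") as span:
    try:
        span.add_event("request.sent")  # timestamped event attached to the span
        chat()
    except Exception as exc:
        span.record_exception(exc)                 # records the stack trace on the span
        span.set_status(Status(StatusCode.ERROR))  # marks the span as failed
        raise
```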