
          LLM Application Access

          LLM (Large Language Model) application performance monitoring tracks core metrics such as inference latency, throughput, and token usage in real time, supports collection of LLM-specific spans, and visually displays end-to-end call-chain details for precise optimization and efficient operations. The following sections describe how to onboard LLM applications.

          Supported LLM components and frameworks

          LLM applications can be onboarded using Traceloop's open-source OpenLLMetry project, which is fully compatible with the OpenTelemetry protocol, enabling trace-data interoperability with other applications that use OpenTelemetry. OpenLLMetry provides automatic instrumentation for many LLM components and frameworks; please visit the Project Home for the full list of supported components and frameworks. A subset is listed below for your reference.

          • Supported LLM frameworks: Ollama, LlamaIndex, LangChain, Haystack, LiteLLM, CrewAI
          • Supported vector databases: Chroma, Pinecone, Qdrant, Weaviate, Milvus, Marqo, LanceDB
          • AI platforms and services supporting vector operations: VertexAI, OpenAI, MistralAI

          Supported language: Python

          Protocol type: OpenTelemetry

          Onboarding process

          Traceloop is a widely used Python SDK that produces comprehensive, standardized telemetry trace data. The following steps describe how to onboard an application with Traceloop.

          Step 1: Retrieve the endpoint and token (this information varies by region and user, and can be obtained on the Onboard Application page in the console; an example is provided below)

          • Endpoint: apm-collector.bj.baidubce.com
          • Authentication: UFSpMM***lnFrVBqtPDK
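
          These two values map directly onto Traceloop's environment variables, which are read when `Traceloop.init()` is called. A minimal stdlib-only sketch of assembling the configuration (no network call is made; the `https://` scheme is an assumption — use the endpoint exactly as shown in the console, and replace both placeholder values with your own):

          ```python
          import os

          # Placeholder values; replace with the endpoint and token from the console.
          endpoint = "https://apm-collector.bj.baidubce.com"
          token = "UFSpMM***lnFrVBqtPDK"

          # Traceloop picks these up when Traceloop.init() runs.
          os.environ["TRACELOOP_BASE_URL"] = endpoint
          os.environ["TRACELOOP_HEADERS"] = f"Authentication={token}"

          print(os.environ["TRACELOOP_HEADERS"])
          ```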

          Step 2: Install Traceloop-SDK

          Install the Traceloop SDK using the pip command. This installation includes dependencies for OpenLLMetry and OpenTelemetry-SDK.

          pip install traceloop-sdk

          Step 3: Initialize configuration

          In the main file of your Python project, import Traceloop and OpenTelemetry-SDK, then initialize them using the endpoint and authentication details obtained in prior steps.

          # -*- coding: utf-8 -*-
          import os
          from traceloop.sdk import Traceloop

          os.environ['TRACELOOP_BASE_URL'] = "<endpoint>"  # Replace with the endpoint retrieved in Step 1
          os.environ['TRACELOOP_HEADERS'] = "Authentication=<authentication>"  # Replace with the authentication token retrieved in Step 1
          Traceloop.init(app_name='<server.name>',  # Set the current application name
                         resource_attributes={
                             "host.name": "<host.name>"  # Set the current host name
                         })

          Corresponding fields are described as follows:

          Parameter | Description | Example
          <server.name> | Application name. If multiple processes use the same application name, they appear as separate instances under the same application in APM. For Spring Cloud or Dubbo applications, the application name is generally the same as the service name. | csm
          <authentication> | Authentication token for the application, obtained in Step 1. | UFSpMM***lnFrVBqtPDK
          <endpoint> | Endpoint obtained in Step 1. | http://10.169.25.203:8810
          <host.name> | Host name of the instance. Serves as the unique identifier for the application instance and is usually set to the IP address of the application instance. | my-instance-host
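
          Since host.name must uniquely identify the instance, one common convention (an assumption, not a requirement of the SDK) is to derive it from the machine's hostname, falling back to a fixed placeholder when it cannot be determined:

          ```python
          import socket

          def resolve_host_name(default="my-instance-host"):
              # Use the machine's hostname as the instance identifier; fall back
              # to a placeholder if it cannot be determined.
              try:
                  return socket.gethostname() or default
              except OSError:
                  return default

          print(resolve_host_name())
          ```

          The returned value can be passed as the "host.name" entry in resource_attributes when calling Traceloop.init().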

          Step 4: Onboarding verification

          After the Python application starts and an LLM call occurs, the onboarded application appears under LLM Application Performance Monitor → Application List. Due to processing latency, if the application does not appear in the console immediately after onboarding, wait approximately 30 seconds and check again.

          Practical onboarding examples

          Example 1

          Call Qianfan APIs via the OpenAI SDK and report telemetry data in the process.

          import os
          from traceloop.sdk import Traceloop
          from openai import OpenAI

          ai_model = 'ernie-4.0-turbo-128k'
          api_key = "bce-v3/ALTAK-mTNsaNCr4********"  # Replace with Qianfan's API key
          base_url = "https://qianfan.baidubce.com/v2"  # Qianfan's endpoint

          os.environ['TRACELOOP_BASE_URL'] = "<endpoint>"  # Replace with the endpoint retrieved in Step 1
          os.environ['TRACELOOP_HEADERS'] = "Authentication=<authentication>"  # Replace with the authentication token retrieved in Step 1
          Traceloop.init(app_name='<server.name>',  # Set the current application name
                         resource_attributes={
                             "host.name": "<host.name>"  # Set the current host name
                         })

          if __name__ == '__main__':
              llm = OpenAI(
                  api_key=api_key,
                  base_url=base_url
              )
              response = llm.chat.completions.create(
                  model=ai_model,
                  messages=[
                      {
                          "role": "user",
                          "content": "hello"
                      },
                  ],
              )
              print(response.choices[0].message.content)
          Example 2

          Report telemetry for LLM calls made within a LangChain (LangGraph) workflow.

          import os
          from typing import Any, Dict, TypedDict

          from traceloop.sdk import Traceloop
          from langgraph.graph import StateGraph
          from openai import OpenAI
          import langchain

          langchain.debug = True
          ai_model = 'ernie-4.0-turbo-128k'
          api_key = "bce-v3/ALTAK-mTNsaNCr4********"  # Replace with Qianfan's API key
          base_url = "https://qianfan.baidubce.com/v2"  # Qianfan's endpoint

          os.environ['TRACELOOP_BASE_URL'] = "<endpoint>"  # Replace with the endpoint retrieved in Step 1
          os.environ['TRACELOOP_HEADERS'] = "Authentication=<authentication>"  # Replace with the authentication token retrieved in Step 1
          Traceloop.init(app_name='<server.name>',  # Set the current application name
                         resource_attributes={
                             "host.name": "<host.name>"  # Set the current host name
                         })

          class GraphState(TypedDict):
              keys: Dict[str, Any]

          def query(state):
              print("---START---")
              state_dict = state["keys"]
              question = state_dict["question"]
              llm = OpenAI(
                  api_key=api_key,
                  base_url=base_url
              )
              response = llm.chat.completions.create(
                  model=ai_model,
                  messages=[
                      {
                          "role": "user",
                          "content": question,
                      },
                  ],
              )
              return {"keys": {"content": response, "question": question}}

          def end(state):
              print("---END---")
              return {
                  "keys": {
                      "content": state["keys"]["content"],
                      "question": state["keys"]["question"],
                  }
              }

          if __name__ == '__main__':
              workflow = StateGraph(GraphState)
              workflow.add_node("query", query)
              workflow.add_node("end", end)
              workflow.add_edge("query", "end")
              workflow.set_entry_point("query")
              workflow.set_finish_point("end")  # Mark "end" as the terminal node
              app = workflow.compile()
              app.invoke(input={"keys": {"question": "what is the weather today?"}}, debug=True)

          Example 3: Customized instrumentation enhancement

          If you need to track application scenarios beyond LLM-related frameworks and libraries, you can use the OpenTelemetry API to add custom instrumentation as shown below. This document only demonstrates the most basic custom instrumentation method; the OpenTelemetry community offers more flexible approaches. For details, refer to the Python custom instrumentation documentation provided by OpenTelemetry (https://opentelemetry.io/docs/languages/python/instrumentation/).

          import os
          from traceloop.sdk import Traceloop
          from openai import OpenAI
          from opentelemetry import trace  # Import the OpenTelemetry tracing API

          ai_model = 'ernie-4.0-turbo-128k'
          api_key = "bce-v3/ALTAK-mTNsaNCr4********"  # Replace with Qianfan's API key
          base_url = "https://qianfan.baidubce.com/v2"  # Qianfan's endpoint

          os.environ['TRACELOOP_BASE_URL'] = "<endpoint>"  # Replace with the endpoint retrieved in Step 1
          os.environ['TRACELOOP_HEADERS'] = "Authentication=<authentication>"  # Replace with the authentication token retrieved in Step 1
          Traceloop.init(app_name='<server.name>',  # Set the current application name
                         resource_attributes={
                             "host.name": "<host.name>"  # Set the current host name
                         })

          tracer = trace.get_tracer(__name__)  # Initialize a tracer

          def chat():
              llm = OpenAI(
                  api_key=api_key,
                  base_url=base_url
              )
              response = llm.chat.completions.create(
                  model=ai_model,
                  messages=[
                      {
                          "role": "user",
                          "content": "hello"
                      },
                  ],
              )
              return response

          if __name__ == '__main__':
              with tracer.start_as_current_span("entry func") as span:  # Create a custom span
                  span.set_attribute("user.id", "<userId>")  # Set a span attribute: userId
                  span.set_attribute("session.id", "<sessionId>")  # Set a span attribute: sessionId
                  chat()