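Qianfan prompt_tokens Usage and Billing Logic
Last updated: 2024-05-09
This article describes the token calculation rules and uses ERNIE-Bot as an example to walk through the calculation.
Note: the token calculation rules will be refined further over time, so please watch for updates.
Terminology
Character
- 1 letter = 1 character. For example, hello = 5 characters.
- 1 Chinese character = 1 character. For example, 你好 = 2 characters.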
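The character definition above lines up with how Python's `len()` counts Unicode code points, so it can be checked with a short sketch. The helper name `count_chars` is hypothetical and is not part of the Qianfan API; how the service counts spaces or punctuation is not stated here, so treat this only as an illustration of the two examples given.

```python
# Count characters as described above: one letter or one
# Chinese character counts as one character.
def count_chars(text: str) -> int:
    # len() on a Python str counts Unicode code points.
    return len(text)


print(count_chars("hello"))  # 5
print(count_chars("你好"))    # 2
```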
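Token
In large language models, a token is a symbol the model uses to represent a Chinese character, an English word, or a Chinese or English phrase. A token can be a single character or a sequence of several characters. You can use the token calculator to compute online how many tokens a piece of text converts to for some ERNIE models.
Input length limit logic
- Character length limit: the first check uses input token limit * 4. For example, the ernie-3.5-8k API has an input token limit of 5k, so its character limit is 20000; exceeding it returns error code 336007-the max length of current question is 20000
- Token length limit: the second check uses the token length; exceeding it returns error code 336103-Prompt tokens too long
- In multi-turn conversations the input token length keeps growing and may exceed the limit. In that case you can use the automatic history-forgetting feature provided by the Qianfan SDK to keep the input from growing too long.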
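The two checks described above can be sketched as a client-side pre-check. This is a minimal illustration under stated assumptions, not the service's implementation: the function name `check_input_length` and the `count_tokens` callable are hypothetical, and the real token count comes from the service or the token calculator. Only the multiplier of 4 and the two error codes are taken from the text.

```python
from typing import Callable, Optional

# Multiplier stated above: character limit = input token limit * 4.
CHARS_PER_TOKEN = 4


def check_input_length(
    text: str,
    max_input_tokens: int,
    count_tokens: Callable[[str], int],  # hypothetical tokenizer stand-in
) -> Optional[int]:
    """Return the error code the input would be expected to trigger, or None."""
    # First check: character length against token limit * 4.
    if len(text) > max_input_tokens * CHARS_PER_TOKEN:
        return 336007
    # Second check: token length against the token limit.
    if count_tokens(text) > max_input_tokens:
        return 336103
    return None


# Stand-in tokenizer for demonstration only: one token per character.
def fake_count(text: str) -> int:
    return len(text)


print(check_input_length("a" * 20001, 5000, fake_count))  # 336007
print(check_input_length("a" * 6000, 5000, fake_count))   # 336103
print(check_input_length("hello", 5000, fake_count))      # None
```

Running the cheap character check first avoids tokenizing inputs that are certain to be rejected.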
prompt_tokens calculation rules
The following information counts toward prompt_tokens:
- content in messages
- system
- functions
The following APIs support system and functions:
Model/Version | system | functions |
---|---|---|
ERNIE 4.0 | ✓ | |
ERNIE-Bot-8K | ✓ | ✓ |
ERNIE-3.5-8K | ✓ | ✓ |
ERNIE-Lite-8K-0922 | ✓ | |
ERNIE Speed-AppBuilder | ✓ | |
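Example
Expected result
A user calls the ERNIE-Bot API with the request parameters message.content, functions, and system. The content of each parameter is listed below.
After the call, the returned prompt_tokens field is 564, meaning the prompt consumed 564 tokens. This figure covers the content of the request parameters message.content, functions, and system.
Request parameter contents
(1) content field in messages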
"messages": [{
"role": "user",
"content": "当前日期2023-09-30, 查询李四最近五个月旅游过的国内城市"
}]
(2) system field
"system": "你是一个AI智能助手",
(3) functions field
"functions": [
{
"name": "get_domestic_tourist_place",
"description": "当前日期为 2023-09-30,查询用户近期旅游过的三个国内城市",
"parameters": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "用户姓名"
},
"start": {
"type": "string",
"description": "查询的起始时间,默认最近一年"
},
"end": {
"type": "string",
"description": "查询的结束时间,默认当前日期"
}
},
"required": ["name"]
},
"responses": {
"type": "object",
"properties": {
"result": {
"type": "string",
"description": "目的地列表"
}
}
},
"examples": [
{
"role": "user",
"content": "查询张三最近三个月旅游过的国内城市"
},
{
"role": "assistant",
"content": "",
"function_call": {
"name": "get_domestic_tourist_place",
"arguments": "{\"name\": \"张三\", \"start\": \"2023-05-01-01\", \"end\": \"2023-08-30-30\"}"
}
},
{
"role": "function",
"name": "get_domestic_tourist_place",
"content": "{\"result\": \"苏州;上海;北京\"}"
},
{
"role": "assistant",
"content": "张三最近三个月旅游过的国内城市为北京,广州,呼和浩特。"
}
]
},
{
"name": "get_foreign_tourist_place",
"description": "当前日期为 2023-10,查询用户近期旅游过的三个国外城市",
"parameters": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "用户姓名"
},
"start": {
"type": "string",
"description": "查询的起始时间,默认最近一年"
},
"end": {
"type": "string",
"description": "查询的结束时间,默认当前日期"
}
},
"required": ["name"]
},
"responses": {
"type": "object",
"properties": {
"result": {
"type": "string",
"description": "目的地列表"
}
}
},
"examples": [
{
"role": "user",
"content": "查询张三最近三个月旅游过的国外城市"
},
{
"role": "assistant",
"content": "",
"function_call": {
"name": "get_foreign_tourist_place",
"arguments": "{\"name\": \"张三\", \"start\": \"2023-05-01\", \"end\": \"2023-08-30\"}"
}
},
{
"role": "function",
"name": "get_foreign_tourist_place",
"content": "{\"result\": \"仰光;巴瓦提\"}"
},
{
"role": "assistant",
"content": "张三最近三个月旅游过的国外城市为仰光,巴瓦提。"
}
]
}
],
Call the ERNIE-Bot API and check prompt_tokens
Step 1: Call the ERNIE-Bot model with the following request.
curl --location 'https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/completions?access_token={access_token}' \
--header 'Content-Type: application/json' \
--data '{
"messages":[
{
"role":"user",
"content":"当前日期2023-09-30, 查询李四最近五个月旅游过的国内城市"
}
],
"functions":[
{
"name":"get_domestic_tourist_place",
"description":"当前日期为 2023-09-30,查询用户近期旅游过的三个国内城市",
"parameters":{
"type":"object",
"properties":{
"name":{
"type":"string",
"description":"用户姓名"
},
"start":{
"type":"string",
"description":"查询的起始时间,默认最近一年"
},
"end":{
"type":"string",
"description":"查询的结束时间,默认当前日期"
}
},
"required":[
"name"
]
},
"responses":{
"type":"object",
"properties":{
"result":{
"type":"string",
"description":"目的地列表"
}
}
},
"examples":[
{
"role":"user",
"content":"查询张三最近三个月旅游过的国内城市"
},
{
"role":"assistant",
"content":"",
"function_call":{
"name":"get_domestic_tourist_place",
"arguments":"{\"name\": \"张三\", \"start\": \"2023-05-01-01\", \"end\": \"2023-08-30-30\"}"
}
},
{
"role":"function",
"name":"get_domestic_tourist_place",
"content":"{\"result\": \"苏州;上海;北京\"}"
},
{
"role":"assistant",
"content":"张三最近三个月旅游过的国内城市为北京,广州,呼和浩特。"
}
]
},
{
"name":"get_foreign_tourist_place",
"description":"当前日期为 2023-10,查询用户近期旅游过的三个国外城市",
"parameters":{
"type":"object",
"properties":{
"name":{
"type":"string",
"description":"用户姓名"
},
"start":{
"type":"string",
"description":"查询的起始时间,默认最近一年"
},
"end":{
"type":"string",
"description":"查询的结束时间,默认当前日期"
}
},
"required":[
"name"
]
},
"responses":{
"type":"object",
"properties":{
"result":{
"type":"string",
"description":"目的地列表"
}
}
},
"examples":[
{
"role":"user",
"content":"查询张三最近三个月旅游过的国外城市"
},
{
"role":"assistant",
"content":"",
"function_call":{
"name":"get_foreign_tourist_place",
"arguments":"{\"name\": \"张三\", \"start\": \"2023-05-01\", \"end\": \"2023-08-30\"}"
}
},
{
"role":"function",
"name":"get_foreign_tourist_place",
"content":"{\"result\": \"仰光;巴瓦提\"}"
},
{
"role":"assistant",
"content":"张三最近三个月旅游过的国外城市为仰光,巴瓦提。"
}
]
}
],
"system":"你是一个AI智能助手"
}'
Step 2: After the call succeeds, check usage in the response; the prompt_tokens value is 564.
{
"id": "as-2kx0t93hip",
"object": "chat.completion",
"created": 1703060563,
"result": "",
"is_truncated": false,
"need_clear_history": false,
"function_call": {
"name": "get_domestic_tourist_place",
"thoughts": "根据用户的请求,我需要调用“get_domestic_tourist_place”工具来查询李四最近五个月旅游过的国内城市。我将设置参数“name”为“李四”,开始和结束日期分别为“2023-04-01”和“2023-09-30”。",
"arguments": "{\"name\":\"李四\",\"start\":\"2023-04-01\",\"end\":\"2023-09-30\"}"
},
"finish_reason": "function_call",
"usage": {
"prompt_tokens": 564,
"completion_tokens": 112,
"total_tokens": 676
}
}
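As a quick sanity check, the usage block of the response can be read programmatically. The sketch below is only an illustration: the helper name `read_usage` is hypothetical, and the response body is trimmed to the usage field from the sample above. In this sample, prompt_tokens plus completion_tokens equals total_tokens.

```python
import json

# Trimmed copy of the sample response above, keeping only the usage field.
response_text = """
{
    "usage": {
        "prompt_tokens": 564,
        "completion_tokens": 112,
        "total_tokens": 676
    }
}
"""


def read_usage(body: str) -> dict:
    """Parse a response body and return its usage dictionary."""
    return json.loads(body)["usage"]


usage = read_usage(response_text)
print(usage["prompt_tokens"])  # 564
# In the sample, the two parts add up to the reported total.
print(usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"])  # True
```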
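How to calculate prompt_tokens
When using the large model service, you can calculate the prompt_tokens value as follows.
The calculation is illustrated with the message.content, functions, and system content from the example.
Calculation process overview
The steps for calculating prompt_tokens are:
Step 1: Run the calculation script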
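import json


# Build the text that is passed to the model for token counting:
# every message content, then system, then functions as compact JSON.
def get_prompt(request):
    prompt = ""
    for message in request["messages"]:
        prompt = prompt + message["content"]
    if "system" in request:
        prompt = prompt + request["system"]
    if "functions" in request:
        # Compact separators and ensure_ascii=False keep the string free of
        # extra whitespace and keep Chinese text unescaped.
        prompt = prompt + json.dumps(request["functions"], separators=(',', ':'), ensure_ascii=False)
    print(prompt)
    return prompt


# Full request body for the API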
call_function = {
"messages": [{
"role": "user",
"content": "当前日期2023-09-30, 查询李四最近五个月旅游过的国内城市"
}],
"functions": [
{
"name": "get_domestic_tourist_place",
"description": "当前日期为 2023-09-30,查询用户近期旅游过的三个国内城市",
"parameters": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "用户姓名"
},
"start": {
"type": "string",
"description": "查询的起始时间,默认最近一年"
},
"end": {
"type": "string",
"description": "查询的结束时间,默认当前日期"
}
},
"required": ["name"]
},
"responses": {
"type": "object",
"properties": {
"result": {
"type": "string",
"description": "目的地列表"
}
}
},
"examples": [
{
"role": "user",
"content": "查询张三最近三个月旅游过的国内城市"
},
{
"role": "assistant",
"content": "",
"function_call": {
"name": "get_domestic_tourist_place",
"arguments": "{\"name\": \"张三\", \"start\": \"2023-05-01-01\", \"end\": \"2023-08-30-30\"}"
}
},
{
"role": "function",
"name": "get_domestic_tourist_place",
"content": "{\"result\": \"苏州;上海;北京\"}"
},
{
"role": "assistant",
"content": "张三最近三个月旅游过的国内城市为北京,广州,呼和浩特。"
}
]
},
{
"name": "get_foreign_tourist_place",
"description": "当前日期为 2023-10,查询用户近期旅游过的三个国外城市",
"parameters": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "用户姓名"
},
"start": {
"type": "string",
"description": "查询的起始时间,默认最近一年"
},
"end": {
"type": "string",
"description": "查询的结束时间,默认当前日期"
}
},
"required": ["name"]
},
"responses": {
"type": "object",
"properties": {
"result": {
"type": "string",
"description": "目的地列表"
}
}
},
"examples": [
{
"role": "user",
"content": "查询张三最近三个月旅游过的国外城市"
},
{
"role": "assistant",
"content": "",
"function_call": {
"name": "get_foreign_tourist_place",
"arguments": "{\"name\": \"张三\", \"start\": \"2023-05-01\", \"end\": \"2023-08-30\"}"
}
},
{
"role": "function",
"name": "get_foreign_tourist_place",
"content": "{\"result\": \"仰光;巴瓦提\"}"
},
{
"role": "assistant",
"content": "张三最近三个月旅游过的国外城市为仰光,巴瓦提。"
}
]
}
],
"system": "你是一个AI智能助手"
}
call_function_payload = json.dumps(call_function, indent=4, ensure_ascii=False)
get_prompt(call_function)
Step 2: View the output
当前日期2023-09-30, 查询李四最近五个月旅游过的国内城市你是一个AI智能助手[{"name":"get_domestic_tourist_place","description":"当前日期为 2023-09-30,查询用户近期旅游过的三个国内城市","parameters":{"type":"object","properties":{"name":{"type":"string","description":"用户姓名"},"start":{"type":"string","description":"查询的起始时间,默认最近一年"},"end":{"type":"string","description":"查询的结束时间,默认当前日期"}},"required":["name"]},"responses":{"type":"object","properties":{"result":{"type":"string","description":"目的地列表"}}},"examples":[{"role":"user","content":"查询张三最近三个月旅游过的国内城市"},{"role":"assistant","content":"","function_call":{"name":"get_domestic_tourist_place","arguments":"{\"name\": \"张三\", \"start\": \"2023-05-01-01\", \"end\": \"2023-08-30-30\"}"}},{"role":"function","name":"get_domestic_tourist_place","content":"{\"result\": \"苏州;上海;北京\"}"},{"role":"assistant","content":"张三最近三个月旅游过的国内城市为北京,广州,呼和浩特。"}]},{"name":"get_foreign_tourist_place","description":"当前日期为 2023-10,查询用户近期旅游过的三个国外城市","parameters":{"type":"object","properties":{"name":{"type":"string","description":"用户姓名"},"start":{"type":"string","description":"查询的起始时间,默认最近一年"},"end":{"type":"string","description":"查询的结束时间,默认当前日期"}},"required":["name"]},"responses":{"type":"object","properties":{"result":{"type":"string","description":"目的地列表"}}},"examples":[{"role":"user","content":"查询张三最近三个月旅游过的国外城市"},{"role":"assistant","content":"","function_call":{"name":"get_foreign_tourist_place","arguments":"{\"name\": \"张三\", \"start\": \"2023-05-01\", \"end\": \"2023-08-30\"}"}},{"role":"function","name":"get_foreign_tourist_place","content":"{\"result\": \"仰光;巴瓦提\"}"},{"role":"assistant","content":"张三最近三个月旅游过的国外城市为仰光,巴瓦提。"}]}]
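Step 3: Use the token calculator to compute the prompt_tokens length
Open the token calculator, paste the output from Step 2 into its text input box, and start the calculation. The calculator reports 564 tokens, the same value the API returned in prompt_tokens.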