Model List
Last updated: 2025-09-09
Recommended Models
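Flagship model | ERNIE-X1.1-Preview | ERNIE-4.5-Turbo-128K | ERNIE-4.5-Turbo-VL-32K | DeepSeek-R1 |
---|---|---|---|---|
Use cases | Positioning: markedly better results on question answering, tool calling, agents, instruction following, logical reasoning, math, and code tasks, with markedly improved factuality. Context length is extended to 64K tokens, supporting longer inputs and conversation histories and improving the coherence of long-chain reasoning while maintaining response speed. | Positioning: better handles multi-turn conversations with long histories and question answering over long documents. Suitable scenarios: 1) Complex semantic understanding: supports Chinese knowledge Q&A and literary writing, and is especially strong at document understanding (e.g. DocVQA tasks). 2) Mathematical reasoning: performs strongly on Chinese math problems (the CMath benchmark). | Positioning: a multimodal foundation model that supports cross-modal text and image input and generation. Suitable scenarios: generating marketing copy, designing video scripts, and similar tasks that combine images and text. | Positioning: a reasoning model optimized for specialized work, focused on math and logic tasks. Suitable scenarios: Complex math problems: e.g. solving advanced mathematics problems and scientific computing simulations. Logical decomposition and planning: business process automation and hypothesis verification in academic research. STEM applications: physical modeling, quantitative financial analysis, and other scenarios that require high-precision reasoning. |
Context length (tokens) | 64k | 128k | 32k | 96k |
Max output length (tokens) | 64k | 16k | 12k | 16k, default 4k |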
Text Generation
ERNIE series: flagship models
Model name | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Default rate limit |
---|---|---|---|---|---|
ERNIE 4.5 Turbo | ernie-4.5-turbo-128k | 128k | 123k | [2,12288] | RPM = 5000 TPM = 400000 |
ERNIE 4.5 Turbo | ernie-4.5-turbo-128k-preview | 128k | 123k | [2,12288] | RPM = 60 TPM = 150000 |
ERNIE 4.5 Turbo | ernie-4.5-turbo-32k | 32k | 27k | [2,12288] | RPM = 5000 TPM = 400000 |
ERNIE 4.5 Turbo | ernie-4.5-turbo-latest | 128k | 123k | [2,12288] | RPM = 60 TPM = 150000 |
ERNIE 4.5 Turbo VL | ernie-4.5-turbo-vl-preview | 128k | 123k | [2,16384] | RPM = 60 TPM = 150000 |
ERNIE 4.5 Turbo VL | ernie-4.5-turbo-vl | 128k | 123k | [2,16384] | RPM = 1000 TPM = 200000 |
ERNIE 4.5 Turbo VL | ernie-4.5-turbo-vl-32k | 32k | 27k | [2,12288] | RPM = 1000 TPM = 200000 |
ERNIE 4.5 Turbo VL | ernie-4.5-turbo-vl-32k-preview | 32k | 27k | [2,16384] | RPM = 1000 TPM = 200000 |
ERNIE 4.5 Turbo VL | ernie-4.5-turbo-vl-latest | 128k | 123k | [2,16384] | RPM = 60 TPM = 150000 |
ERNIE 4.5 | ernie-4.5-8k-preview | 8k | 5k | [2,2048] | RPM = 100 TPM = 100000 |
ERNIE series: mainstream models
Model name | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Default rate limit |
---|---|---|---|---|---|
ERNIE Speed | ernie-speed-128k | 128k | 124k | [2,4096] | RPM = 500 TPM = 200000 |
ERNIE Speed | ernie-speed-8k | 8k | 6k | [2,2048] | RPM = 500 TPM = 200000 |
ERNIE Speed | ernie-speed-pro-128k | 128k | 124k | [2,4096] | RPM = 10000 TPM = 800000 |
ERNIE Lite | ernie-lite-8k | 8k | 6k | [2,2048] | RPM = 500 TPM = 200000 |
ERNIE Lite | ernie-lite-pro-128k | 128k | 124k | [2,4096] | RPM = 10000 TPM = 800000 |
ERNIE series: lightweight models
Model name | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Default rate limit |
---|---|---|---|---|---|
ERNIE Tiny | ernie-tiny-8k | 8k | 6k | [2,2048] | RPM = 10000 TPM = 800000 |
ERNIE series: vertical-scenario models
Model name | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Default rate limit |
---|---|---|---|---|---|
ERNIE Character | ernie-char-8k | 8k | 7k | [2,2048] | RPM = 60 TPM = 60000 |
ERNIE Character | ernie-char-8k-1010 | 8k | 6k | [2,2048] | RPM = 60 TPM = 60000 |
ERNIE Character | ernie-char-fiction-8k | 8k | 8k | [2,2048] | RPM = 300 TPM = 300000 |
ERNIE Character | ernie-char-fiction-8k-preview | 8k | 8k | [2,2048] | RPM = 60 TPM = 6000 |
ERNIE Novel | ernie-novel-8k | 8k | 5k | [2,2048] | RPM = 60 TPM = 60000 |
ERNIE series: open-source models
Model name | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Default rate limit |
---|---|---|---|---|---|
ERNIE 4.5 | ernie-4.5-0.3b | 128k | 120k | [1,8192], default 4k | RPM = 120 TPM = 150000 |
ERNIE 4.5 | ernie-4.5-21b-a3b | 128k | 120k | [1,8192], default 4k | RPM = 120 TPM = 150000 |
ERNIE 4.5 | ernie-4.5-vl-28b-a3b | 32k | 30k | [1,8192], default 4k | RPM = 120 TPM = 150000 |
QianFan series
Model name | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Default rate limit |
---|---|---|---|---|---|
Qianfan-8B | qianfan-8b | 32k | 32k | [2,16384], default 4k | RPM = 60 TPM = 60000 |
Qianfan-70B | qianfan-70b | 32k | 32k | [2,16384], default 4k | RPM = 60 TPM = 60000 |
Qianfan Agent | qianfan-agent-intent-32k | 32k | 28k | [2,4096], default 2k | RPM = 60 TPM = 60000 |
Qianfan Agent | qianfan-agent-lite-8k | 8k | 7k | [2,2048], default 1k | RPM = 60 TPM = 60000 |
Qianfan Agent | qianfan-agent-speed-32k | 32k | 28k | [2,4096] | RPM = 5000 TPM = 400000 |
Qianfan Agent | qianfan-agent-speed-8k | 8k | 7k | [2,2048], default 1k | RPM = 180 TPM = 180000 |
Qianfan Chinese Llama | qianfan-chinese-llama-2-13b | 2k | 4800 characters | [2,2048], default 1k | RPM = 60 TPM = 60000 |
Qianfan-Sug | qianfan-sug-8k | 8k | 6k | [2,2048], default 1k | RPM = 500 TPM = 200000 |
Qianfan-Correct | qianfan-correct | 8k | 4k | [2,8000], default 4k | RPM = 60 TPM = 60000 |
Qianfan-ToyTalk | qianfan-toytalk | 32k | 28k | [2,32768], default 4k | RPM = 60 TPM = 60000 |
DeepSeek series
Model name | Version | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Default rate limit |
---|---|---|---|---|---|---|
DeepSeek-V3.1 | DeepSeek-V3.1-250821 | deepseek-v3.1-250821 | 128k | 128k | 16k, default 4k | RPM = 60 TPM = 150000 |
DeepSeek-V3 | DeepSeek-V3-250324 | deepseek-v3 | 128k | 128k | 16k, default 4k | RPM = 5000 TPM = 1000000 |
Other
Model name | Version | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Default rate limit |
---|---|---|---|---|---|---|
Kimi-K2 | Kimi-K2-Instruct | kimi-k2-instruct | 128k | 128k | [1,32768], default 4k | RPM = 60 TPM = 150000 |
Qwen3 | Qwen3-Coder-480B-A35B-Instruct | qwen3-coder-480b-a35b-instruct | 128k | 128k | [1,65536], default 4k | RPM = 60 TPM = 150000 |
Qwen3 | Qwen3-Coder-30B-A3B-Instruct | qwen3-coder-30b-a3b-instruct | 128k | 128k | [1,32768], default 4k | RPM = 60 TPM = 150000 |
Qwen3 | Qwen3-235B-A22B-Instruct-2507 | qwen3-235b-a22b-instruct-2507 | 128k | 128k | [1,16384], default 4k | RPM = 60 TPM = 150000 |
Qwen3 | Qwen3-30B-A3B-Instruct-2507 | qwen3-30b-a3b-instruct-2507 | 128k | 126k | [1,32768], default 4k | RPM = 60 TPM = 150000 |
Qwen3 | Qwen3-235B-A22B | qwen3-235b-a22b | 32k | 30k | [2,8192], default 4k | RPM = 120 TPM = 150000 |
Qwen3 | Qwen3-30B-A3B | qwen3-30b-a3b | 32k | 30k | [2,8192], default 4k | RPM = 120 TPM = 150000 |
Qwen3 | Qwen3-32B | qwen3-32b | 32k | 30k | [2,8192], default 4k | RPM = 120 TPM = 150000 |
Qwen3 | Qwen3-14B | qwen3-14b | 32k | 30k | [2,8192], default 4k | RPM = 120 TPM = 150000 |
Qwen3 | Qwen3-8B | qwen3-8b | 32k | 30k | [2,8192], default 4k | RPM = 120 TPM = 150000 |
Qwen3 | Qwen3-4B | qwen3-4b | 32k | 30k | [2,8192], default 4k | RPM = 120 TPM = 150000 |
Qwen3 | Qwen3-1.7B | qwen3-1.7b | 32k | 30k | [2,8192], default 4k | RPM = 60 TPM = 60000 |
Qwen3 | Qwen3-0.6B | qwen3-0.6b | 32k | 30k | [2,8192], default 4k | RPM = 60 TPM = 60000 |
Qwen2.5 | Qwen2.5-7B-Instruct | qwen2.5-7b-instruct | 32k | 24k | [2,8192], default 4k | RPM = 60 TPM = 60000 |
GLM-4 | GLM-4-32B-0414 | glm-4-32b-0414 | 32k | 16k | [2,8192], default 4k | RPM = 120 TPM = 60000 |
Llama-4-Maverick | Llama-4-Maverick-17B-128E-Instruct | llama-4-maverick-17b-128e-instruct | 128k | 131072 characters | [2,8192], default 4k | RPM = 120 TPM = 150000 |
Llama-4-Scout | Llama-4-Scout-17B-16E-Instruct | llama-4-scout-17b-16e-instruct | 128k | 131072 characters | [2,8192], default 4k | RPM = 120 TPM = 150000 |
Meta-Llama-3 | Meta-Llama-3-70B | meta-llama-3-70b | 8k | 20000 characters | 500 | RPM = 120 TPM = 120000 |
Meta-Llama-3 | Meta-Llama-3-8B | meta-llama-3-8b | 8k | 20000 characters | 1500 | RPM = 60 TPM = 60000 |
Visual Understanding
ERNIE series
Model name | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Default rate limit |
---|---|---|---|---|---|
ERNIE 4.5 Turbo VL | ernie-4.5-turbo-vl-preview | 128k | 123k | [2,16384] | RPM = 60 TPM = 150000 |
ERNIE 4.5 Turbo VL | ernie-4.5-turbo-vl | 128k | 123k | [2,16384] | RPM = 1000 TPM = 200000 |
ERNIE 4.5 Turbo VL | ernie-4.5-turbo-vl-32k | 32k | 27k | [2,12288] | RPM = 1000 TPM = 200000 |
ERNIE 4.5 Turbo VL | ernie-4.5-turbo-vl-32k-preview | 32k | 27k | [2,16384] | RPM = 1000 TPM = 200000 |
ERNIE 4.5 Turbo VL | ernie-4.5-turbo-vl-latest | 128k | 123k | [2,16384] | RPM = 60 TPM = 150000 |
ERNIE 4.5 | ernie-4.5-8k-preview | 8k | 5k | [2,2048] | RPM = 100 TPM = 100000 |
ERNIE 4.5 | ernie-4.5-vl-28b-a3b | 32k | 30k | [1,8192], default 4k | RPM = 120 TPM = 150000 |
QianFan series
Model name | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Default rate limit |
---|---|---|---|---|---|
Qianfan-Composition | qianfan-composition | 32k | 24k | [2,8192], default 2k | RPM = 60 TPM = 60000 |
Qianfan-Check-VL | qianfan-check-vl | 128k | 128k | [1,131072], default 4k | RPM = 60 TPM = 150000 |
Qianfan-MultiPicOCR | qianfan-multipicocr | 128k | 128k | [2,32768], default 16k | RPM = 60 TPM = 60000 |
Qianfan-VL-70B | qianfan-vl-70b | 32k | 32k | [1,28672] | RPM = 60 TPM = 60000 |
Qianfan-VL-8B | qianfan-vl-8b | 32k | 32k | [1,28672] | RPM = 60 TPM = 60000 |
Qianfan-Llama-VL-8B | qianfan-llama-vl-8b | 32k | 32k | 16k, default 2k | RPM = 120 TPM = 150000 |
Qianfan-QI-VL | qianfan-qi-vl | 128k | 128k | [1,131072], default 4k | RPM = 60 TPM = 150000 |
Qianfan-PublicOpinion-Classification | qianfan-publicopinion-classification | 28k | 24k | [2,4096], default 4k | RPM = 60 TPM = 60000 |
Qianfan-EngCard-VL | qianfan-engcard-vl | 4k | 4k | [0,4000], default 4k | RPM = 60 TPM = 150000 |
Qianfan-SinglePicOCR | qianfan-singlepicocr | 4k | 4k | [2,4096], default 4k | RPM = 60 TPM = 150000 |
InternVL series
Model name | Version | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Default rate limit |
---|---|---|---|---|---|---|
InternVL3 | InternVL3-38B | internvl3-38b | 32k | 24k | 8k, default 2k | RPM = 60 TPM = 60000 |
InternVL3 | InternVL3-14B | internvl3-14b | 32k | 24k | 8k, default 2k | RPM = 60 TPM = 60000 |
InternVL3 | InternVL3-1B | internvl3-1b | 32k | 24k | 8k, default 2k | RPM = 60 TPM = 60000 |
InternVL2_5 | InternVL2_5-38B-MPO | internvl2.5-38b-mpo | 32k | 64000 characters | 4k, default 2k | RPM = 60 TPM = 60000 |
QwenVL series
Model name | Version | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Default rate limit |
---|---|---|---|---|---|---|
Qwen2.5-VL | Qwen2.5-VL-32B-Instruct | qwen2.5-vl-32b-instruct | 32k | 64000 characters | 8k, default 2k | RPM = 60 TPM = 60000 |
Qwen2.5-VL | Qwen2.5-VL-7B-Instruct | qwen2.5-vl-7b-instruct | 16k | 38400 characters | 4k, default 2k | RPM = 60 TPM = 60000 |
Other
Model name | Version | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Default rate limit |
---|---|---|---|---|---|---|
GLM-4.5V | GLM-4.5V | glm-4.5v | 64k | 64k | 16k, default 4k | RPM = 60 TPM = 150000 |
DeepSeek-VL2 | DeepSeek-VL2 | deepseek-vl2 | 4k | 12000 characters | 2k, default 2k | RPM = 60 TPM = 60000 |
DeepSeek-VL2 | DeepSeek-VL2-Small | deepseek-vl2-small | 4k | 38400 characters | 2k, default 2k | RPM = 60 TPM = 60000 |
Llama-4-Maverick | Llama-4-Maverick-17B-128E-Instruct | llama-4-maverick-17b-128e-instruct | 128k | 131072 characters | [2,8192], default 4k | RPM = 120 TPM = 150000 |
Llama-4-Scout | Llama-4-Scout-17B-16E-Instruct | llama-4-scout-17b-16e-instruct | 128k | 131072 characters | [2,8192], default 4k | RPM = 120 TPM = 150000 |
Deep Thinking
ERNIE series
Model name | Version | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Chain-of-thought length (tokens) | Default rate limit |
---|---|---|---|---|---|---|---|
ERNIE X1.1 | ERNIE-X1.1-Preview | ernie-x1.1-preview | 64k | 55k | [1,65536] | 64k | RPM = 60 TPM = 60000 |
ERNIE X1 Turbo | ERNIE-X1-Turbo-32K | ernie-x1-turbo-32k | 32k | 23k | [1,28160] | 28k | RPM = 900 TPM = 300000 |
ERNIE X1 Turbo | ERNIE-X1-Turbo-32K-Preview | ernie-x1-turbo-32k-preview | 32k | 23k | [1,28160] | 28k | RPM = 60 TPM = 60000 |
ERNIE X1 Turbo | ERNIE-X1-Turbo-Latest | ernie-x1-turbo-latest | 64k | 55k | [2,65536] | 16k | RPM = 60 TPM = 60000 |
ERNIE 4.5 | ERNIE-4.5-VL-28B-A3B | ernie-4.5-vl-28b-a3b | 32k | 16k | [1,8192], default 4k | 16k | RPM = 120 TPM = 150000 |
ERNIE-series thinking models do not support thinking_budget. In addition, max_tokens limits the combined length of reasoning_content and content.
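Because max_tokens covers reasoning_content plus content for the ERNIE thinking models, it can help to estimate how much of the budget is left for the final answer. The sketch below is only an illustration: the helper name `remaining_answer_tokens` is hypothetical, and it assumes the caller already knows (or has estimated) how many tokens the reasoning consumed. The value 28160 is the upper bound of the max output range listed for ernie-x1-turbo-32k.

```python
def remaining_answer_tokens(max_tokens: int, reasoning_tokens: int) -> int:
    """Tokens left for `content` after `reasoning_content` has used its share.

    Assumes max_tokens bounds reasoning_content + content together, as the
    note for ERNIE-series thinking models states.
    """
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    if reasoning_tokens < 0:
        raise ValueError("reasoning_tokens must be non-negative")
    # Never report a negative budget: the answer simply has no room left.
    return max(0, max_tokens - reasoning_tokens)


# Example: a 28160-token budget with 20000 tokens spent on reasoning.
print(remaining_answer_tokens(28160, 20000))  # 8160
```

If the result is small, raising max_tokens (within the model's listed range) is the only lever mentioned here, since thinking_budget is not supported for these models.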
DeepSeek full version
Model name | Version | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Chain-of-thought length (tokens) | Default rate limit |
---|---|---|---|---|---|---|---|
DeepSeek-V3.1-Think | DeepSeek-V3.1-Think-250821 | deepseek-v3.1-think-250821 | 128k | 96k | [1,32768], default 4k | 32k | RPM = 60 TPM = 150000 |
DeepSeek-R1 | DeepSeek-R1-250528 | deepseek-r1-250528 | 144k | 96k | [1,32768], default 4k | 32k | RPM = 5000 TPM = 1000000 |
DeepSeek-R1 | DeepSeek-R1 (currently the 250120 version) | deepseek-r1 | 144k | 96k | [1,32768], default 4k | 32k | RPM = 5000 TPM = 1000000 |
DeepSeek distilled version
Model name | Version | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Chain-of-thought length (tokens) | Default rate limit |
---|---|---|---|---|---|---|---|
DeepSeek-R1-Distill | DeepSeek-R1-Distill-Qianfan-70B | deepseek-r1-distill-qianfan-70b | 32k | 16k | [1,8192], default 8k | 16k | RPM = 1000 TPM = 60000 |
DeepSeek-R1-Distill | DeepSeek-R1-Distill-Qianfan-8B | deepseek-r1-distill-qianfan-8b | 32k | 16k | [1,8192], default 8k | 16k | RPM = 1000 TPM = 60000 |
DeepSeek-R1-Distill | DeepSeek-R1-Distill-Qianfan-Llama-70B | deepseek-r1-distill-qianfan-llama-70b | 32k | 64000 characters | [1,8192], default 4k | 32k | RPM = 1000 TPM = 10000 |
DeepSeek-R1-Distill | DeepSeek-R1-Distill-Qianfan-Llama-8B | deepseek-r1-distill-qianfan-llama-8b | 32k | 64000 characters | [1,8192], default 4k | 32k | RPM = 1000 TPM = 10000 |
DeepSeek-R1-Distill | DeepSeek-R1-Distill-Llama-70B | deepseek-r1-distill-llama-70b | 32k | 64000 characters | [1,8192], default 4k | 32k | RPM = 1000 TPM = 10000 |
DeepSeek-R1-Distill | DeepSeek-R1-Distill-Llama-8B | deepseek-r1-distill-llama-8b | 32k | 64000 characters | [1,8192], default 4k | 32k | RPM = 1000 TPM = 10000 |
DeepSeek-R1-Distill | DeepSeek-R1-Distill-Qwen-32B | deepseek-r1-distill-qwen-32b | 32k | 64000 characters | [1,8192], default 4k | 32k | RPM = 1000 TPM = 10000 |
DeepSeek-R1-Distill | DeepSeek-R1-Distill-Qwen-14B | deepseek-r1-distill-qwen-14b | 32k | 64000 characters | [1,8192], default 4k | 32k | RPM = 1000 TPM = 10000 |
DeepSeek-R1-Distill | DeepSeek-R1-Distill-Qwen-7B | deepseek-r1-distill-qwen-7b | 32k | 64000 characters | [1,8192], default 4k | 32k | RPM = 1000 TPM = 10000 |
DeepSeek-R1-Distill | DeepSeek-R1-Distill-Qwen-1.5B | deepseek-r1-distill-qwen-1.5b | 32k | 64000 characters | [1,8192], default 4k | 32k | RPM = 1000 TPM = 10000 |
Qwen series
Model name | Version | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Chain-of-thought length (tokens) | Default rate limit |
---|---|---|---|---|---|---|---|
Qwen3 | Qwen3-235B-A22B-Thinking-2507 | qwen3-235b-a22b-thinking-2507 | 128k | 124k | [1,32768], default 4k | 32k | RPM = 60 TPM = 150000 |
Qwen3 | Qwen3-30B-A3B-Thinking-2507 | qwen3-30b-a3b-thinking-2507 | 128k | 124k | [1,32768], default 4k | 32k | RPM = 60 TPM = 150000 |
Qwen3 | Qwen3-235B-A22B | qwen3-235b-a22b | 32k | 16k | [2,8192], default 4k | 16k | RPM = 120 TPM = 150000 |
Qwen3 | Qwen3-30B-A3B | qwen3-30b-a3b | 32k | 16k | [2,8192], default 4k | 16k | RPM = 120 TPM = 150000 |
Qwen3 | Qwen3-32B | qwen3-32b | 32k | 16k | [1,8192], default 4k | 16k | RPM = 120 TPM = 150000 |
Qwen3 | Qwen3-14B | qwen3-14b | 32k | 16k | [1,8192], default 4k | 16k | RPM = 120 TPM = 150000 |
Qwen3 | Qwen3-8B | qwen3-8b | 32k | 16k | [1,8192], default 4k | 16k | RPM = 120 TPM = 150000 |
Qwen3 | Qwen3-4B | qwen3-4b | 32k | 16k | [1,8192], default 4k | 16k | RPM = 120 TPM = 150000 |
Qwen3 | Qwen3-1.7B | qwen3-1.7b | 32k | 16k | [1,8192], default 4k | 16k | RPM = 60 TPM = 60000 |
Qwen3 | Qwen3-0.6B | qwen3-0.6b | 32k | 16k | [1,8192], default 4k | 16k | RPM = 60 TPM = 60000 |
QWQ-32B | QWQ-32B | qwq-32b | 32k | 65536 characters | [1,8192], default 4k | 32k | RPM = 120 TPM = 100000 |
Other
Model name | Version | model parameter (endpoint ID) | Context length (tokens) | Max input (tokens) | Max output (tokens) | Chain-of-thought length (tokens) | Default rate limit |
---|---|---|---|---|---|---|---|
GPT-OSS-120B | GPT-OSS-120B | gpt-oss-120b | 128k | 124k | [1,32768], default 4k | 32k | RPM = 60 TPM = 150000 |
GPT-OSS-20B | GPT-OSS-20B | gpt-oss-20b | 128k | 124k | [1,32768], default 4k | 32k | RPM = 60 TPM = 150000 |
GLM-Z1-32B-0414 | GLM-Z1-32B-0414 | glm-z1-32b-0414 | 32k | 16k | [2,8192], default 4k | 16k | RPM = 120 TPM = 60000 |
GLM-Z1-Rumination-32B-0414 | GLM-Z1-Rumination-32B-0414 | glm-z1-rumination-32b-0414 | 128k | 64k | [2,8192], default 4k | 32k | RPM = 120 TPM = 150000 |
Image Generation
Model name | Version | model parameter (endpoint ID) | Max input (characters) | Default rate limit |
---|---|---|---|---|
ERNIE iRAG | ERNIE-iRAG-1.0 | irag-1.0 | 200 characters | 6 RPM |
FLUX.1-schnell | FLUX.1-schnell | flux.1-schnell | 512 characters | 6 RPM |
Image Editing
Model name | Version | model parameter (endpoint ID) | Max input (characters) | Default rate limit |
---|---|---|---|---|
ERNIE iRAG Edit | ERNIE-iRAG-Edit-1.0 | ernie-irag-edit | 220 characters | 6 RPM |
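Video Generation
Model name | Version | model parameter | Input limits | Rate limit |
---|---|---|---|---|
Baidu MuseSteamer 2.0 (百度蒸汽机2.0) | MuseSteamer-2.0-Turbo-I2V-Audio | musesteamer-2.0-turbo-i2v-audio | Input prompt: 400 Chinese characters or fewer recommended, 3000 characters at most. Input image: JPEG, JPG, PNG, and WEBP formats supported; file size no larger than 10MB; dimensions no smaller than 300px. | Shared concurrency: 3; shared queue: 10 |
Baidu MuseSteamer 2.0 (百度蒸汽机2.0) | MuseSteamer-2.0-Turbo-I2V | musesteamer-2.0-turbo-i2v | Same as above | Same as above |
Baidu MuseSteamer 2.0 (百度蒸汽机2.0) | MuseSteamer-2.0-Pro-I2V | musesteamer-2.0-pro-i2v | Same as above | Same as above |
Baidu MuseSteamer 2.0 (百度蒸汽机2.0) | MuseSteamer-2.0-Lite-I2V | musesteamer-2.0-lite-i2v | Same as above | Same as above |
Baidu MuseSteamer 2.0 (百度蒸汽机2.0) | MuseSteamer-2.0-Turbo-Effect | musesteamer-2.0-turbo-i2v-effect | Same as above | Same as above |
Rate-limit notes for the Baidu MuseSteamer 2.0 (百度蒸汽机2.0) video generation models:
- Shared concurrency: 3, meaning at most 3 video generation tasks can be in the Running state.
- Shared queue: 10, meaning at most 10 video generation tasks can be in the Queued state.
- Rate limits are shared across the different model versions.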
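The shared limits above can be mirrored on the client side before submitting a task. This is a minimal sketch under stated assumptions: the function name `admit` and the "reject" outcome are hypothetical, since the source does not say how the service responds once both the running slots and the queue are full. The counts passed in should span all MuseSteamer 2.0 model versions, because the limits are shared.

```python
MAX_RUNNING = 3   # shared concurrency, from the notes above
MAX_QUEUED = 10   # shared queue depth, from the notes above


def admit(running: int, queued: int) -> str:
    """Classify a new video task as 'run', 'queue', or 'reject'.

    `running` and `queued` are the current task counts across all
    MuseSteamer 2.0 model versions. 'reject' is an assumed client-side
    outcome, not documented service behavior.
    """
    if running < 0 or queued < 0:
        raise ValueError("task counts must be non-negative")
    if running < MAX_RUNNING:
        return "run"
    if queued < MAX_QUEUED:
        return "queue"
    return "reject"


print(admit(running=3, queued=4))  # queue
```

Checking locally before submitting avoids sending requests that would exceed the 3 running plus 10 queued tasks the limits allow.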
Text Embedding
Model name | Version | model parameter (endpoint ID) | Embedding dimension | Max input (number of texts) | Context length per text (tokens) |
---|---|---|---|---|---|
Embedding-V1 | Embedding-V1 | embedding-v1 | 384 | 16 | 384 |
tao-8k | tao-8k | tao-8k | 1024 | 1 | 8192 |
bge-large-zh | bge-large-zh | bge-large-zh | 1024 | 16 | 512 |
bge-large-en | bge-large-en | bge-large-en | 1024 | 16 | 512 |
Qwen3-Embedding-0.6B | Qwen3-Embedding-0.6B | qwen3-embedding-0.6b | 1024 | 16 | 8192 |
Qwen3-Embedding-4B | Qwen3-Embedding-4B | qwen3-embedding-4b | 2560 | 16 | 8192 |
Qwen3-Embedding-8B | Qwen3-Embedding-8B | qwen3-embedding-8b | 4096 | 16 | 8192 |
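Since each embedding model accepts only a limited number of texts per request, a longer list has to be split into batches. The following sketch shows one way to do that; the helper name `batches` and the `MAX_TEXTS` dictionary are hypothetical, and the per-model values are copied from the table above. It does not check the per-text token limit, which would need a tokenizer.

```python
from typing import Dict, Iterator, List

# Max texts per request, copied from the text embedding table above.
MAX_TEXTS: Dict[str, int] = {
    "embedding-v1": 16,
    "tao-8k": 1,
    "bge-large-zh": 16,
    "bge-large-en": 16,
    "qwen3-embedding-0.6b": 16,
    "qwen3-embedding-4b": 16,
    "qwen3-embedding-8b": 16,
}


def batches(texts: List[str], model: str) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts` no larger than the model's limit."""
    size = MAX_TEXTS[model]  # raises KeyError for models not in the table
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


docs = [f"doc {i}" for i in range(35)]
print([len(b) for b in batches(docs, "bge-large-zh")])  # [16, 16, 3]
```

Each yielded batch can then be sent as one embedding request; with tao-8k, every text becomes its own request.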
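Multimodal Embedding
Model name | Version | model parameter (endpoint ID) | Embedding dimension | Max input (count) | Context length |
---|---|---|---|---|---|
gme-Qwen2-VL-2B-Instruct | gme-Qwen2-VL-2B-Instruct | gme-qwen2-vl-2b-instruct | 1536 | 1 text segment, 1 image segment, or 1 image segment + 1 text segment | 32k |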
Reranking
Model name | Version | model parameter (endpoint ID) | Max input (characters) | Default rate limit |
---|---|---|---|---|
bce-reranker-base | bce-reranker-base | bce-reranker-base | query: 400 tokens / 1600 characters; document: 1K tokens / 4000 characters | RPM = 1800 TPM = 800000 |
Qwen3-Reranker-0.6B | Qwen3-Reranker-0.6B | qwen3-reranker-0.6b | Max input 32k; query: 1 entry | RPM = 1800 TPM = 800000 |
Qwen3-Reranker-4B | Qwen3-Reranker-4B | qwen3-reranker-4b | Max input 32k; query: 1 entry | RPM = 1800 TPM = 800000 |
Qwen3-Reranker-8B | Qwen3-Reranker-8B | qwen3-reranker-8b | Max input 32k; query: 1 entry | RPM = 1800 TPM = 800000 |