
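Development Notes: LangChain and the Qianfan Text-to-Image API

A few days ago Baidu opened its text-to-image API (Stable-Diffusion-XL) for testing, and I had some free time to try it out. I ran into a problem: Chinese prompts simply do not work. Every result comes out as an ink-wash landscape painting, nothing like what you asked for. After I switched to English prompts, the output became mostly reasonable. That raises a question: how do we use the API in a Chinese-language setting? We can't translate every prompt with ERNIE Bot (文心一言) by hand and then pass the English prompt to the text-to-image API! As engineers, when we hit a problem we solve it and make the API more convenient to use. LangChain is an LLM development framework designed for complex tasks, so let's use it to implement text-to-image from Chinese prompts.

Without further ado, here is the complete source code: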

  
  
  
  
  
  
# Paint with LangChain, combining OpenAI GPT-4 and Baidu Qianfan text-to-image
import base64
import io
import json
import os
import time

import requests
from PIL import Image
from langchain.agents import AgentExecutor, OpenAIFunctionsAgent, Tool
from langchain.chat_models import ChatOpenAI
from langchain.schema import SystemMessage

API_KEY = "zQ3fMzn************DQm"
SECRET_KEY = "MDW1LY*********NGEKESRA8qTpxwfs"
os.environ["OPENAI_API_KEY"] = "sk-wO808pkIz10***********rjv1mZHK3t1oEL9ACu0"


# Plain functions: they are wrapped in Tool(...) below, so no @tool decorator is needed.
def stable_diffusion(key_str: str):
    """Baidu text-to-image (百度文生图). Input format: "access_token,prompt"."""
    keys = key_str.split(',')
    if len(keys) >= 2:
        url = "https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/text2image/sd_xl?access_token=" + keys[0].strip()
        payload = json.dumps(
            {
                # The prompt itself may contain commas, so re-join everything after the token
                "prompt": ','.join(keys[1:]).strip(),
                "negative_prompt": "",
                "n": 1,
            })
        headers = {
            'Content-Type': 'application/json'
        }
        response = requests.request("POST", url, headers=headers, data=payload)
        # Parse the response body as JSON
        result = json.loads(response.text)
        if not result.get('error_code'):
            datas = result['data']
            for data in datas:
                # Current timestamp
                timestamp = int(time.time())
                # Build the file name
                filename = f"image_{timestamp}.png"
                # Output directory
                output_path = "./data/result/images/"
                os.makedirs(output_path, exist_ok=True)
                # Base64-encoded image data
                image_b64 = data.get('b64_image')
                # Decode the Base64 data into raw bytes
                binary_data = base64.b64decode(image_b64)
                # Load the bytes into memory as a PIL image object
                image = Image.open(io.BytesIO(binary_data))
                # Save the image to a local file
                image.save(os.path.join(output_path, filename))
                # Return only the file path (n is 1, so the first image is the only one)
                return os.path.join(output_path, filename)
    return None


def get_access_token(key_str: str):
    """Baidu access_token (百度的access_token). Input format: "api_key,secret_key"."""
    keys = [k.strip() for k in key_str.split(',')]
    if len(keys) == 2:
        url = "https://aip.baidubce.com/oauth/2.0/token"
        params = {"grant_type": "client_credentials", "client_id": keys[0], "client_secret": keys[1]}
        result = requests.post(url, params=params)
        if result.status_code == 200:
            return result.json().get("access_token")
    return None


if __name__ == '__main__':
    # Define the tool set
    tools = [
        Tool(
            name='stable_diffusion',
            func=stable_diffusion,
            description="百度文生图"  # "Baidu text-to-image"
        ),
        Tool(
            name='get_access_token',
            func=get_access_token,
            description="百度的access token"  # "Baidu access token"
        ),
    ]
    # System prompt, in English: "You are an AI assistant. Complete the work below; if it fails,
    # try once more; if not all tasks finish successfully, do not output a result;
    # output the result in Markdown format:"
    system_message = SystemMessage(content="你是一个AI助手,完成下面的工作,如果失败就再尝试一次,如果没有顺利完成所有任务就不要输出结果,结果用markdown格式输出:")
    prompt = OpenAIFunctionsAgent.create_prompt(
        system_message=system_message,
    )
    # Use the OpenAI model for task decomposition
    llm = ChatOpenAI(model="gpt-4", temperature=0.9)
    agent = OpenAIFunctionsAgent(llm=llm, tools=tools, prompt=prompt)
    agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
    messages = [
        # A Chinese couple watching the sunrise by the sea, in oil-painting style
        "在海边看日出的中国情侣,采用油画风格描摹。他们两个背朝大海,朝阳从他们身后射出,男的帅气,女的美丽,天气非常的晴朗,万里无云,海风轻轻吹,海面微波荡漾。",
        # In the year 10000, humans ride a space elevator to a space station; wide-angle, strong sci-fi feel
        "公元10000年人类乘坐太空电梯到空间站,广角镜头的视角,要求有很强的科幻感。"
    ]
    # Write the results to text2pic.md under ./data/result
    os.makedirs("./data/result", exist_ok=True)
    with open("./data/result/text2pic.md", "w", encoding="utf-8") as f:
        for message in messages:
            # Instruction, in English: "First translate {message} into English, then obtain Baidu's
            # access_token from {API_KEY},{SECRET_KEY}, then hand access_token and the translation
            # from step one to Baidu text-to-image."
            r = agent_executor.run(f"首先把 {message} 翻译成英文,再根据 {API_KEY},{SECRET_KEY} 获取百度的access_token,然后把 access_token,第一步翻译的结果 交给百度文生图。")
            f.write(r + "\n")
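The save step inside `stable_diffusion` can also be pulled out into a small helper, which makes the "return only the path" behaviour easy to test on its own. The sketch below is not part of the original program: the name `save_b64_image` is hypothetical, it writes the decoded bytes directly instead of going through PIL (so it assumes the API already returns PNG data), and it uses a UUID instead of a second-resolution timestamp so that several images saved within the same second do not overwrite each other.

```python
import base64
import os
import uuid


def save_b64_image(image_b64: str, output_dir: str = "./data/result/images/") -> str:
    """Decode a Base64 image, write it to disk, and return only its file path.

    Hypothetical helper. Assumes the decoded bytes are already a PNG file,
    so no re-encoding with PIL is done here.
    """
    os.makedirs(output_dir, exist_ok=True)
    # A random UUID avoids file-name collisions within the same second
    filename = f"image_{uuid.uuid4().hex}.png"
    path = os.path.join(output_dir, filename)
    with open(path, "wb") as f:
        f.write(base64.b64decode(image_b64))
    return path
```

A tool would call `save_b64_image(data.get('b64_image'))` and return the resulting path to the agent, never the Base64 string itself.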

Key points

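  1. To use a Baidu API you first obtain an access token from the AK/SK pair, then call the API with that access token. In LangChain terms, these two steps become two separate tools, and the LangChain Agent schedules them.
  2. LangChain generally passes arguments to a tool function as a single string, so the tool function has to parse that string itself, here by using split to separate the arguments. ChatGPT's output is usually plain text, and English text in particular uses half-width (ASCII) commas as punctuation, so when splitting the arguments take care not to break the English sentence itself into separate tool arguments.
  3. Here LangChain's ChatOpenAI schedules the tools, the tools call the Qianfan API, and each tool's return value is in turn interpreted by ChatGPT. So do not return the full result of the Qianfan API call from a tool; return only the information that is useful for task scheduling. (In this program the Base64 data of the generated image must not be returned, or it would overwhelm ChatGPT; instead the image is saved to a file and only the file path is returned.) ChatGPT will then generate the corresponding Markdown as requested.
  4. The prompts at every stage still matter a lot. First is the system prompt, which should clearly state the role, the goals, and the constraints for the tasks that follow (for example, the retry mechanism).
  5. The LangChain task decomposition used here (OpenAIFunctionsAgent) currently only supports OpenAI models. I don't know when ERNIE Bot models will be supported; after all, calling OpenAI requires some technical workarounds.
  6. The Chinese-to-English translation here is done by ChatGPT. It could also be done by a Qianfan model, wrapped as another tool function.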
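To illustrate the second point, here is a minimal sketch of parsing the single string argument so that commas inside the English prompt are left alone. It splits at the first comma only, which has the same effect as the `split` plus re-join used in `stable_diffusion`. The function name `parse_tool_input` is hypothetical, and the sketch assumes the input always has the form `access_token,prompt`.

```python
from typing import Optional, Tuple


def parse_tool_input(key_str: str) -> Optional[Tuple[str, str]]:
    """Split 'access_token,prompt' at the FIRST comma only.

    Hypothetical helper: everything after the first comma is treated as the
    prompt, so commas inside the English sentence are preserved.
    """
    token, sep, prompt = key_str.partition(",")
    token, prompt = token.strip(), prompt.strip()
    if not sep or not token or not prompt:
        # No comma at all, or one of the two parts is empty
        return None
    return token, prompt
```

For example, `parse_tool_input("TOKEN, A couple by the sea, oil painting style")` keeps `"A couple by the sea, oil painting style"` together as one prompt.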

The execution went as follows:

  
  
  
  
  
  
(FastSAM) crazyicelee@lizhengbing yiyan_plugin_ymos % /Users/crazyicelee/miniconda3/envs/FastSAM/bin/python /Users/crazyicelee/MiniProjects/yiyan_plugin_ymos/yiyan/yiyan_api.py
> Entering new AgentExecutor chain...
Invoking: `get_access_token` with `zQ3fMznSDnxNtUH7QeclgDQm,MDW1LYn1ErSRDR6aNGEKESRA8qTpxwfs`
24.42f240ab73c949ff540f0b6d496e1a38.2592000.1702283141.282335-36768657
Invoking: `stable_diffusion` with `A Chinese couple watching the sunrise by the sea, portrayed in an oil painting style. They both face the sea, the morning sun shines from behind them, the man is handsome, the woman is beautiful, the weather is very clear, cloudless, the sea breeze gently blows, the sea is slightly rippling. 24.42f240ab73c949ff540f0b6d496e1a38.2592000.1702283141.282335-36768657`
responded: 将 "在海边看日出的中国情侣,采用油画风格描摹。他们两个背朝大海,朝阳从他们身后射出,男的帅气,女的美丽,天气非常的晴朗,万里无云,海风轻轻吹,海面微波荡漾。" 翻译成英文是 "A Chinese couple watching the sunrise by the sea, portrayed in an oil painting style. They both face the sea, the morning sun shines from behind them, the man is handsome, the woman is beautiful, the weather is very clear, cloudless, the sea breeze gently blows, the sea is slightly rippling."。
将此结果和获取的access_token24.42f240ab73c949ff540f0b6d496e1a38.2592000.1702283141.282335-36768657,交给百度文生图。
None
Invoking: `stable_diffusion` with `A Chinese couple watching the sunrise by the sea, portrayed in an oil painting style. They both face the sea, the morning sun shines from behind them, the man is handsome, the woman is beautiful, the weather is very clear, cloudless, the sea breeze gently blows, the sea is slightly rippling. 24.42f240ab73c949ff540f0b6d496e1a38.2592000.1702283141.282335-36768657`
None对不起,我尝试了两次,但仍无法完成你的请求。
> Finished chain.
> Entering new AgentExecutor chain...
Invoking: `stable_diffusion` with `In the year 10,000 AD, humans ride space elevators to the space station, viewed from a wide-angle lens, with a strong sense of science fiction.`
None
Invoking: `get_access_token` with `zQ3fMznSDnxNtUH7QeclgDQm,MDW1LYn1ErSRDR6aNGEKESRA8qTpxwfs`
24.336b99cb753b19461ae96655c12c5bf8.2592000.1702283189.282335-36768657
Invoking: `stable_diffusion` with `24.336b99cb753b19461ae96655c12c5bf8.2592000.1702283189.282335-36768657`
None
Invoking: `stable_diffusion` with `In the year 10,000 AD, humans ride space elevators to the space station, viewed from a wide-angle lens, with a strong sense of science fiction.`
None
Invoking: `get_access_token` with `zQ3fMznSDnxNtUH7QeclgDQm,MDW1LYn1ErSRDR6aNGEKESRA8qTpxwfs`
24.d8986cb6335929b107dfda2c15565614.2592000.1702283201.282335-36768657
Invoking: `stable_diffusion` with `24.d8986cb6335929b107dfda2c15565614.2592000.1702283201.282335-36768657`
None
> Finished chain.
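In this first run both attempts failed, because the access_token and the prompt were passed in the wrong order and the tool parsed its arguments incorrectly. This does not happen on every run.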
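One way to make the tool tolerant of this kind of mistake is to locate the token by its shape instead of by its position. The sketch below is not what the program above does. It assumes access tokens always look like the ones printed in the log (digits, a 32-character hex string, then three more numeric fields), which is inferred from this output and not from any documentation; `TOKEN_RE` and `split_token_and_prompt` are hypothetical names.

```python
import re
from typing import Optional, Tuple

# Assumption: the token format is inferred from the log output above,
# e.g. 24.<32 hex chars>.2592000.<timestamp>.<id>-<id>
TOKEN_RE = re.compile(r"\b\d+\.[0-9a-f]{32}\.\d+\.\d+\.\d+-\d+\b")


def split_token_and_prompt(tool_input: str) -> Optional[Tuple[str, str]]:
    """Find the token anywhere in the input; the rest is the prompt."""
    match = TOKEN_RE.search(tool_input)
    if match is None:
        return None  # no token supplied at all
    token = match.group(0)
    # Remove the token and tidy up leftover separators
    prompt = (tool_input[:match.start()] + tool_input[match.end():]).strip(" ,")
    if not prompt:
        return None  # token only, no prompt
    return token, prompt
```

With this kind of parsing, both `prompt. token` and `token, prompt` would be accepted, while an input that contains only a token or only a prompt is still rejected.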
  
  
  
  
  
  
(FastSAM) crazyicelee@lizhengbing yiyan_plugin_ymos % /Users/crazyicelee/miniconda3/envs/FastSAM/bin/python /Users/crazyicelee/MiniProjects/yiyan_plugin_ymos/yiyan/yiyan_api.py
> Entering new AgentExecutor chain...
Invoking: `stable_diffusion` with `A couple in China watching the sunrise by the sea, depicted in oil painting style. They both face the sea with the sunrise shooting from behind them. The man is handsome, the woman is beautiful, the weather is very clear, cloudless, the sea breeze is gently blowing, and the sea is slightly rippling.`
None
Invoking: `get_access_token` with `zQ3fMznSDnxNtUH7QeclgDQm,MDW1LYn1ErSRDR6aNGEKESRA8qTpxwfs`
24.54cd2476082f1043dc54833a63c96b70.2592000.1702283322.282335-36768657
Invoking: `stable_diffusion` with `24.54cd2476082f1043dc54833a63c96b70.2592000.1702283322.282335-36768657, A couple in China watching the sunrise by the sea, depicted in oil painting style. They both face the sea with the sunrise shooting from behind them. The man is handsome, the woman is beautiful, the weather is very clear, cloudless, the sea breeze is gently blowing, and the sea is slightly rippling.`
./data/result/images/image_1699691337.png# Result
The image created by Baidu Stable Diffusion with the description "A couple in China watching the sunrise by the sea, depicted in oil painting style. They both face the sea with the sunrise shooting from behind them. The man is handsome, the woman is beautiful, the weather is very clear, cloudless, the sea breeze is gently blowing, and the sea is slightly rippling." and the access token "24.54cd2476082f1043dc54833a63c96b70.2592000.1702283322.282335-36768657" is:
![Image](./data/result/images/image_1699691337.png)
> Finished chain.
> Entering new AgentExecutor chain...
Invoking: `get_access_token` with `zQ3fMznSDnxNtUH7QeclgDQm,MDW1LYn1ErSRDR6aNGEKESRA8qTpxwfs`
24.0e619fbbdb504a19810cfbea9c691ef2.2592000.1702283351.282335-36768657
Invoking: `stable_diffusion` with `24.0e619fbbdb504a19810cfbea9c691ef2.2592000.1702283351.282335-36768657,In the year 10,000 AD, humans ride space elevators to the space station, from a wide-angle lens perspective, with a strong sense of science fiction.`
./data/result/images/image_1699691366.png# Result
![Image](./data/result/images/image_1699691366.png)
> Finished chain.
After I improved the prompt the results were correct. The only change was to state the argument order explicitly in the prompt:
  
  
  
  
  
  
r=agent_executor.run(f"首先把 {message} 翻译成英文,再根据 {API_KEY},{SECRET_KEY} 获取百度的access_token,然后把 access_token,第一步翻译的结果 按照access_token在前翻译结果在后的顺序交给百度文生图。")
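The prompt fix above works, but the agent still has to carry the token between two tools and put it in the right place. As an alternative sketch (not what this article does), the token could be fetched inside the tool, so the agent passes only the English prompt and the AK/SK never appear in the text sent to the LLM. `make_text2image_tool`, `fetch_token`, and `generate_image` are hypothetical names; the two callables stand in for functions like `get_access_token` and `stable_diffusion`, adapted to take separate arguments.

```python
from typing import Callable, Optional


def make_text2image_tool(
    fetch_token: Callable[[], Optional[str]],
    generate_image: Callable[[str, str], Optional[str]],
) -> Callable[[str], Optional[str]]:
    """Build a one-argument tool function: the agent supplies only the prompt.

    fetch_token    -- hypothetical callable returning an access token or None
    generate_image -- hypothetical callable (token, prompt) -> image path or None
    """

    def text2image(prompt: str) -> Optional[str]:
        token = fetch_token()  # credentials stay inside the tool
        if not token:
            return None
        return generate_image(token, prompt.strip())

    return text2image
```

The returned `text2image` function could then be registered as a single `Tool`, leaving the agent with just two steps: translate, then draw.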

Text-to-image results

Result

The image created by Baidu Stable Diffusion with the description "A couple in China watching the sunrise by the sea, depicted in oil painting style. They both face the sea with the sunrise shooting from behind them. The man is handsome, the woman is beautiful, the weather is very clear, cloudless, the sea breeze is gently blowing, and the sea is slightly rippling." and the access token "24.54cd2476082f1043dc54833a63c96b70.2592000.1702283322.282335-36768657" is:

(image: ./data/result/images/image_1699691337.png)

Result

(image: ./data/result/images/image_1699691366.png)
