
Langchain + Qianfan: From Getting Started to Practice

Preface

Before diving into this chapter, a brief introduction to Langchain: it is a framework that helps users quickly build LLM applications, from prototype all the way to production. It encapsulates a series of abstract LLM application components, including LLM, Embedding, Chain, Agent, and Tool, and, through continuous contributions from the open-source community, has integrated calling methods for most of today's mainstream large language models. It is currently the most popular framework for building LLM applications.
For the common application scenarios in Langchain, we have implemented the following capabilities on top of the Qianfan SDK:
  • llms: a base type designed for plain-text completion with a basic LLM, corresponding to the Qianfan SDK's Completion capability
  • chat_models: a base type designed for conversation, carrying role-annotated message history, corresponding to the Qianfan SDK's ChatCompletion capability
  • embeddings: a base type designed for vectorizing text corpora, corresponding to the Qianfan SDK's Embedding capability
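The three-way mapping above can be captured in a small lookup table for reference. This is purely illustrative: `QianfanChatEndpoint` and `QianfanEmbeddingsEndpoint` appear later in this chapter, while `QianfanLLMEndpoint` is assumed from the Langchain Qianfan integration and is not demonstrated here.

```python
# Illustrative summary of the component-to-SDK-capability mapping above.
# The dict name QIANFAN_COMPONENTS is made up for this sketch.
QIANFAN_COMPONENTS = {
    "llms": ("QianfanLLMEndpoint", "Completion"),
    "chat_models": ("QianfanChatEndpoint", "ChatCompletion"),
    "embeddings": ("QianfanEmbeddingsEndpoint", "Embedding"),
}

for component, (cls, sdk_capability) in QIANFAN_COMPONENTS.items():
    print(f"{component}: {cls} -> Qianfan SDK {sdk_capability}")
```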

API Initialization

To use services based on Baidu Qianfan, you must first initialize the following parameters with the application access AK/SK obtained from the console; you may also choose to pass the AK/SK via environment variables or in the init params:

export QIANFAN_AK=XXX
export QIANFAN_SK=XXX
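The two configuration routes (environment variables vs. init params) typically follow a params-first fallback order. A minimal sketch of that order, assuming nothing about the SDK internals; the helper name `resolve_credentials` is made up for illustration:

```python
import os

def resolve_credentials(ak=None, sk=None):
    """Illustrative only: prefer explicitly passed params, fall back to env vars."""
    ak = ak or os.environ.get("QIANFAN_AK")
    sk = sk or os.environ.get("QIANFAN_SK")
    if not ak or not sk:
        raise ValueError("QIANFAN_AK / QIANFAN_SK are not configured")
    return ak, sk

# Explicit params win over the environment
os.environ["QIANFAN_AK"] = "env_ak"
os.environ["QIANFAN_SK"] = "env_sk"
print(resolve_credentials())               # ('env_ak', 'env_sk')
print(resolve_credentials(ak="param_ak"))  # ('param_ak', 'env_sk')
```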

Chat Completion

Use chat_models to have the large model generate a joke. By default this uses the ERNIE-Bot-turbo model.

"""For basic init and call"""
import os

from langchain.chat_models import QianfanChatEndpoint
from langchain.schema import HumanMessage

os.environ["QIANFAN_AK"] = "your_ak"
os.environ["QIANFAN_SK"] = "your_sk"

chat = QianfanChatEndpoint(streaming=True)
res = chat([HumanMessage(content="write a funny joke")])

from langchain.chat_models import QianfanChatEndpoint
from langchain.schema import HumanMessage

chatLLM = QianfanChatEndpoint(streaming=True)

# Synchronous streaming
res = chatLLM.stream([HumanMessage(content="hi")])
for r in res:
    print("chat resp:", r)

# Async generation (top-level await works in a notebook; otherwise use asyncio.run)
async def run_aio_generate():
    resp = await chatLLM.agenerate(
        messages=[[HumanMessage(content="write a 20 words sentence about sea.")]]
    )
    print(resp)

await run_aio_generate()

# Async streaming
async def run_aio_stream():
    async for res in chatLLM.astream(
        [HumanMessage(content="write a 20 words sentence about sea.")]
    ):
        print("astream", res)

await run_aio_stream()

chat resp: content='您好,您似乎输入' additional_kwargs={} example=False
chat resp: content='了一个话题标签,请问需要我帮您找到什么资料或者帮助您解答什么问题吗?' additional_kwargs={} example=False
chat resp: content='' additional_kwargs={} example=False
generations=[[ChatGeneration(text="The sea is a vast expanse of water that covers much of the Earth's surface. It is a source of travel, trade, and entertainment, and is also a place of scientific exploration and marine conservation. The sea is an important part of our world, and we should cherish and protect it.", generation_info={'finish_reason': 'finished'}, message=AIMessage(content="The sea is a vast expanse of water that covers much of the Earth's surface. It is a source of travel, trade, and entertainment, and is also a place of scientific exploration and marine conservation. The sea is an important part of our world, and we should cherish and protect it.", additional_kwargs={}, example=False))]] llm_output={} run=[RunInfo(run_id=UUID('d48160a6-5960-4c1d-8a0e-90e6b51a209b'))]
astream content='The sea is a vast' additional_kwargs={} example=False
astream content=' expanse of water, a place of mystery and adventure. It is the source of many cultures and civilizations, and a center of trade and exploration. The sea is also a source of life and beauty, with its unique marine life and diverse' additional_kwargs={} example=False
astream content=' coral reefs. Whether you are swimming, diving, or just watching the sea, it is a place that captivates the imagination and transforms the spirit.' additional_kwargs={} example=False

Using Different Models

If you want to deploy your own model based on ERNIE-Bot-turbo or a third-party open-source model, you can follow these steps:

chatBloom = QianfanChatEndpoint(
    streaming=True,
    model="BLOOMZ-7B",
)
res = chatBloom([HumanMessage(content="hi")])
print(res)

content='你好!很高兴见到你。' additional_kwargs={} example=False

Model Hyperparameters

Currently, "ERNIE-Bot-4", "ERNIE-Bot", and "ERNIE-Bot-turbo" also support the following model parameters:
  • temperature
  • top_p
  • penalty_score

res = chat.stream(
    [HumanMessage(content="hi")],
    **{"top_p": 0.4, "temperature": 0.1, "penalty_score": 1},
)
for r in res:
    print(r)

content='您好,您似乎输入' additional_kwargs={} example=False
content='了一个文本字符串,但并没有给出具体的问题或场景。' additional_kwargs={} example=False
content='如果您能提供更多信息,我可以更好地回答您的问题。' additional_kwargs={} example=False
content='' additional_kwargs={} example=False

Embeddings

from langchain.embeddings import QianfanEmbeddingsEndpoint
import os

os.environ["QIANFAN_AK"] = "your_ak"
os.environ["QIANFAN_SK"] = "your_sk"

embed = QianfanEmbeddingsEndpoint(
    # qianfan_ak='xxx',
    # qianfan_sk='xxx'
)

res = embed.embed_documents(["hi", "world"])

# Async single-query embedding (top-level await works in a notebook)
async def aioEmbed():
    res = await embed.aembed_query("qianfan")
    print(res[:8])

await aioEmbed()

# Async batch embedding
async def aioEmbedDocs():
    res = await embed.aembed_documents(["hi", "world"])
    for r in res:
        print("", r[:8])

await aioEmbedDocs()

[-0.03313107788562775, 0.052325375378131866, 0.04951248690485954, 0.0077608139254152775, -0.05907672271132469, -0.010798933915793896, 0.03741293027997017, 0.013969100080430508]
[0.0427522286772728, -0.030367236584424973, -0.14847028255462646, 0.055074431002140045, -0.04177454113960266, -0.059512972831726074, -0.043774791061878204, 0.0028191760648041964]
[0.03803155943751335, -0.013231384567916393, 0.0032379645854234695, 0.015074018388986588, -0.006529552862048149, -0.13813287019729614, 0.03297128155827522, 0.044519297778606415]

Knowledge Base QA in Practice

Building on the most common knowledge-base QA use case, we have implemented a set of best practices for document QA based on Langchain and Qianfan.
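As a rough sketch of the retrieval step behind such a document-QA pipeline (assuming the documents were already vectorized, e.g. with QianfanEmbeddingsEndpoint as shown earlier): embed the question, rank the document vectors by cosine similarity, and hand the top hits to the chat model as context. The toy vectors and helper names below are illustrative and not part of the Qianfan SDK:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, doc_vecs, k=1):
    """Return indices of the k documents most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy 3-dimensional vectors standing in for real embeddings
docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
query = [1.0, 0.1, 0.0]
print(top_k(query, docs, k=2))  # [2, 0]: doc 2 is closest, then doc 0
```

In a real pipeline the retrieved documents would then be concatenated into the prompt for QianfanChatEndpoint; a vector store handles the ranking at scale.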