Introduction: This article gives Windows users a complete guide to deploying a local AI agent, covering installation and configuration of the Deepseek model, web search, and the RAGFlow framework, so developers can quickly build a private intelligent system.
Windows 10/11 Professional (64-bit) is the minimum requirement; 16 GB+ of RAM, an NVIDIA GPU (CUDA 11.8+), and at least 100 GB of free storage are recommended. In CPU-only mode, close other high-load programs first.
Create an isolated Python environment with Anaconda:
conda create -n ai_agent python=3.10
conda activate ai_agent
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
The quantized Deepseek-R1-7B or 14B builds are recommended; download them from HuggingFace:
git lfs install
git clone https://huggingface.co/deepseek-ai/Deepseek-R1-7B-Q4_K_M
Note: the 7B model is about 14 GB and the 14B about 28 GB, so make sure you have enough disk space.
Install vLLM as the inference backend:
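Before cloning, a quick free-space check can save an aborted multi-gigabyte download. A minimal standard-library sketch (the 30 GB threshold is an assumption; adjust it to the model you pick):

```python
import shutil

def has_free_space(path: str, required_gb: float) -> bool:
    """Return True if the drive containing `path` has at least `required_gb` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024 ** 3

# Example: require roughly 30 GB before pulling the 14B quantized weights
if not has_free_space(".", 30):
    print("Not enough disk space - free up storage before cloning the model.")
```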
pip install vllm
Create a launch script, start_vllm.py:
from vllm import LLM, SamplingParams

# vLLM takes the model path via the constructor (there is no LLM.from_pretrained)
llm = LLM(model="Deepseek-R1-7B-Q4_K_M")
sampling_params = SamplingParams(temperature=0.7, top_p=0.9)
outputs = llm.generate(["Explain the principles of quantum computing"], sampling_params)
print(outputs[0].outputs[0].text)
Before launching, pin the GPU with export CUDA_VISIBLE_DEVICES=0 (use set instead of export in Windows cmd) and cap the batch size with max_batch_size=16.
Using SerpAPI as an example, obtain an API key and create search_config.json:
{
"engine": "google",
"api_key": "YOUR_API_KEY",
"location": "China"
}
Build the search service with FastAPI:
from fastapi import FastAPI
from serpapi import GoogleSearch

app = FastAPI()

@app.post("/search")
async def web_search(query: str):
    params = {
        "q": query,
        "api_key": "YOUR_API_KEY",
        "location": "China",
    }
    search = GoogleSearch(params)
    results = search.get_dict()
    # .get avoids a KeyError when the response carries no organic results
    return {"results": results.get("organic_results", [])}
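Hard-coding the API key in both the config file and the service invites drift; the service can read the search_config.json created above instead. A minimal sketch (the helper name and the fail-fast key check are my additions):

```python
import json
from pathlib import Path

def load_search_config(path: str = "search_config.json") -> dict:
    """Load SerpAPI settings, failing fast if required keys are missing."""
    cfg = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = {"engine", "api_key"} - cfg.keys()
    if missing:
        raise ValueError(f"search_config.json is missing keys: {missing}")
    return cfg
```

Inside the endpoint, the params dict then becomes `{"q": query, **load_search_config()}`.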
Install the RAGFlow framework from source:
git clone https://github.com/PKU-YuanGroup/RAGFlow.git
cd RAGFlow
pip install -e .
Modify the key parameters in config.yaml:
model:
  name: deepseek-r1
  path: ./Deepseek-R1-7B-Q4_K_M
embedding:
  model: BAAI/bge-small-en-v1.5
vector_db:
  type: chromadb
  path: ./vector_store
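A typo in config.yaml usually surfaces only at launch time. A small illustrative validator (the required-key layout simply mirrors the fragment above; RAGFlow itself does not ship this helper):

```python
def validate_rag_config(cfg: dict) -> list[str]:
    """Return a list of problems with a RAGFlow-style config dict (empty = OK)."""
    required = {
        "model": ("name", "path"),
        "embedding": ("model",),
        "vector_db": ("type", "path"),
    }
    problems = []
    for section, keys in required.items():
        sub = cfg.get(section)
        if not isinstance(sub, dict):
            problems.append(f"missing section: {section}")
            continue
        problems += [f"{section}.{k} not set" for k in keys if k not in sub]
    return problems

# Dict form of the config.yaml fragment above
cfg = {"model": {"name": "deepseek-r1", "path": "./Deepseek-R1-7B-Q4_K_M"},
       "embedding": {"model": "BAAI/bge-small-en-v1.5"},
       "vector_db": {"type": "chromadb", "path": "./vector_store"}}
assert validate_rag_config(cfg) == []
```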
Load documents into the knowledge base, for example from a PDF:
from langchain.document_loaders import PyPDFLoader

loader = PyPDFLoader("tech_docs.pdf")
pages = loader.load_and_split()
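load_and_split chunks the pages for retrieval. To make the chunking idea concrete, here is a naive fixed-window splitter with overlap; it is a simplified stand-in for LangChain's text splitters, not the library's implementation:

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split `text` into windows of `chunk_size` characters that overlap by
    `overlap` characters, so facts straddling a boundary stay retrievable."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Smaller chunks sharpen retrieval precision; the overlap keeps sentences cut at a boundary present in both neighbors.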
Use Celery as an asynchronous task queue:
from celery import Celery

app = Celery('tasks', broker='redis://localhost:6379/0')

@app.task
def process_query(query):
    # Call the search API
    search_results = call_search_api(query)
    # Build the RAG context
    context = generate_rag_context(query, search_results)
    # Run model inference
    response = deepseek_infer(context)
    return response
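generate_rag_context above is left abstract. One plausible sketch: stitch the top web snippets into a grounding prompt (the function body, the 'title'/'snippet' keys, and the three-snippet cap are assumptions based on SerpAPI-style results):

```python
def generate_rag_context(query: str, search_results: list[dict],
                         max_snippets: int = 3) -> str:
    """Assemble a grounding prompt from web search snippets.
    Each result is assumed to carry 'title' and 'snippet' keys."""
    lines = [f"- {r.get('title', '')}: {r.get('snippet', '')}"
             for r in search_results[:max_snippets]]
    return ("Answer the question using the context below.\n"
            "Context:\n" + "\n".join(lines) +
            f"\n\nQuestion: {query}")
```

Capping the snippet count keeps the prompt inside the model's context window.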
Test the endpoints with Postman:
/health: system status check
/chat: full conversation flow
Sample /chat request body:
{
  "query": "Explain how a transformer works",
  "history": [
    {"user": "What is AI?", "assistant": "Artificial intelligence is..."}
  ]
}
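Before the request reaches the model, the history list has to be flattened into a single prompt. A sketch of that step (the "User:"/"Assistant:" labels are an assumption; adapt them to the chat template your model expects):

```python
def build_chat_prompt(query: str, history: list[dict]) -> str:
    """Flatten a /chat request body into one prompt string.
    Each history entry is assumed to carry 'user' and 'assistant' keys,
    matching the sample request body above."""
    turns = []
    for turn in history:
        turns.append(f"User: {turn['user']}")
        turns.append(f"Assistant: {turn['assistant']}")
    turns.append(f"User: {query}")
    turns.append("Assistant:")   # leave the final turn open for the model
    return "\n".join(turns)
```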
WSL2 users can set the default distribution with wsl -s Ubuntu-22.04, and configure a download proxy with git config --global http.proxy if needed.
Integrate LLaVA for image-text understanding:
from llava.model.builder import load_pretrained_model

# load_pretrained_model returns (tokenizer, model, image_processor, context_len)
tokenizer, model, image_processor, context_len = load_pretrained_model(
    "llava-v1.5-7b", model_base=None, model_name="llava-v1.5-7b")
Example Dockerfile:
FROM nvidia/cuda:11.8.0-base-ubuntu22.04
RUN apt update && apt install -y python3.10 python3-pip
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
CMD ["python3", "main.py"]
This tutorial walks through the full pipeline from environment setup to system integration; the modular design lets each component be upgraded independently. For real deployments, validate everything in a test environment first, then migrate to production step by step. Enterprise users may want to add audit logging, access control, and other enterprise-grade features.