Overview: This article walks through the complete process of installing and configuring Dify, Ollama, and DeepSeek on Ubuntu 22.04, covering environment preparation, dependency installation, service deployment, and debugging/optimization, so developers can quickly stand up an AI application development environment.
First, verify the system version with lsb_release -a and confirm it is Ubuntu 22.04 LTS (Jammy Jellyfish). This release has good support for Python 3.10+ and Docker, making it a solid base for deploying an AI toolchain.
Run the following commands to install the base dependencies:
sudo apt update
sudo apt install -y curl wget git python3-pip python3-venv docker.io docker-compose
Key components:
- python3-venv: creates isolated Python environments to avoid dependency conflicts
- docker.io: the core tool for containerized deployment
- docker-compose: simplifies multi-container service orchestration

Add the current user to the docker group so that sudo is not required for every Docker command:
sudo usermod -aG docker $USER
newgrp docker  # take effect immediately
Verify permissions: docker run hello-world should complete without a permission error.
Ollama is a lightweight LLM serving framework that supports running a variety of models:
curl -fsSL https://ollama.com/install.sh | sh
Verify after installation:
ollama --version  # should print the version
ollama list       # list locally installed models
As an example, deploy a 7B-parameter model:
ollama pull deepseek-math-7b  # download the math-specialized model
ollama run deepseek-math-7b   # start an interactive session
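Beyond the interactive CLI, Ollama also exposes an HTTP API on port 11434. As a minimal sketch (assuming the service is running locally and the model tag matches what was pulled above), the same prompt can be sent programmatically with only the standard library:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default generate endpoint

def build_payload(model: str, prompt: str) -> dict:
    """Build a non-streaming generate request body for the Ollama REST API."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str) -> str:
    """POST a prompt to a locally running Ollama instance and return the reply text."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With the service up, `ask("deepseek-math-7b", "What is 2 + 2?")` returns the model's answer as a string; with `"stream": False` the server sends one complete JSON object instead of a stream of chunks.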
Performance tuning tips:
- --gpu-layers: number of layers to offload to the GPU for acceleration
- --num-gpu: control multi-GPU parallelism
- --context-size: tune long-context handling
git clone https://github.com/langgenius/dify.git
cd dify
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Edit the key parameters in the .env file:
# Database configuration
DATABASE_URL=postgresql://user:pass@localhost:5432/dify
# Ollama service address
OLLAMA_API_URL=http://localhost:11434
# Model selection
DEFAULT_MODEL=deepseek-math-7b
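As a sanity check on the file above, .env entries of this shape can be parsed with a few lines of Python. This is a minimal sketch for debugging (real deployments typically rely on python-dotenv or the framework's own loader):

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")  # split on the first '=' only
        env[key.strip()] = value.strip()
    return env

sample = """
# Ollama service address
OLLAMA_API_URL=http://localhost:11434
DEFAULT_MODEL=deepseek-math-7b
"""
config = parse_env(sample)
```

Splitting on the first `=` only matters here because values like URLs can themselves contain `=` characters.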
sudo apt install postgresql postgresql-contrib
sudo -u postgres psql
# Inside PostgreSQL, run:
CREATE DATABASE dify;
CREATE USER dify_user WITH PASSWORD 'securepass';
GRANT ALL PRIVILEGES ON DATABASE dify TO dify_user;
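The credentials created above must match the DATABASE_URL in .env. A small helper (a sketch; the function name is illustrative) that builds the URL and percent-encodes the password avoids a subtly broken URL when the password contains characters like @ or ::

```python
from urllib.parse import quote

def build_db_url(user: str, password: str, host: str, port: int, db: str) -> str:
    """Assemble a postgresql:// connection URL, percent-encoding the password."""
    return f"postgresql://{user}:{quote(password, safe='')}@{host}:{port}/{db}"

# Matches the database and role created in the psql session above
url = build_db_url("dify_user", "securepass", "localhost", 5432, "dify")
```

For example, a password of `p@ss` is encoded as `p%40ss`, so the `@` cannot be misread as the host separator.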
# Development mode (with hot reload)
FLASK_APP=main.py FLASK_ENV=development flask run
# Production mode (gunicorn recommended)
gunicorn -w 4 -b 0.0.0.0:8000 main:app
Fetch the model weights from HuggingFace:
git lfs install
git clone https://huggingface.co/deepseek-ai/DeepSeek-Math-7B
Or register it with Ollama directly:
# Write a Modelfile pointing at the downloaded weights, e.g. "FROM ./DeepSeek-Math-7B"
ollama create deepseek-math-7b -f ./Modelfile
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    learning_rate=2e-5,
    fp16=True,  # enable mixed-precision training
)
trainer = Trainer(
    model=model,                  # a previously loaded causal LM
    args=training_args,
    train_dataset=train_dataset,  # a prepared training Dataset
)
trainer.train()
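One detail of the settings above worth double-checking: the effective batch size per optimizer step is per_device_train_batch_size × gradient_accumulation_steps × number of GPUs. A quick sketch of that arithmetic:

```python
def effective_batch_size(per_device: int, grad_accum: int = 1, num_gpus: int = 1) -> int:
    """Total number of examples contributing to one optimizer step."""
    return per_device * grad_accum * num_gpus

# per_device_train_batch_size=4 on a single GPU with no accumulation:
assert effective_batch_size(4) == 4
# the same per-device setting with gradient_accumulation_steps=8 across 2 GPUs:
assert effective_batch_size(4, grad_accum=8, num_gpus=2) == 64
```

This is why raising gradient_accumulation_steps lets you keep the effective batch size while lowering per-device memory use.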
A 4-bit quantization example:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "deepseek-ai/DeepSeek-Math-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
quantized_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16,
    # GPTQ calibration: quantize to 4 bits, calibrating on the "c4" dataset
    quantization_config=GPTQConfig(bits=4, desc_act=False, dataset="c4", tokenizer=tokenizer),
)
curl -X POST http://localhost:8000/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-math-7b",
    "messages": [{"role": "user", "content": "Solve the equation x²+2x-3=0"}]
  }'
The expected response should contain the two real solutions, x = 1 and x = -3.
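The model's answer to this smoke test can be verified mechanically; a short script applying the quadratic formula to x² + 2x - 3 = 0:

```python
import math

def real_roots(a: float, b: float, c: float) -> set:
    """Real roots of ax² + bx + c = 0 via the quadratic formula."""
    disc = b * b - 4 * a * c  # discriminant
    if disc < 0:
        return set()          # no real roots
    sq = math.sqrt(disc)
    return {(-b + sq) / (2 * a), (-b - sq) / (2 * a)}

# x² + 2x - 3 = 0  factors as  (x + 3)(x - 1) = 0
assert real_roots(1, 2, -3) == {1.0, -3.0}
```

Checking the model's output against ground truth like this is a cheap way to validate the whole Dify → Ollama → model pipeline end to end.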
Use htop and nvidia-smi to monitor resource usage:
watch -n 1 nvidia-smi  # refresh GPU status every second (equivalently: nvidia-smi -l 1)
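For logging rather than eyeballing, nvidia-smi's CSV query mode is easier to parse. A sketch (the live query assumes an NVIDIA driver is installed; the parser is exercised against a sample line):

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]

def parse_gpu_csv(text: str) -> list:
    """Parse 'util, mem_used, mem_total' CSV rows into dicts (percent / MiB)."""
    rows = []
    for line in text.strip().splitlines():
        util, used, total = (int(x.strip()) for x in line.split(","))
        rows.append({"util_pct": util, "mem_used_mib": used, "mem_total_mib": total})
    return rows

def sample_gpus() -> list:
    """Run nvidia-smi once and return parsed per-GPU stats (requires a GPU)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)

# Parser demo on one sample output row:
stats = parse_gpu_csv("87, 10240, 24576\n")
```

Calling `sample_gpus()` in a loop gives a lightweight time series of GPU load and memory without any extra tooling.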
Key metrics to watch: GPU utilization, memory usage, and temperature.
When ERROR: pip's dependency resolver appears:
Pin the conflicting package to a known-good version, then verify:
pip install package==x.y.z
pip check  # verify dependency integrity

Another common error: RuntimeError: Error(s) in loading state_dict
Solution: pass the strict=False parameter:
model.load_state_dict(torch.load(path), strict=False)
To reduce memory pressure during training:
- Enable gradient checkpointing: model.gradient_checkpointing_enable()
- Use deepspeed for distributed training
- Lower batch_size and raise gradient_accumulation_steps

docker-compose.yml example:
version: '3'
services:
  dify:
    image: dify-custom
    build: .
    ports:
      - "8000:8000"
    environment:
      - OLLAMA_API_URL=http://ollama:11434
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    volumes:
      - ./models:/models
Nginx configuration example:
upstream dify_servers {
    server dify1:8000 weight=3;
    server dify2:8000 weight=2;
}
server {
    listen 80;
    location / {
        proxy_pass http://dify_servers;
        proxy_set_header Host $host;
    }
}
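The weight=3 / weight=2 settings mean roughly 60% of requests land on dify1. A quick simulation of naive weighted round-robin makes the ratio concrete (a simplification: Nginx actually uses smooth weighted round-robin, but the long-run distribution is the same):

```python
from itertools import cycle

def weighted_sequence(servers: dict) -> list:
    """Expand {server: weight} into one round of a naive weighted rotation."""
    return [name for name, weight in servers.items() for _ in range(weight)]

def distribute(servers: dict, n_requests: int) -> dict:
    """Count how many of n_requests each server receives under the rotation."""
    counts = {name: 0 for name in servers}
    for _, name in zip(range(n_requests), cycle(weighted_sequence(servers))):
        counts[name] += 1
    return counts

counts = distribute({"dify1": 3, "dify2": 2}, 1000)
# 1000 requests at weights 3:2 → 600 vs 400
assert counts == {"dify1": 600, "dify2": 400}
```

This kind of back-of-the-envelope check is useful when deciding how to split traffic between instances of different capacity.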
Prometheus configuration example:
scrape_configs:
  - job_name: 'dify'
    static_configs:
      - targets: ['dify:8000']
    metrics_path: '/metrics'
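For this scrape config to work, the service must serve /metrics in the Prometheus text exposition format. The line format is simple enough to generate by hand, as the sketch below shows (in practice the prometheus_client library is the usual choice; the metric names here mirror the ones listed below):

```python
def format_metric(name: str, value, labels: dict = None) -> str:
    """Render one sample line in the Prometheus text exposition format."""
    if labels:
        label_str = ",".join(f'{k}="{v}"' for k, v in labels.items())
        return f"{name}{{{label_str}}} {value}"
    return f"{name} {value}"

line = format_metric("gpu_memory_usage_bytes", 1073741824, {"gpu": "0"})
# → gpu_memory_usage_bytes{gpu="0"} 1073741824
```

Each sample is just `name{label="value"} number`, one per line, which is why the format is easy to debug with curl against the /metrics endpoint.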
Key monitoring metrics:
- request_latency_seconds
- model_inference_time
- gpu_memory_usage_bytes

This guide covers the full workflow from environment setup to production deployment. Thanks to the modular design and containerized approach, it can be adapted to AI applications of different scales. For a real deployment, validate all components in a test environment first, then migrate to production step by step.