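Summary: This article walks through the full workflow for bulk-deleting images through the Harbor API, covering environment preparation, API calls, scripting, and error handling, to help operations engineers manage an image registry efficiently.
In DevOps practice, Harbor serves as an enterprise-grade image registry, and how well its storage is managed directly affects CI/CD pipeline efficiency. When a project iterates quickly, old images that are never cleaned up rapidly consume storage.
Deleting images manually, one at a time through the web UI, has clear drawbacks: removing several hundred images takes more than 2 hours, and it is easy to overlook images that matter. Bulk operations through the Harbor API cut the processing time to minutes and keep the operation traceable.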
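The Harbor v2.0 REST API accepts HTTP Basic authentication with a user account or a robot account. Check that the credentials work before scripting anything:

    # Verify the credentials against the current-user endpoint
    curl -u "username:password" -X GET "https://harbor-domain/api/v2.0/users/current" -H "accept: application/json"

A 200 response containing the account's JSON profile confirms that the credentials are accepted.
Store the credentials in environment variables rather than in scripts:

    export HARBOR_USER="username"
    export HARBOR_PASSWORD="password"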
First, list the artifacts of a repository through `/api/v2.0/projects/{project_name}/repositories/{repository_name}/artifacts`:

    curl -u "$HARBOR_USER:$HARBOR_PASSWORD" -X GET \
      "https://harbor-domain/api/v2.0/projects/library/repositories/nginx/artifacts?with_tag=true" \
      -H "accept: application/json"
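Key response fields:

- `id`: unique identifier of the artifact
- `digest`: content digest of the artifact, which is the reference passed to the delete endpoint
- `tags`: array of tag objects, each with a `name`; empty or null for untagged artifacts
- `repository`: repository path (format: `library/nginx`)

Python example script (requires the `requests` library):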
    import os
    import requests

    HARBOR_URL = "https://harbor-domain"
    PROJECT = "library"    # project name
    REPOSITORY = "nginx"   # repository name inside the project
    # Basic-auth credentials are read from the environment
    AUTH = (os.environ["HARBOR_USER"], os.environ["HARBOR_PASSWORD"])
    BASE = f"{HARBOR_URL}/api/v2.0/projects/{PROJECT}/repositories/{REPOSITORY}/artifacts"

    def get_artifacts():
        """Return the first page of artifacts, including their tags."""
        response = requests.get(BASE, params={"with_tag": "true"}, auth=AUTH)
        response.raise_for_status()
        return response.json()

    def delete_artifact(digest):
        """Delete one artifact (and all of its tags) by digest; return the status code."""
        response = requests.delete(f"{BASE}/{digest}", auth=AUTH)
        return response.status_code

    # Main logic
    artifacts = get_artifacts()
    for artifact in artifacts:
        tags = artifact.get("tags") or []  # untagged artifacts have no tags
        # Example deletion rule: any tag whose name contains "old-version"
        if any("old-version" in tag["name"] for tag in tags):
            status = delete_artifact(artifact["digest"])
            print(f"Deleted {artifact['digest']}: status {status}")
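More selective deletion rules are possible.
One option is to filter on the server by combining the artifact list endpoint with the `q` parameter. Harbor's `q` syntax uses `k=v` (exact match), `k=~v` (fuzzy match), and `k=[min~max]` (range) rather than comparison operators such as `<`. The request below assumes the deployed Harbor version allows range filtering on `push_time`; if it does not, compare each artifact's `push_time` on the client instead.

    curl -G -u "$HARBOR_USER:$HARBOR_PASSWORD" \
      "https://harbor-domain/api/v2.0/projects/library/repositories/nginx/artifacts" \
      --data-urlencode "q=push_time=[2000-01-01~2023-01-01]"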
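Another option is to match tag names against a regular expression:

    import re

    pattern = re.compile(r'^v\d+\.\d+\.\d+$')  # matches semantic-version tags such as v1.2.3

    # `artifact` and `delete_artifact` come from the script above
    for tag in artifact.get('tags') or []:
        if pattern.match(tag['name']):
            delete_artifact(artifact['digest'])  # perform the deletion
            break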
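The date-based rule can also be applied on the client, which avoids depending on server-side filter support. The following is a minimal sketch, assuming each artifact dict carries an ISO 8601 `push_time` string such as `2022-11-30T08:15:00.000Z`; the helper names `parse_push_time` and `pushed_before` are hypothetical and not part of Harbor or of the script above.

```python
from datetime import datetime, timezone

def parse_push_time(value):
    """Parse an ISO 8601 timestamp such as '2022-11-30T08:15:00.000Z'."""
    # Older Python versions reject a trailing 'Z' in fromisoformat()
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def pushed_before(artifacts, cutoff):
    """Return the artifacts whose push_time is strictly earlier than cutoff."""
    return [a for a in artifacts if parse_push_time(a["push_time"]) < cutoff]

# Sample data standing in for a real list response
sample = [
    {"digest": "sha256:aaa", "push_time": "2022-11-30T08:15:00.000Z"},
    {"digest": "sha256:bbb", "push_time": "2023-03-01T10:00:00.000Z"},
]
cutoff = datetime(2023, 1, 1, tzinfo=timezone.utc)
old = pushed_before(sample, cutoff)
print([a["digest"] for a in old])
```

Using a timezone-aware cutoff keeps the comparison valid, since the parsed timestamps carry a UTC offset.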
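| Status code | Cause | Solution |
|---|---|---|
| 401 | Authentication invalid or expired | Authenticate again |
| 403 | Insufficient permissions | Check the account's project role |
| 404 | Image does not exist | Verify the digest value |
| 429 | Too many requests | Implement exponential backoff |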
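The exponential backoff suggested for 429 responses might look like the following minimal sketch. It wraps any zero-argument callable that returns an HTTP status code, so it is independent of how the request itself is made. The function name `call_with_backoff`, the retryable status set, and the delay values are illustrative assumptions, not Harbor recommendations.

```python
import random
import time

RETRYABLE = {429, 502, 503}  # assumed set of status codes worth retrying

def call_with_backoff(action, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call action() until it returns a non-retryable status code.

    action is any zero-argument callable returning an HTTP status code.
    The wait doubles after each retryable response, plus a little jitter.
    """
    status = None
    for attempt in range(max_attempts):
        status = action()
        if status not in RETRYABLE:
            return status
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    return status  # still retryable after max_attempts

# Stand-in for a delete call: two 429 responses, then success
responses = iter([429, 429, 200])
waits = []  # records the delays instead of actually sleeping
result = call_with_backoff(lambda: next(responses), sleep=waits.append)
print(result, len(waits))
```

In a real script the callable would be something like `lambda: delete_artifact(digest)`, and the default `time.sleep` would be left in place.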
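Deletions can be issued concurrently with a thread pool:

    import concurrent.futures

    # `artifacts` and `delete_artifact` come from the script above
    digests = [artifact['digest'] for artifact in artifacts]
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        futures = [executor.submit(delete_artifact, digest) for digest in digests]
        for future in concurrent.futures.as_completed(futures):
            print(future.result())  # HTTP status code of each deletion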
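Large result sets are paginated with the `page` and `page_size` parameters:

    curl -u "$HARBOR_USER:$HARBOR_PASSWORD" -X GET \
      "https://harbor-domain/api/v2.0/projects/library/repositories/nginx/artifacts?page=1&page_size=100"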
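To walk every page rather than only the first, a small generator can keep requesting pages until a short one comes back. This is a sketch under the assumption that the endpoint returns a JSON list per page; `iter_all_pages` and `fetch_page` are hypothetical names, and the fake data source stands in for a live Harbor instance.

```python
def iter_all_pages(fetch_page, page_size=100):
    """Yield every item by requesting pages until a short page comes back.

    fetch_page(page, page_size) is a caller-supplied function returning a list,
    for example a thin wrapper around an HTTP GET on the artifact list endpoint.
    """
    page = 1
    while True:
        items = fetch_page(page, page_size)
        yield from items
        if len(items) < page_size:
            break
        page += 1

# Stand-in data source with 250 fake artifacts
fake_store = [{"digest": f"sha256:{i:04d}"} for i in range(250)]

def fake_fetch(page, page_size):
    start = (page - 1) * page_size
    return fake_store[start:start + page_size]

all_items = list(iter_all_pages(fake_fetch))
print(len(all_items))
```

Collecting the full list before deleting anything is the safer order, because deleting while paging shifts later items onto earlier pages and can cause some to be skipped.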
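Before deleting anything, mark images that must be kept by adding a backup label through `/api/v2.0/projects/{project_name}/repositories/{repository_name}/artifacts/{reference}/labels`.
Preparation phase:
Execution phase: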
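    # 1. Fetch the list of candidate artifacts
    python get_artifacts.py > artifacts.json

    # 2. Keep only the artifacts to delete (any tag starting with "test-")
    jq '[.[] | select(any(.tags[]?; .name | test("^test-")))]' artifacts.json > to_delete.json

    # 3. Run the bulk deletion
    python batch_delete.py --input to_delete.json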
Verification phase:
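The article does not show `batch_delete.py`, so the following is only a sketch of what the core of the execution and verification phases could look like. It assumes the input is a JSON array of artifact objects with a `digest` field; the function names `unique_digests`, `run_batch`, and `find_survivors` are hypothetical, and the delete step is passed in as a callable so that a dry run is the default.

```python
def unique_digests(artifacts):
    """Return each artifact digest once, preserving input order."""
    return list(dict.fromkeys(a["digest"] for a in artifacts))

def run_batch(digests, delete, dry_run=True):
    """Call delete(digest) for each digest and return {digest: status}.

    With dry_run=True nothing is deleted and every status is None.
    """
    results = {}
    for digest in digests:
        results[digest] = None if dry_run else delete(digest)
    return results

def find_survivors(deleted_digests, remaining_artifacts):
    """Return digests that were meant to be deleted but are still listed."""
    still_present = {a["digest"] for a in remaining_artifacts}
    return sorted(set(deleted_digests) & still_present)

# Demo with stand-ins; a real run would pass functions that call Harbor
to_delete = [{"digest": "sha256:aaa"}, {"digest": "sha256:aaa"}, {"digest": "sha256:bbb"}]
digests = unique_digests(to_delete)
print(run_batch(digests, delete=lambda d: 200))                 # dry run
print(run_batch(digests, delete=lambda d: 200, dry_run=False))  # statuses recorded
print(find_survivors(digests, [{"digest": "sha256:bbb"}]))      # listing after deletion
```

An empty result from `find_survivors` means every targeted artifact is gone from the listing. Storage is reclaimed only after Harbor's garbage collection has run, so disk usage may not drop immediately.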
Add a cleanup step to the Jenkinsfile:
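    pipeline {
        agent any
        stages {
            stage('Clean Old Images') {
                steps {
                    // IMAGE_DIGEST, HARBOR_USER and HARBOR_PASSWORD are expected as environment variables
                    sh '''
                        curl -s -u "${HARBOR_USER}:${HARBOR_PASSWORD}" -X DELETE \
                          "https://harbor-domain/api/v2.0/projects/library/repositories/nginx/artifacts/${IMAGE_DIGEST}"
                    '''
                }
            }
        }
    }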
To clean several projects, fetch the project list through `/api/v2.0/projects` and loop over it:
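    import os
    import requests

    HARBOR_URL = "https://harbor-domain"
    AUTH = (os.environ["HARBOR_USER"], os.environ["HARBOR_PASSWORD"])

    projects = requests.get(f"{HARBOR_URL}/api/v2.0/projects", auth=AUTH).json()
    for project in projects:
        if project['name'].startswith('dev-'):
            print(f"Cleaning project {project['name']}")  # run the project-specific cleanup here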
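When looping over projects, each repository's artifact URL has to be built from the project and repository names. The sketch below assumes the v2.0 path layout `/projects/{project_name}/repositories/{repository_name}/artifacts`, that repository listings return names prefixed with the project (such as `dev-web/team/api`), and that slashes inside the repository name must be URL-encoded twice, as the Harbor API description states. The helper name `artifact_url` is hypothetical.

```python
from urllib.parse import quote

def artifact_url(base_url, project, repository, reference=None):
    """Build the v2.0 artifact URL for a repository, or for one artifact in it.

    repository may be given with or without the leading 'project/' prefix.
    Slashes inside the repository name are URL-encoded twice.
    """
    prefix = project + "/"
    if repository.startswith(prefix):
        repository = repository[len(prefix):]
    encoded = quote(quote(repository, safe=""), safe="")
    url = f"{base_url}/api/v2.0/projects/{project}/repositories/{encoded}/artifacts"
    if reference is not None:
        url += "/" + reference
    return url

print(artifact_url("https://harbor-domain", "dev-web", "dev-web/team/api"))
print(artifact_url("https://harbor-domain", "library", "nginx", "sha256:abc"))
```

Keeping URL construction in one place makes it harder to forget the double encoding for nested repository names.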
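Bulk deletion through the Harbor API turns hours of manual work into a task that finishes in minutes. The author reports testing this on a scenario with 1,000 images.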
Future Harbor API releases may extend these capabilities.
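Operations teams should set up a regular cleanup routine (for example, once a week) and combine it with a label policy (such as a `keep:true` label) to automate image management. For very large environments, consider building a dedicated Harbor management tool that integrates a richer business-rule engine.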
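The `keep:true` label policy can be enforced in the cleanup script by dropping protected artifacts from the candidate list before any delete call. This is a minimal sketch, assuming the artifacts were listed with labels included (for example with `with_label=true`) so that each dict has a `labels` array of objects with a `name`; the function name `deletable` is hypothetical.

```python
def deletable(artifacts, protected_labels=("keep:true",)):
    """Return the artifacts that carry none of the protected label names."""
    protected = set(protected_labels)
    result = []
    for artifact in artifacts:
        # labels may be missing or null when an artifact has none
        names = {label["name"] for label in artifact.get("labels") or []}
        if not names & protected:
            result.append(artifact)
    return result

# Sample data standing in for a real list response
sample = [
    {"digest": "sha256:aaa", "labels": [{"name": "keep:true"}]},
    {"digest": "sha256:bbb", "labels": None},
    {"digest": "sha256:ccc", "labels": [{"name": "nightly"}]},
]
print([a["digest"] for a in deletable(sample)])
```

Running this filter last, after any tag or date rule, means a protected image survives even when another rule would have selected it.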