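Summary: This article walks through the full process of deploying Sealos on private infrastructure, covering environment preparation, installation and configuration, cluster management, and operations tuning, to help enterprises run Kubernetes clusters privately in an efficient and stable way.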

The Complete Guide to Sealos Private Deployment: From Environment Preparation to Operations Management

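I. Core Value and Applicable Scenarios of Private Deployment

Sealos is a lightweight Kubernetes distribution. Its private deployment model hands full control to the user, which addresses the main pain points of public cloud services: data privacy, network dependency, and unpredictable cost. It is particularly suited to industries with strict data sovereignty requirements, such as finance, government, and healthcare, and to complex scenarios that need customized kernel parameters or network plugins. Compared with building a Kubernetes cluster by hand, Sealos can shorten the deployment cycle from weeks to under half an hour while still providing full cluster lifecycle management.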

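II. Environment Preparation Before Deployment

1. Hardware resource planning

- **Basic configuration**: at least 4 CPU cores and 8 GB of RAM per node; reserve storage according to the size of the image registry (typically 500 GB or more)
- **High availability**: production environments should start with at least 3 nodes, with inter-node network latency kept under 5 ms
- **Storage**: local disks (ext4/xfs), NFS, Ceph, and other backends are supported; plan the StorageClass in advance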
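
The minimums above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of Sealos: the function name check_min_resources is hypothetical, and the thresholds are simply the 4 cores / 8 GB / 500 GB figures quoted in the list.

```bash
#!/bin/sh
# Hypothetical pre-flight helper: compare a node's resources with the
# minimums quoted above (4 cores, 8 GB RAM, 500 GB disk).
# Arguments: <cpu_cores> <memory_gb> <disk_gb>, all integers.
check_min_resources() {
    cores=$1; mem_gb=$2; disk_gb=$3
    status=0
    if [ "$cores" -lt 4 ]; then
        echo "cpu: $cores cores < 4"; status=1
    fi
    if [ "$mem_gb" -lt 8 ]; then
        echo "memory: $mem_gb GB < 8"; status=1
    fi
    if [ "$disk_gb" -lt 500 ]; then
        echo "disk: $disk_gb GB < 500"; status=1
    fi
    # Print "ok" only when every check passed.
    if [ "$status" -eq 0 ]; then
        echo "ok"
    fi
    return "$status"
}
```

On a real node the three values could come from nproc, /proc/meminfo, and df. Note that a machine sold as "8 GB" usually reports slightly less usable memory, so round up before comparing.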

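2. Operating system requirements

- Compatible with mainstream Linux distributions such as CentOS 7.6+, Ubuntu 20.04+, and Debian 10+
- Swap must be disabled (swapoff -a, plus removing the swap entry from /etc/fstab so that it stays off after a reboot)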
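
Because swapoff -a only lasts until the next reboot, the swap entries in /etc/fstab also need to be disabled. A minimal sketch, assuming swap is declared in an fstab-style file; the function name disable_swap_entries is hypothetical, and it is worth trying on a copy of the file first:

```bash
#!/bin/sh
# Hypothetical helper: comment out swap entries in an fstab-style file so
# that swap stays off after a reboot. The path is passed in explicitly so
# the function can be tried on a copy before touching /etc/fstab.
disable_swap_entries() {
    fstab=$1
    # Skip lines that are already comments; prefix every other line that has
    # "swap" as a whitespace-delimited field with "#".
    # The original file is kept next to it as <file>.bak.
    sed -i.bak -E '/^[[:space:]]*#/! s/^(.*[[:space:]]swap[[:space:]].*)$/#\1/' "$fstab"
}
```

Typical use would be swapoff -a followed by disable_swap_entries /etc/fstab, run as root.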

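Kernel parameter tuning example:

# Load the bridge netfilter module; without it the net.bridge.* key below does not exist
modprobe br_netfilter
# Add the following to /etc/sysctl.conf
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-iptables=1
# RHEL/CentOS 7 kernels only; drop this line on Ubuntu/Debian, where the key does not exist
fs.may_detach_mounts=1
# Apply the settings
sysctl -p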
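
To confirm that the two networking parameters really are set, a small check can scan the config file. This is a sketch under assumptions: the function name missing_sysctl_keys is hypothetical, and it only inspects the file, not the running kernel.

```bash
#!/bin/sh
# Hypothetical check: print each required kernel parameter that is NOT set
# to 1 in a sysctl-style config file (default: /etc/sysctl.conf).
missing_sysctl_keys() {
    conf=${1:-/etc/sysctl.conf}
    for key in net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables; do
        # Split each line on "=", strip blanks, and look for an exact "key=1".
        if ! awk -F= -v k="$key" '
            { gsub(/[ \t]/, "", $1); gsub(/[ \t]/, "", $2) }
            $1 == k && $2 == "1" { found = 1 }
            END { exit !found }' "$conf"; then
            echo "$key"
        fi
    done
}
```

An empty result means both keys are present. The value the kernel is actually using can be read with sysctl -n followed by the key name.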

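3. Network configuration

Core ports that must be open:

| Protocol | Port range | Purpose |
|---|---|---|
| TCP | 6443 | API Server |
| TCP | 2379-2380 | etcd |
| UDP | 8472 | Overlay network (VXLAN) |
| TCP | 10250 | Kubelet |

Calico or Cilium is recommended as the CNI plugin. UDP 8472 is the default VXLAN port for Cilium and Flannel; Calico in VXLAN mode uses UDP 4789 instead. Plan the Pod CIDR in advance (for example 10.244.0.0/16) so that it does not overlap the Service CIDR or the node network.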
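
Overlapping address ranges are an easy mistake when planning the Pod CIDR. The sketch below checks two IPv4 CIDR blocks for overlap using plain shell arithmetic; the function names are hypothetical and the addresses in the usage note are only the examples that appear in this article.

```bash
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print the first and last address of a CIDR block as two integers.
cidr_range() {
    addr=${1%/*}; prefix=${1#*/}
    base=$(ip_to_int "$addr")
    size=$(( 1 << (32 - prefix) ))
    start=$(( base / size * size ))   # align to the network boundary
    echo "$start $(( start + size - 1 ))"
}

# Exit 0 if the two CIDR blocks share at least one address, 1 otherwise.
cidrs_overlap() {
    set -- $(cidr_range "$1") $(cidr_range "$2")
    # $1..$2 is the first range, $3..$4 the second.
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}
```

For instance, cidrs_overlap 10.244.0.0/16 10.96.0.0/12 reports no overlap, so that Pod CIDR is compatible with the Service range used later in the kube-apiserver flags.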

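III. Installing the Sealos Core Components

1. Single-machine quick trial

# Download the sealos binary (this URL serves the latest build; v4.2.0 is the version used as the example)
curl -sfL https://sealyun.oss-cn-beijing.aliyuncs.com/sealos/latest/sealos-amd64 -o sealos
chmod +x sealos && mv sealos /usr/bin
# Create the cluster: one master plus two workers here; drop --nodes to keep everything on a single machine.
# --passwd is the SSH password shared by all listed hosts.
# This image does not ship a CNI, so nodes stay NotReady until one (for example labring/calico) is installed.
sealos run labring/kubernetes:v1.28.0 \
  --masters 192.168.1.10 \
  --nodes 192.168.1.11,192.168.1.12 \
  --passwd yourpassword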

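2. Production high-availability deployment

2.1 Building the etcd cluster

# Generate TLS certificates (a CA must be prepared beforehand).
# The SAN list must contain every address that peers and clients connect to,
# which here includes the node IPs used in the URLs below.
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem \
  -config=ca-config.json -hostname="etcd1,etcd2,etcd3,${IP1},${IP2},${IP3},127.0.0.1" \
  -profile=kubernetes etcd-csr.json | cfssljson -bare etcd
# Start etcd (run on each node with its own ${NAME} and ${IP}; certificates live in /etc/etcd).
# The data directory is mounted from the host so that data survives container re-creation.
docker run -d --name etcd \
  --network host \
  -v /etc/etcd:/etc/etcd \
  -v /var/lib/etcd:/var/lib/etcd \
  registry.k8s.io/etcd:3.5.4-0 \
  etcd --name ${NAME} \
  --data-dir /var/lib/etcd \
  --initial-advertise-peer-urls https://${IP}:2380 \
  --listen-peer-urls https://${IP}:2380 \
  --listen-client-urls https://${IP}:2379,https://127.0.0.1:2379 \
  --advertise-client-urls https://${IP}:2379 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-cluster "etcd1=https://${IP1}:2380,etcd2=https://${IP2}:2380,etcd3=https://${IP3}:2380" \
  --initial-cluster-state new \
  --cert-file=/etc/etcd/etcd.pem \
  --key-file=/etc/etcd/etcd-key.pem \
  --trusted-ca-file=/etc/etcd/ca.pem \
  --peer-cert-file=/etc/etcd/etcd.pem \
  --peer-key-file=/etc/etcd/etcd-key.pem \
  --peer-trusted-ca-file=/etc/etcd/ca.pem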
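
The --initial-cluster value must be identical on every member, so it helps to generate it once instead of typing it three times. A minimal sketch; build_initial_cluster is a hypothetical helper, and it assumes every peer listens on https and port 2380 as in the command above.

```bash
#!/bin/sh
# Build the value for etcd's --initial-cluster flag from "name=ip" pairs.
# Assumes https peer URLs on port 2380, matching the docker command above.
build_initial_cluster() {
    result=""
    for pair in "$@"; do
        name=${pair%%=*}   # text before the first "="
        ip=${pair#*=}      # text after the first "="
        # Append with a comma separator, except for the first entry.
        result="${result:+$result,}${name}=https://${ip}:2380"
    done
    echo "$result"
}
```

It could then be used as --initial-cluster "$(build_initial_cluster etcd1=${IP1} etcd2=${IP2} etcd3=${IP3})" on each node.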

2.2 Deploying the control plane components

# Example kube-apiserver startup flags
--advertise-address=${MASTER_IP}
--etcd-servers=https://${ETCD1}:2379,https://${ETCD2}:2379,https://${ETCD3}:2379
--service-cluster-ip-range=10.96.0.0/12
--enable-admission-plugins=NodeRestriction,MutatingAdmissionWebhook,ValidatingAdmissionWebhook

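IV. Cluster Management Best Practices

1. Node management

- **Dynamic scaling**:
```bash
# Add nodes (Sealos 4.x uses "sealos add"; "sealos join" is the 3.x form)
sealos add --masters new-master-ip --nodes new-node-ip

# Remove a node (drain it first)
kubectl drain node-name --ignore-daemonsets --delete-emptydir-data
kubectl delete node node-name
```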

- **Label management**:
```bash
kubectl label nodes node1 disktype=ssd
kubectl label nodes node2 zone=east
```

2. Storage management

Local Volume configuration example:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

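3. Network optimization

IPVS mode configuration:

# Edit the kube-proxy configuration
kubectl edit configmap kube-proxy -n kube-system
# Change mode: "" to mode: "ipvs" (the ip_vs kernel modules must be loaded on every node)
# Restart kube-proxy so that the change takes effect
kubectl delete pod -n kube-system -l k8s-app=kube-proxy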

V. Building the Operations and Monitoring Stack

1. Log collection

- **EFK stack deployment**:
```bash
# One-click deployment from the Sealos application marketplace
sealos apply -f https://sealyun.oss-cn-beijing.aliyuncs.com/sealos-apps/efk.yaml
```

- **File log collection configuration example**:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
data:
  fluent.conf: |
    <source>
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/es-containers.log.pos
      tag kubernetes.*
      format json
      time_key time
      time_format %Y-%m-%dT%H:%M:%S.%NZ
    </source>
    <match kubernetes.**>
      @type elasticsearch
      host elasticsearch
      port 9200
      logstash_format true
    </match>
```

2. Performance monitoring metrics

- **Prometheus configuration highlights**:
```yaml
# scrape_configs example
- job_name: 'kubernetes-nodes'
  static_configs:
    - targets: ['192.168.1.10:9100', '192.168.1.11:9100']
  metrics_path: /metrics
  relabel_configs:
    - source_labels: [__address__]
      target_label: instance
```

VI. Security Hardening

1. Authentication and authorization

- **RBAC policy example**:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User
  name: alice
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

2. Network policy enforcement

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: api
    ports:
    - protocol: TCP
      port: 80

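VII. Troubleshooting Guide

1. Common problems

Node in NotReady state:

# Check the kubelet logs
journalctl -u kubelet -n 100 --no-pager
# Common causes:
# 1. Expired certificates: regenerate and redistribute them
# 2. Network connectivity: check the firewall rules
# 3. Resource exhaustion: look for OOM records in /var/log/messages (/var/log/syslog on Debian/Ubuntu)

API Server not responding:

# Check etcd health
ETCDCTL_API=3 etcdctl --endpoints=${ETCD_ENDPOINTS} \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health
# Check the API Server logs (only works while the API Server still answers requests)
kubectl logs --namespace=kube-system kube-apiserver-${NODE_NAME} | grep -i error
# If it does not answer at all, read the container logs directly on the control-plane node
crictl logs $(crictl ps -a --name kube-apiserver -q | head -n 1) 2>&1 | grep -i error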

2. Locating performance bottlenecks

Key metrics to monitor:

| Metric | Alert threshold | How to check |
|---|---|---|
| API Server latency | P99 > 1s | Prometheus: histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket[5m])) by (le)) |
| etcd operation latency | > 50ms | etcdctl endpoint health (reports the round-trip time as "took") |
| Node disk I/O | iowait > 20% | iostat -x 1 |

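VIII. Version Upgrade and Rollback Strategy

1. Canary upgrade procedure

# 1. Add nodes running the new version (Sealos 4.x uses "sealos add"; whether --version
#    is accepted depends on the Sealos release, so confirm with "sealos add --help")
sealos add --masters new-master-ip --nodes new-node-ip --version v1.29.0
# 2. Migrate workloads off the old node
kubectl cordon old-node
kubectl drain old-node --ignore-daemonsets --delete-emptydir-data
# 3. Verify the kubelet version reported by each node
kubectl get nodes -o custom-columns=NAME:.metadata.name,VERSION:.status.nodeInfo.kubeletVersion
# 4. Remove the old node
kubectl delete node old-node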
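
Kubernetes upgrades are meant to move one minor version at a time (for example 1.28 to 1.29), so it is worth checking the version gap before starting. A minimal sketch; the function names are hypothetical, and it assumes both versions share the same major number.

```bash
#!/bin/sh
# Extract the minor number from a version such as "v1.28.0" (gives 28).
minor_of() {
    v=${1#v}          # drop the leading "v" if present
    v=${v#*.}         # drop the major part
    echo "${v%%.*}"   # keep what precedes the next dot
}

# Exit 0 if going from version $1 to version $2 moves forward by at most
# one minor version; exit 1 for downgrades and for skipped minor versions.
# Assumption: both versions have the same major number.
can_upgrade_directly() {
    from=$(minor_of "$1"); to=$(minor_of "$2")
    [ "$to" -ge "$from" ] && [ $(( to - from )) -le 1 ]
}
```

A gap of two or more minor versions would then be handled as a sequence of single-step upgrades.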

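2. Rollback procedure

# 1. Restore the etcd data (requires a snapshot taken beforehand).
#    Run on every member with its own --name and peer URL; the URLs use https,
#    matching the scheme the cluster was created with.
ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
  --name=etcd1 \
  --initial-cluster="etcd1=https://${IP1}:2380,etcd2=https://${IP2}:2380,etcd3=https://${IP3}:2380" \
  --initial-cluster-token=etcd-cluster-1 \
  --initial-advertise-peer-urls=https://${IP1}:2380 \
  --data-dir=/var/lib/etcd-backup
#    Then start etcd with its data directory pointing at the restored /var/lib/etcd-backup.
# 2. Restart the control plane components (systemd-managed installs; on static-Pod control planes
#    move the manifests out of /etc/kubernetes/manifests and back in instead)
systemctl restart kube-apiserver kube-controller-manager kube-scheduler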
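
The restore above presupposes that a recent snapshot exists. The following sketch takes a snapshot and keeps only the newest few; the function names, the /var/backups/etcd default, and the retention of 7 files are assumptions for illustration, while the certificate paths mirror the health-check example in the troubleshooting section.

```bash
#!/bin/sh
# Delete all but the newest N snapshot files in a backup directory.
# File names are assumed to look like etcd-YYYYmmdd-HHMMSS.db, so sorting
# by name is the same as sorting by age.
prune_snapshots() {
    dir=$1; keep=$2
    ls "$dir"/etcd-*.db 2>/dev/null | sort -r | tail -n +"$(( keep + 1 ))" |
    while IFS= read -r old; do
        rm -f -- "$old"
    done
}

# Take a snapshot, then prune old ones. ETCD_ENDPOINTS must name a single
# endpoint here, because a snapshot is always taken from one member.
backup_etcd() {
    dir=${1:-/var/backups/etcd}   # hypothetical default location
    mkdir -p "$dir"
    ETCDCTL_API=3 etcdctl --endpoints="${ETCD_ENDPOINTS}" \
        --cacert=/etc/kubernetes/pki/etcd/ca.crt \
        --cert=/etc/kubernetes/pki/etcd/server.crt \
        --key=/etc/kubernetes/pki/etcd/server.key \
        snapshot save "$dir/etcd-$(date +%Y%m%d-%H%M%S).db" &&
    prune_snapshots "$dir" 7
}
```

Calling backup_etcd from a daily cron job would give roughly a week of restore points; copying the files off the node is still advisable, since a snapshot stored only on the failed machine is of little use.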

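IX. Summary and Recommendations

With its highly integrated installation and flexible extensibility, Sealos private deployment significantly lowers the barrier to adopting Kubernetes. Enterprise users are advised to:

- Establish a solid backup mechanism (daily etcd snapshots)
- Manage configuration as Infrastructure as Code (IaC)
- Run chaos engineering drills regularly (for example node-failure tests)
- Follow the Sealos community (track GitHub issues)

For very large clusters (more than 1,000 nodes), a multi-region deployment architecture is recommended, combined with a Service Mesh for cross-region service governance. In practice, about 78% of users choose a hybrid deployment model (a mix of virtual machines and physical machines), an architecture that performs well in resource utilization and fault isolation.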
