EFK Log Collection System Deployment Guide
Introduction to the EFK Log Collection System
EFK stands for Elasticsearch + Fluentd + Kibana: Fluentd collects logs on each node and ships them to Elasticsearch, and Kibana provides the front-end display.
- Elasticsearch is a distributed search and analytics engine that supports full-text search, structured search, and analytics, and can combine all three. Built on Lucene, it is now one of the most widely used open-source search engines; Wikipedia, Stack Overflow, GitHub, and others build their search features on Elasticsearch.
- Fluentd is a free, open-source log collector that currently supports gathering logs from more than 125 types of systems. Combined with other data-processing platforms, Fluentd can be used to build big-data collection and processing pipelines and commercial solutions.
- Kibana is an open-source analytics and visualization platform designed to work with Elasticsearch. You can use Kibana to search, view, and interact with the data stored in Elasticsearch indices, and to present advanced data analysis and visualizations easily through a variety of charts, tables, and maps.
Preparation Before Deployment
To deploy the EFK log collection system on a Kubernetes cluster provided by the CCE service, first complete the following prerequisites:
- You have an initialized Kubernetes cluster on CCE.
- You can access the cluster with kubectl as described in the getting-started documentation (a quick verification sketch follows this list).
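If you want to confirm the second prerequisite before continuing, a minimal check is shown below; the node names and versions used elsewhere in this guide are only examples:
$ kubectl cluster-info     # should print the control-plane endpoint of your cluster
$ kubectl get nodes        # every node should report STATUS Ready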
Create the Elasticsearch and Fluentd Service Accounts
Run the following commands:
$ kubectl create -f es-rbac.yaml
$ kubectl create -f fluentd-es-rbac.yaml
Note: Before applying es-rbac.yaml and fluentd-es-rbac.yaml, check the cluster version number first; different cluster versions use different YAML files.
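For example, you can read the server version with kubectl (illustrative output, trailing fields truncated here); the Server Version line tells you which of the YAML variants below to use:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.6", ...}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.6", ...}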
For clusters running version 1.6, the es-rbac.yaml file is as follows:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  name: elasticsearch
subjects:
- kind: ServiceAccount
  name: elasticsearch
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io
For clusters running version 1.8, the es-rbac.yaml file is as follows:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: elasticsearch
subjects:
- kind: ServiceAccount
  name: elasticsearch
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io
For clusters running version 1.6, the fluentd-es-rbac.yaml file is as follows:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  name: fluentd
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io
For clusters running version 1.8, the fluentd-es-rbac.yaml file is as follows:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io
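After the two files are applied, you can verify that both ServiceAccounts exist in the kube-system namespace; this is a quick check and the AGE values below are illustrative:
$ kubectl get serviceaccounts -n kube-system | grep -E 'elasticsearch|fluentd'
elasticsearch   1         1m
fluentd         1         1m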
Deploy Fluentd
The DaemonSet fluentd-es-v1.22 is only scheduled onto Nodes carrying the label beta.kubernetes.io/fluentd-ds-ready=true, so this label must be set on every Node that should run fluentd:
$ kubectl get nodes
NAME STATUS AGE VERSION
192.168.1.92 Ready 12d v1.8.6
192.168.1.93 Ready 12d v1.8.6
192.168.1.94 Ready 12d v1.8.6
192.168.1.95 Ready 12d v1.8.6
$ kubectl label nodes 192.168.1.92 192.168.1.93 192.168.1.94 192.168.1.95 beta.kubernetes.io/fluentd-ds-ready=true
node "192.168.1.92" labeled
node "192.168.1.93" labeled
node "192.168.1.94" labeled
node "192.168.1.95" labeled
After the nodes are labeled, apply the corresponding YAML file to start fluentd; by default it is deployed in the kube-system namespace.
$ kubectl create -f fluentd-es-ds.yaml
daemonset "fluentd-es-v1.22" created
$ kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE
fluentd-es-v1.22-07kls 1/1 Running 0 10s 172.18.4.187 192.168.1.94
fluentd-es-v1.22-4np74 1/1 Running 0 10s 172.18.2.162 192.168.1.93
fluentd-es-v1.22-tbh5c 1/1 Running 0 10s 172.18.3.201 192.168.1.95
fluentd-es-v1.22-wlgjb 1/1 Running 0 10s 172.18.1.187 192.168.1.92
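You can also check the DaemonSet itself to confirm that one Pod is running per labeled node (illustrative output for the four nodes labeled above):
$ kubectl get daemonset fluentd-es-v1.22 -n kube-system
NAME               DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE-SELECTOR                              AGE
fluentd-es-v1.22   4         4         4         4            4           beta.kubernetes.io/fluentd-ds-ready=true   1m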
The corresponding fluentd-es-ds.yaml file is as follows:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd-es-v1.22
  namespace: kube-system
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    version: v1.22
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-es
        kubernetes.io/cluster-service: "true"
        version: v1.22
      # This annotation ensures that fluentd does not get evicted if the node
      # supports critical pod annotation based priority scheme.
      # Note that this does not guarantee admission on the nodes (#40573).
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccountName: fluentd
      containers:
      - name: fluentd-es
        image: hub.baidubce.com/public/fluentd-elasticsearch:1.22
        command:
        - '/bin/sh'
        - '-c'
        - '/usr/sbin/td-agent 2>&1 >> /var/log/fluentd.log'
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      nodeSelector:
        beta.kubernetes.io/fluentd-ds-ready: "true"
      tolerations:
      - key: "node.alpha.kubernetes.io/ismaster"
        effect: "NoSchedule"
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
After fluentd starts, check /var/log/fluentd.log on each node for errors. If you see errors such as "unreadable", verify that all directories mounted in fluentd-es-ds.yaml are complete: fluentd collects logs from the mounted directories, and if a log file is only a symbolic link, the directory holding the original log file must be mounted as well.
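For example, on a node running a fluentd Pod, a rough troubleshooting sketch (exact paths depend on your Docker and kubelet configuration) is:
$ tail -f /var/log/fluentd.log                    # watch for "unreadable" or connection errors
$ ls -l /var/log/containers/ | head -n 5          # these entries are usually symlinks...
$ ls -l /var/lib/docker/containers/ | head -n 5   # ...whose real targets live here and must also be mounted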
Deploy the Elasticsearch Service
First, create the Service used to access Elasticsearch:
$kubectl create -f es-service.yaml
service "elasticsearch-logging" created
$kubectl get svc -n kube-system
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
elasticsearch-logging 172.16.215.15 <none> 9200/TCP 1m
The corresponding es-service.yaml file is as follows:
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Elasticsearch"
spec:
  ports:
  - port: 9200
    protocol: TCP
    targetPort: db
  selector:
    k8s-app: elasticsearch-logging
Next, start the Elasticsearch workload. You can verify that Elasticsearch started correctly by running curl against CLUSTER-IP:PORT.
$kubectl create -f es-controller.yaml
replicationcontroller "elasticsearch-logging-v1" created
$kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE
elasticsearch-logging-v1-0kll0 1/1 Running 0 43s 172.18.2.164 192.168.1.93
elasticsearch-logging-v1-vh17k 1/1 Running 0 43s 172.18.1.189 192.168.1.92
$curl 172.16.215.15:9200
{
"name" : "elasticsearch-logging-v1-vh17k",
"cluster_name" : "kubernetes-logging",
"cluster_uuid" : "cjvE3LJjTvic8TGCbbKxZg",
"version" : {
"number" : "2.4.1",
"build_hash" : "c67dc32e24162035d18d6fe1e952c4cbcbe79d16",
"build_timestamp" : "2016-09-27T18:57:55Z",
"build_snapshot" : false,
"lucene_version" : "5.5.2"
},
"tagline" : "You Know, for Search"
}
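Beyond the banner above, you can also query the standard cluster health endpoint to confirm that both replicas have joined the cluster (same CLUSTER-IP as above; output abbreviated and illustrative):
$ curl '172.16.215.15:9200/_cluster/health?pretty'
{
  "cluster_name" : "kubernetes-logging",
  "status" : "green",
  "number_of_nodes" : 2,
  ...
}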
The corresponding es-controller.yaml file is as follows:
apiVersion: v1
kind: ReplicationController
metadata:
  name: elasticsearch-logging-v1
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    version: v1
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  replicas: 2
  selector:
    k8s-app: elasticsearch-logging
    version: v1
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccountName: elasticsearch
      containers:
      - image: hub.baidubce.com/public/elasticsearch:v2.4.1-1
        name: elasticsearch-logging
        resources:
          # need more cpu upon initialization, therefore burstable class
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - name: es-persistent-storage
          mountPath: /data
        env:
        - name: "NAMESPACE"
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      volumes:
      - name: es-persistent-storage
        emptyDir: {}
Deploy Kibana
$kubectl create -f kibana-service.yaml
service "kibana-logging" created
$kubectl create -f kibana-controller.yaml
deployment "kibana-logging" created
$kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE
kibana-logging-1043852375-wrq6g 1/1 Running 0 48s 172.18.2.175 192.168.1.93
The corresponding kibana-service.yaml file is as follows:
apiVersion: v1
kind: Service
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Kibana"
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: ui
  selector:
    k8s-app: kibana-logging
The corresponding kibana-controller.yaml file is as follows:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: kibana-logging
  template:
    metadata:
      labels:
        k8s-app: kibana-logging
    spec:
      containers:
      - name: kibana-logging
        image: hub.baidubce.com/public/kibana:v4.6.1-1
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
          requests:
            cpu: 100m
        env:
        - name: "ELASTICSEARCH_URL"
          value: "http://elasticsearch-logging:9200"
        - name: "KIBANA_BASE_URL"
          value: ""
        ports:
        - containerPort: 5601
          name: ui
          protocol: TCP
The first time the Kibana Pod starts, it spends a relatively long time (10-20 minutes) optimizing and caching the status page bundles. You can follow the Pod's logs to watch the progress:
$ kubectl logs kibana-logging-1043852375-wrq6g -n kube-system -f
ELASTICSEARCH_URL=http://elasticsearch-logging:9200
server.basePath: /api/v1/proxy/namespaces/kube-system/services/kibana-logging
{"type":"log","@timestamp":"2017-12-04T09:54:41Z","tags":["info","optimize"],"pid":6,"message":"Optimizing and caching bundles for kibana and statusPage. This may take a few minutes"}
{"type":"log","@timestamp":"2017-12-04T10:02:20Z","tags":["info","optimize"],"pid":6,"message":"Optimization of bundles for kibana and statusPage complete in 458.61 seconds"}
{"type":"log","@timestamp":"2017-12-04T10:02:20Z","tags":["status","plugin:kibana@1.0.0","info"],"pid":6,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
Access Kibana
Run the following command:
$kubectl get svc -n kube-system
The output looks like this:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kibana-logging LoadBalancer 172.16.60.222 180.76.112.7 80:32754/TCP 1m
You can access the Kibana service through the LoadBalancer: simply open http://180.76.112.7 in a browser. This IP address is the EXTERNAL-IP of the kibana-logging Service.
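If you prefer a command-line check first, a simple reachability probe against the EXTERNAL-IP (using the example address above) could look like:
$ curl -sI http://180.76.112.7 | head -n 1     # an HTTP 200 or 302 status line means Kibana is reachable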