Monitoring Kubernetes Components with Prometheus Operator

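By default, Prometheus Operator can already monitor the cluster, but it cannot scrape kube-controller-manager or kube-scheduler. This article adds monitoring for these two components and puts Prometheus and Grafana behind Traefik so that they can be reached through Ingress.

For an introduction to the operator, see the earlier article: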

Prometheus Operator


Sorting the manifest files

Here we sort the operator manifests into one directory per component.

wget -P /root/ http://down.i4t.com/abcdocker-prometheus-operator.yaml.zip
cd /root/
unzip abcdocker-prometheus-operator.yaml.zip
mkdir kube-prom
cp -a kube-prometheus-master/manifests/* kube-prom/
cd kube-prom/
mkdir -p node-exporter alertmanager grafana kube-state-metrics prometheus serviceMonitor adapter operator
mv *-serviceMonitor* serviceMonitor/
mv setup operator/
mv grafana-* grafana/
mv kube-state-metrics-* kube-state-metrics/
mv alertmanager-* alertmanager/
mv node-exporter-* node-exporter/
mv prometheus-adapter* adapter/
mv prometheus-* prometheus/
mv 0prometheus-operator-* operator/
mv 00namespace-namespace.yaml operator/


## The apply order changes as well (skip this if the stack is already installed)
[root@k8s-01 kube-prom]# kubectl apply -f operator/
namespace/monitoring created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
service/prometheus-operator created
serviceaccount/prometheus-operator created

Once the operator Pod is running, the remaining manifests can be applied
[root@k8s-01 kube-prom]# kubectl -n monitoring get pod
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-69bd579bf9-7kpd7   1/1     Running   0          7s
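
Instead of polling `kubectl get pod` by hand, the wait can be scripted. The following is a minimal sketch, not part of the original procedure: the Deployment name and namespace are taken from the apply output above, while the function names and the 120-second timeout are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: block until the operator Deployment is rolled out, then apply the rest.
# Deployment name and namespace come from the `kubectl apply -f operator/` output;
# the 120s timeout is an arbitrary assumption.
wait_for_operator() {
  kubectl -n monitoring rollout status deployment/prometheus-operator --timeout=120s
}

# Apply the remaining directories in the same order as the manual steps.
# Stops at the first directory that fails to apply.
apply_rest() {
  for dir in adapter alertmanager node-exporter kube-state-metrics grafana prometheus serviceMonitor; do
    kubectl apply -f "${dir}/" || return 1
  done
}

# Usage (run from the kube-prom directory; not executed here):
#   wait_for_operator && apply_rest
```

Chaining the two functions with `&&` means nothing else is applied if the operator never becomes ready.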

# Remaining steps
kubectl apply -f adapter/
kubectl apply -f alertmanager/
kubectl apply -f node-exporter/
kubectl apply -f kube-state-metrics/
kubectl apply -f grafana/
kubectl apply -f prometheus/
kubectl apply -f serviceMonitor/

After everything has been applied, check that all resources are healthy and the installation is done
[root@k8s-01 kube-prom]# kubectl get -n monitoring all

Configuring Ingress

Traefik has to be installed first. In my experience NodePort access is less efficient, so Traefik is recommended.

Kubernetes Traefik Ingress

Preparing the environment

First, make sure every Service created by Prometheus Operator is of type ClusterIP. If you have not changed them, ClusterIP is already the default.

[root@k8s-01 ingress]# kubectl get pod,svc -n monitoring
NAME                                       READY   STATUS    RESTARTS   AGE
pod/alertmanager-main-0                    2/2     Running   0          88s
pod/alertmanager-main-1                    2/2     Running   0          77s
pod/alertmanager-main-2                    2/2     Running   0          69s
pod/grafana-558647b59-mj85j                1/1     Running   0          96s
pod/kube-state-metrics-5bfc7db74d-kpgh2    4/4     Running   0          96s
pod/node-exporter-5kz8x                    2/2     Running   0          94s
pod/node-exporter-jnmr7                    2/2     Running   0          94s
pod/node-exporter-pztln                    2/2     Running   0          93s
pod/node-exporter-ts455                    2/2     Running   0          94s
pod/prometheus-adapter-57c497c557-6tscz    1/1     Running   0          91s
pod/prometheus-k8s-0                       3/3     Running   1          78s
pod/prometheus-k8s-1                       3/3     Running   1          78s
pod/prometheus-operator-69bd579bf9-rrf96   1/1     Running   1          98s

NAME                            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/alertmanager-main       ClusterIP   10.254.201.109   <none>        9093/TCP            99s
service/alertmanager-operated   ClusterIP   None             <none>        9093/TCP,6783/TCP   89s
service/grafana                 ClusterIP   10.254.19.174    <none>        3000/TCP            97s
service/kube-state-metrics      ClusterIP   None             <none>        8443/TCP,9443/TCP   96s
service/node-exporter           ClusterIP   None             <none>        9100/TCP            95s
service/prometheus-adapter      ClusterIP   10.254.197.151   <none>        443/TCP             93s
service/prometheus-k8s          ClusterIP   10.254.120.188   <none>        9090/TCP            89s
service/prometheus-operated     ClusterIP   None             <none>        9090/TCP            78s
service/prometheus-operator     ClusterIP   None             <none>        8080/TCP            99s

Next, create Ingresses for the Prometheus UI, Grafana, and Alertmanager.

(They can also be split into separate files instead of being kept in one.)

vim ingress.yaml

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-ing
  namespace: monitoring
spec:
  rules:
  - host: prometheus.i4t.com
    http:
      paths:
      - backend:
          serviceName: prometheus-k8s
          servicePort: 9090
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: grafana-ing
  namespace: monitoring
spec:
  rules:
  - host: grafana.i4t.com
    http:
      paths:
      - backend:
          serviceName: grafana
          servicePort: 3000
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: alertmanager-ing
  namespace: monitoring
spec:
  rules:
  - host: alertmanager.i4t.com
    http:
      paths:
      - backend:
          serviceName: alertmanager-main
          servicePort: 9093


## host is the domain name; serviceName and servicePort are the name and port of the target Service
[root@k8s-01 ingress]# kubectl apply -f ingress.yaml
ingress.extensions/prometheus-operator created


[root@k8s-01 ingress]# kubectl get ingress -n monitoring
NAME               HOSTS                  ADDRESS   PORTS   AGE
alertmanager-ing   alertmanager.i4t.com             80      13s
grafana-ing        grafana.i4t.com                  80      13s
prometheus-ing     prometheus.i4t.com               80      13s
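
The manifests above use `extensions/v1beta1`, which matches the cluster version used in this article but was removed in Kubernetes 1.22. For newer clusters, the same three rules can be expressed with `networking.k8s.io/v1`. The sketch below generates them from a small shell function; the function names are made up for this example, `pathType: Prefix` on `/` is an assumption about the intended matching, and depending on the controller an `ingressClassName` may also be needed.

```shell
#!/usr/bin/env bash
# gen_ingress NAME HOST SERVICE PORT
# Prints a networking.k8s.io/v1 Ingress in the monitoring namespace.
# Assumption: a Prefix match on "/" reflects the intent of the original rules.
gen_ingress() {
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: $1
  namespace: monitoring
spec:
  rules:
  - host: $2
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: $3
            port:
              number: $4
EOF
}

# Emit all three Ingresses as one multi-document YAML stream.
gen_all_ingresses() {
  gen_ingress prometheus-ing prometheus.i4t.com prometheus-k8s 9090
  echo '---'
  gen_ingress grafana-ing grafana.i4t.com grafana 3000
  echo '---'
  gen_ingress alertmanager-ing alertmanager.i4t.com alertmanager-main 9093
}
```

The output can be reviewed first and then piped to the cluster with `gen_all_ingresses | kubectl apply -f -`.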

The new routes can also be checked in the Traefik UI.

Next, set up name resolution (for this demonstration I simply edit the hosts file).

#mac
➜  ~ sudo vim /etc/hosts
Password:

#windows
C:\Windows\System32\drivers\etc
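
The entry to add is one line that maps the three Ingress hostnames to a node where Traefik is reachable. A small sketch that prints such a line follows; the function name is hypothetical and `192.168.0.10` is only an assumed node address, so substitute your own.

```shell
#!/usr/bin/env bash
# hosts_entry IP
# Prints one hosts-file line mapping the three Ingress hostnames to IP.
hosts_entry() {
  printf '%s prometheus.i4t.com grafana.i4t.com alertmanager.i4t.com\n' "$1"
}

# 192.168.0.10 is an assumed Traefik node address; replace it with yours.
hosts_entry 192.168.0.10
```

On macOS or Linux the line can be appended with `hosts_entry 192.168.0.10 | sudo tee -a /etc/hosts`; on Windows, paste it into the `hosts` file in the directory shown above.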


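Monitoring Kubernetes components

Here we can see that Prometheus Operator is not scraping kube-controller-manager or kube-scheduler. My cluster was installed from binaries, so no information is collected for these two components.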

This happens because a ServiceMonitor selects Services by label. As the output below shows, the relevant ServiceMonitors look in the kube-system namespace:

[root@k8s-01 manifests]#  grep -2 selector prometheus-serviceMonitorKube*
prometheus-serviceMonitorKubeControllerManager.yaml-    matchNames:
prometheus-serviceMonitorKubeControllerManager.yaml-    - kube-system
prometheus-serviceMonitorKubeControllerManager.yaml:  selector:
prometheus-serviceMonitorKubeControllerManager.yaml-    matchLabels:
prometheus-serviceMonitorKubeControllerManager.yaml-      k8s-app: kube-controller-manager
--
prometheus-serviceMonitorKubelet.yaml-    matchNames:
prometheus-serviceMonitorKubelet.yaml-    - kube-system
prometheus-serviceMonitorKubelet.yaml:  selector:
prometheus-serviceMonitorKubelet.yaml-    matchLabels:
prometheus-serviceMonitorKubelet.yaml-      k8s-app: kubelet
--
prometheus-serviceMonitorKubeScheduler.yaml-    matchNames:
prometheus-serviceMonitorKubeScheduler.yaml-    - kube-system
prometheus-serviceMonitorKubeScheduler.yaml:  selector:
prometheus-serviceMonitorKubeScheduler.yaml-    matchLabels:
prometheus-serviceMonitorKubeScheduler.yaml-      k8s-app: kube-scheduler

By default, however, no Service in kube-system carries a matching label:

[root@k8s-01 manifests]# kubectl get svc -n kube-system
NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                       AGE
kube-dns                  ClusterIP   10.254.0.2       <none>        53/UDP,53/TCP,9153/TCP        31d
kubelet                   ClusterIP   None             <none>        10250/TCP                     2d8h
kubernetes-dashboard      NodePort    10.254.194.101   <none>        80:30000/TCP                  31d
traefik-ingress-service   NodePort    10.254.160.25    <none>        80:23633/TCP,8080:15301/TCP   38m

The Endpoints objects do exist, although they list no addresses (this is what my binary installation looks like):

[root@k8s-01 manifests]# kubectl get ep -n kube-system
NAME                      ENDPOINTS                                                              AGE
kube-controller-manager   <none>                                                                 31d
kube-dns                  172.30.248.2:53,172.30.72.4:53,172.30.248.2:53 + 3 more...             31d
kube-scheduler            <none>                                                                 31d
kubelet                   192.168.0.10:10255,192.168.0.11:10255,192.168.0.12:10255 + 9 more...   2d8h
kubernetes-dashboard      172.30.232.2:8443                                                      31d
traefik-ingress-service   172.30.232.5:80,172.30.232.5:8080                                      39m

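Solution

Create one Service for each of the two control-plane components and label it `k8s-app: kube-controller-manager` or `k8s-app: kube-scheduler` respectively, so that the corresponding ServiceMonitor selects it.

Fix for binary installations

Kubernetes 1.14 binary cluster installation

Create the Services that the Endpoints will bind to: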

apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    k8s-app: kube-controller-manager
spec:
  selector:
    component: kube-controller-manager
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10252
    targetPort: 10252
    protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    k8s-app: kube-scheduler
spec:
  selector:
    component: kube-scheduler
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251
    protocol: TCP

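Then define the Endpoints for each Service by hand. The Endpoints name and namespace must match the Service, and the port name must match as well. Note that both Services above still declare a selector; if the manually created addresses are later cleared, removing the selector should stop the endpoints controller from managing the object.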

apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: kube-system
subsets:
- addresses:
  - ip: 192.168.0.10
  - ip: 192.168.0.11
  - ip: 192.168.0.12
  ports:
  - name: http-metrics
    port: 10252
    protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: kube-system
subsets:
- addresses:
  - ip: 192.168.0.10
  - ip: 192.168.0.11
  - ip: 192.168.0.12
  ports:
  - name: http-metrics
    port: 10251
    protocol: TCP
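
Typing the same address list twice is error-prone, so the Endpoints manifests can also be generated. The sketch below is one way to do that under stated assumptions: `gen_endpoints` is a helper invented for this example, and it only reproduces the structure of the manifests shown above (namespace `kube-system`, a single `http-metrics` port).

```shell
#!/usr/bin/env bash
# gen_endpoints NAME PORT IP...
# Prints an Endpoints manifest in kube-system, labelled k8s-app=NAME,
# with one address per IP and a single http-metrics port.
gen_endpoints() {
  name="$1"; port="$2"; shift 2
  cat <<EOF
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: ${name}
  name: ${name}
  namespace: kube-system
subsets:
- addresses:
EOF
  for ip in "$@"; do
    printf '  - ip: %s\n' "$ip"
  done
  cat <<EOF
  ports:
  - name: http-metrics
    port: ${port}
    protocol: TCP
EOF
}

# The three master IPs below are the ones used in this article.
gen_endpoints kube-scheduler 10251 192.168.0.10 192.168.0.11 192.168.0.12
```

Calling it a second time with `kube-controller-manager 10252` and the same IPs yields the other manifest; either output can be piped to `kubectl apply -f -`.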

Checking the Services shows that they are now bound to our Endpoints:

[root@k8s-01 test]# kubectl get svc -n kube-system
NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                       AGE
kube-controller-manager   ClusterIP   None             <none>        10252/TCP                     64s
kube-dns                  ClusterIP   10.254.0.2       <none>        53/UDP,53/TCP,9153/TCP        31d
kube-scheduler            ClusterIP   None             <none>        10251/TCP                     64s
kubelet                   ClusterIP   None             <none>        10250/TCP                     2d9h
kubernetes-dashboard      NodePort    10.254.194.101   <none>        80:30000/TCP                  31d
traefik-ingress-service   NodePort    10.254.160.25    <none>        80:23633/TCP,8080:15301/TCP   126m
[root@k8s-01 test]# kubectl describe svc -n kube-system kube-scheduler
Name:              kube-scheduler
Namespace:         kube-system
Labels:            k8s-app=kube-scheduler
Annotations:       kubectl.kubernetes.io/last-applied-configuration:
                     {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"k8s-app":"kube-scheduler"},"name":"kube-scheduler","namespace"...
Selector:          component=kube-scheduler
Type:              ClusterIP
IP:                None
Port:              http-metrics  10251/TCP
TargetPort:        10251/TCP
Endpoints:         192.168.0.10:10251,192.168.0.11:10251,192.168.0.12:10251
Session Affinity:  None
Events:            <none>

I have only 3 masters, so there are only 3 kube-scheduler targets and 3 kube-controller-manager targets.

For kubeadm, the following can serve as a reference. I do not have such an environment, so it is not demonstrated here.

apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kubelet
  name: kubelet
  namespace: kube-system
subsets:
- addresses:
  - ip: 172.16.0.14
    targetRef:
      kind: Node
      name: k8s-n2
  - ip: 172.16.0.18
    targetRef:
      kind: Node
      name: k8s-n3
  - ip: 172.16.0.2
    targetRef:
      kind: Node
      name: k8s-m1
  - ip: 172.16.0.20
    targetRef:
      kind: Node
      name: k8s-n4
  - ip: 172.16.0.21
    targetRef:
      kind: Node
      name: k8s-n5
  ports:
  - name: http-metrics
    port: 10255
    protocol: TCP
  - name: cadvisor
    port: 4194
    protocol: TCP
  - name: https-metrics
    port: 10250
    protocol: TCP

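If a target reports `ip:10251 Connection refused` after monitoring has been added:

  • Binary installation

The scheduler's startup configuration has to be changed.

Add the following flag to the startup file:
--bind-address=0.0.0.0
  • kubeadm installation

The flag has to be added by editing the Pod definition. I am not familiar with kubeadm, so I will not go into detail here.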
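
For the binary case, a quick check can tell whether the flag is already present before anything is restarted. This is a sketch only: the helper name, the unit file path, and the `kube-scheduler` service name are assumptions for illustration and depend on how your cluster was installed.

```shell
#!/usr/bin/env bash
# has_open_bind FILE
# Succeeds if FILE contains the literal flag --bind-address=0.0.0.0.
has_open_bind() {
  grep -qF -- '--bind-address=0.0.0.0' "$1"
}

# Hypothetical unit path; adjust it to wherever your scheduler unit lives.
unit=/etc/systemd/system/kube-scheduler.service
if [ -f "$unit" ]; then
  if has_open_bind "$unit"; then
    echo "flag already present in $unit"
  else
    echo "add --bind-address=0.0.0.0 to $unit, then run:"
    echo "  systemctl daemon-reload && systemctl restart kube-scheduler"
  fi
fi
```

The script only reports what it finds; editing the unit and restarting the service is left as a manual step.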
