Now we are going to deploy a single instance of Prometheus. Normally you would/should deploy multiple instances spread through the cluster, for example one instance dedicated to monitoring just the Kubernetes API, the next dedicated to monitoring nodes, and so on... As with many things in the Kubernetes world, there is no specification for how things should look 🙂 So to save resources we will deploy just one.
We should be back in our `monitoring` folder. Create a new folder called `prometheus`, and in it create the following files:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus-persistant
  namespace: monitoring
spec:
  replicas: 1
  retention: 7d
  resources:
    requests:
      memory: 400Mi
  nodeSelector:
    node-type: worker
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchExpressions:
      - key: name
        operator: In
        values:
          - longhorn-prometheus-servicemonitor
          - kube-state-metrics
          - node-exporter
          - kubelet
          - traefik
  storage:
    volumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: longhorn
        resources:
          requests:
            storage: 20Gi
```
`retention`: how long to keep the data.
`nodeSelector:` -> `node-type: worker`: If you followed my setup, I set some tags on the worker nodes and the control plane. Here I just say to prefer nodes with the tag `worker`. I use this only because I didn't want to tax the control nodes any further. If it does not matter to you where Prometheus is running, remove these two lines.
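If your nodes aren't tagged yet, a quick sketch of how that could look (the node name `worker01` is a placeholder, use your own):

```bash
# Label a worker node so the nodeSelector above can match it:
kubectl label nodes worker01 node-type=worker

# Verify the labels:
kubectl get nodes --show-labels
```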
```yaml
serviceMonitorSelector:
  matchExpressions:
    - key: name
      operator: In
      values:
        - longhorn-prometheus-servicemonitor
        - kube-state-metrics
        - node-exporter
        - kubelet
        - traefik
```
This is where the Service Monitors we created are referenced; this part tells Prometheus to go and take data from them. If you add or remove Service Monitors, edit this part.
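For reference, a minimal sketch of what a matching ServiceMonitor needs to look like; the important bit is the `name` label in metadata, because that is what the selector above matches on (the app name, selector, and port below are made-up placeholders, use the real ones from the earlier sections):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app-servicemonitor
  namespace: monitoring
  labels:
    # This label is what serviceMonitorSelector matches on:
    name: my-app-servicemonitor
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
    - port: metrics
```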
`storage`: We just tell it which provisioner to use and how many GB to provision; our Prometheus Operator will take care of mounting and assigning the storage for persistent data. Make sure longhorn is the default. (I mentioned how to make it the default storage provider in the Longhorn section.)
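You can double-check with something like this (the patch command uses the standard default-class annotation):

```bash
# "longhorn" should show "(default)" next to its name:
kubectl get storageclass

# If it doesn't, mark it as the default StorageClass:
kubectl patch storageclass longhorn -p \
  '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'
```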
I will keep data for 7 days (I didn't know how much data it would generate at the time), but logging for a full 7 days produced 6.62 GB in my case... so 20 GB is a safe bet. So we create a persistent volume claim from our Longhorn. (Man, I love Longhorn, so easy to work with.)
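Once everything is applied (we do that further below), you can peek at the claim the operator creates; the exact PVC name comes from the operator, so don't worry if yours differs slightly:

```bash
# The volumeClaimTemplate above should result in a 20Gi Longhorn PVC:
kubectl get pvc -n monitoring
```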
```yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus-external
  namespace: monitoring
spec:
  selector:
    prometheus: prometheus-persistant
  type: LoadBalancer
  ports:
    - name: web
      protocol: TCP
      port: 9090
      targetPort: web
  loadBalancerIP: 192.168.0.235
```
If you followed my guides, you know I like to keep most of the services on their own IP, so above I told MetalLB to give the Prometheus instance the IP 192.168.0.235.
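You can verify MetalLB actually handed out that address by looking at the EXTERNAL-IP column:

```bash
kubectl get svc prometheus-external -n monitoring
```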
```yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitoring
spec:
  ports:
    - name: web
      port: 9090
      targetPort: web
  selector:
    prometheus: prometheus-persistant
  sessionAffinity: ClientIP
```
This makes Prometheus available also locally in the cluster, under the name `prometheus` and a port called `web`.
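If you want to verify the in-cluster service, one way is a throwaway curl pod (the pod name is arbitrary, `curlimages/curl` is just a small image that ships curl, and `/-/healthy` is Prometheus' built-in health endpoint):

```bash
kubectl run -n monitoring tmp-curl --rm -it --image=curlimages/curl --restart=Never -- \
  curl -s http://prometheus.monitoring.svc.cluster.local:9090/-/healthy
```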
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitoring
```
Just a service account for Prometheus; this provides our pod with an "identity". Below we add permissions to this account so it can look into other namespaces and collect data, etc.
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus
    namespace: monitoring
```
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - services
      - endpoints
      - pods
      - nodes/metrics
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources:
      - configmaps
    verbs: ["get"]
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
```

Note that a ClusterRole is cluster-scoped, so it takes no namespace in its metadata.
RBAC gives permissions to access various resources in the cluster; more here: RBAC
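A quick way to sanity-check the permissions, without waiting for scrape errors, is `kubectl auth can-i` impersonating the service account:

```bash
# Both should print "yes" once the ClusterRoleBinding is applied:
kubectl auth can-i list nodes --as=system:serviceaccount:monitoring:prometheus
kubectl auth can-i get configmaps --as=system:serviceaccount:monitoring:prometheus -n monitoring
```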
Jump out of the `prometheus` folder and apply it to the cluster:
```bash
cd ..
kubectl apply -f prometheus/
```
Check if Prometheus deployed. I called my deployment `prometheus-persistant`, but didn't know that the Prometheus Operator adds `prometheus-` in front and `-(number)` at the end.
```
root@control01:/home/ubuntu/monitoring# kubectl get pods -n monitoring
NAME                                 READY   STATUS    RESTARTS   AGE
.
.
.
prometheus-prometheus-persistant-0   2/2     Running   1          13d
```
You should now be able to connect via browser to the IP of Prometheus, in my case 192.168.0.235:9090, and see something like this:
Also look into Status -> Targets; all targets should be UP. It can take a few minutes before they are scraped, but certainly no longer than 5 minutes... I had wrongly configured permissions and got a 403 error for the kubelet; when I fixed it, it got back to UP within a minute.
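The same Status -> Targets information is also exposed over the HTTP API, which is handy when debugging from a terminal (the IP is mine from above; `jq` is optional but makes the output readable):

```bash
curl -s http://192.168.0.235:9090/api/v1/targets | \
  jq '.data.activeTargets[] | {job: .labels.job, health: .health, lastError: .lastError}'
```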
For our graphing purposes we will use Grafana, but you can also get graphs in Prometheus.
The easiest way is to switch to the Classic UI; under the top input box, right next to Execute, there is a drop-down menu where you can choose any of the metrics. Then hit Execute and you should have some data.
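If you want a query to try, here is one example, assuming the node-exporter metrics are coming in: per-node CPU usage in percent, averaged over the last 5 minutes:

```
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
```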
We are almost there! The last step is Grafana for some kick-ass dashboards: Grafana
Did it work for you? Take a break, drink some good beverage of your choice, and if you think I was helpful, maybe get one drink for me as well 🙂