Using InfluxDB for Horizontal Pod Autoscaling with Custom Metrics - Kubernetes

I have the TICK stack deployed in my Kubernetes cluster for monitoring purposes. My application pushes its custom data to it.
I have tried horizontal pod autoscaling using custom metrics with the help of the Prometheus adapter. I was curious whether there is such an adapter for InfluxDB as well?
The popular Kubernetes custom metrics adapters do not include one for InfluxDB. Is there a way I can use my current infrastructure (which contains InfluxDB) to autoscale pods using custom metrics from my application?

Why not use custom metrics from Prometheus with influxdb-exporter? I don't see why it should not work.
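For what it's worth, a minimal sketch of the Prometheus side of that idea; the job name, service address and port are assumptions about how such an exporter might be deployed, not something from the question:
scrape_configs:
- job_name: influxdb-exporter            # hypothetical job name
  static_configs:
  - targets:
    # assumed in-cluster address and port of the exporter; adjust to your setup
    - influxdb-exporter.monitoring.svc:9122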

It is possible to use InfluxDB with Heapster; below are some files I set up to make this easy.
First, apply influxdb.yaml.
Second, apply heapster-rbac.yaml.
Third, apply heapster.yaml.
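A minimal sketch of that order with plain kubectl, assuming the three files sit in the current directory:
kubectl create -f influxdb.yaml
kubectl create -f heapster-rbac.yaml
kubectl create -f heapster.yaml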
**influxdb.yaml**
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: monitoring-influxdb
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: influxdb
    spec:
      containers:
      - name: influxdb
        image: k8s.gcr.io/heapster-influxdb-amd64:v1.5.2
        volumeMounts:
        - mountPath: /data
          name: influxdb-storage
      volumes:
      - name: influxdb-storage
        emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-influxdb
  name: monitoring-influxdb
  namespace: kube-system
spec:
  ports:
  - port: 8086
    targetPort: 8086
  selector:
    k8s-app: influxdb
**heapster-rbac.yaml**
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: heapster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: heapster
  namespace: kube-system
**heapster.yaml**
apiVersion: v1
kind: ServiceAccount
metadata:
  name: heapster
  namespace: kube-system
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: heapster
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: heapster
    spec:
      serviceAccountName: heapster
      containers:
      - name: heapster
        image: k8s.gcr.io/heapster-amd64:v1.5.4
        imagePullPolicy: IfNotPresent
        command:
        - /heapster
        - --source=kubernetes:https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true
        - --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
---
apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: Heapster
  name: heapster
  namespace: kube-system
spec:
  ports:
  - port: 80
    targetPort: 8082
  selector:
    k8s-app: heapster

InfluxDB can be used with Heapster (as pointed out by @LucasSales), but Heapster is deprecated in current versions of Kubernetes.
For the latest versions of Kubernetes there is the metrics server for basic CPU/memory metrics, and Prometheus is the accepted third-party monitoring tool, especially for things like custom metrics.
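For completeness, a minimal sketch of an HPA that consumes a custom pod metric exposed through the Prometheus adapter; the metric name http_requests_per_second and the Deployment name my-app are illustrative assumptions:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "100"  # scale out when the per-pod average exceeds this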


Kubernetes metrics server API

I am running a Kubernetes cluster in a dev environment. I applied the deployment files for the metrics server; the pod is up and running without any error message. See the output here:
root@master:~/pre-release# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
metrics-server-568697d856-9jshp 1/1 Running 0 10m 10.244.1.5 worker-1 <none> <none>
Next, when I check the API service status, it shows up as below:
Name:         v1beta1.metrics.k8s.io
Namespace:
Labels:       k8s-app=metrics-server
Annotations:  <none>
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2021-03-29T17:32:16Z
  Resource Version:    39213
  UID:                 201f685d-9ef5-4f0a-9749-8004d4d529f4
Spec:
  Group:                     metrics.k8s.io
  Group Priority Minimum:    100
  Insecure Skip TLS Verify:  true
  Service:
    Name:       metrics-server
    Namespace:  pre-release
    Port:       443
  Version:           v1beta1
  Version Priority:  100
Status:
  Conditions:
    Last Transition Time:  2021-03-29T17:32:16Z
    Message:               failing or missing response from https://10.105.171.253:443/apis/metrics.k8s.io/v1beta1: Get "https://10.105.171.253:443/apis/metrics.k8s.io/v1beta1": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:  <none>
Here is the metrics-server deployment code:
containers:
- args:
  - --cert-dir=/tmp
  - --secure-port=443
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
  - --kubelet-use-node-status-port
  image: k8s.gcr.io/metrics-server/metrics-server:v0.4.2
Here is the complete code:
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: pre-release
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - nodes/stats
  - namespaces
  - configmaps
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: pre-release
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: pre-release
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: pre-release
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: pre-release
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: pre-release
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: pre-release
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=443
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-insecure-tls
        - --kubelet-use-node-status-port
        image: k8s.gcr.io/metrics-server/metrics-server:v0.4.2
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          periodSeconds: 10
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: pre-release
  version: v1beta1
  versionPriority: 100
The latest error:
I0330 09:02:31.705767 1 secure_serving.go:116] Serving securely on [::]:4443
E0330 09:04:01.718135 1 manager.go:111] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:worker-2: unable to fetch metrics from Kubelet worker-2 (worker-2): Get https://worker-2:10250/stats/summary?only_cpu_and_memory=true: dial tcp: lookup worker-2 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:master: unable to fetch metrics from Kubelet master (master): Get https://master:10250/stats/summary?only_cpu_and_memory=true: dial tcp: lookup master on 10.96.0.10:53: read udp 10.244.2.23:41419->10.96.0.10:53: i/o timeout, unable to fully scrape metrics from source kubelet_summary:worker-1: unable to fetch metrics from Kubelet worker-1 (worker-1): Get https://worker-1:10250/stats/summary?only_cpu_and_memory=true: dial tcp: i/o timeout]
Could someone please help me fix this issue?
The following container arguments work for me in our development cluster:
containers:
- args:
  - /metrics-server
  - --cert-dir=/tmp
  - --secure-port=443
  - --kubelet-preferred-address-types=InternalIP
  - --kubelet-insecure-tls
Result of kubectl describe apiservice v1beta1.metrics.k8s.io:
Status:
  Conditions:
    Last Transition Time:  2021-03-29T19:19:20Z
    Message:               all checks passed
    Reason:                Passed
    Status:                True
    Type:                  Available
Give it a try.
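If useful, a quick sanity check once the APIService reports Available is simply (standard kubectl, nothing cluster-specific assumed):
kubectl top nodes
kubectl top pods -n pre-release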
As I mentioned in the comment section, this may be fixed by adding hostNetwork:true to the metrics-server Deployment.
According to kubernetes documentation:
HostNetwork - Controls whether the pod may use the node network namespace. Doing so gives the pod access to the loopback device, services listening on localhost, and could be used to snoop on network activity of other pods on the same node.
spec:
  hostNetwork: true    <---
  containers:
  - args:
    - /metrics-server
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-preferred-address-types=InternalIP
    - --kubelet-use-node-status-port
    - --kubelet-insecure-tls
There is an example with information on how you can edit your deployment to include hostNetwork: true in your metrics-server deployment.
Also see the related GitHub issue.
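As a sketch, assuming the Deployment sits in the pre-release namespace as in the question, hostNetwork can be switched on with a strategic-merge patch:
kubectl patch deployment metrics-server -n pre-release \
  -p '{"spec":{"template":{"spec":{"hostNetwork":true}}}}'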

Kubernetes API to create a CRD using Minikube, with deployment pod in pending state

I have a problem with the Kubernetes API and a CRD. While creating a deployment with a single nginx pod, I would like to access it on port 80 from a remote server as well as locally. The pod shows up in a Pending state in kubectl get pods, then after around 40 seconds on average it disappears and a pod with a different nginx name starts up; this seems to happen in a loop.
The error is
* W1214 23:27:19.542477 1 requestheader_controller.go:193] Unable to get configmap/extension-apiserver-authentication in kube-system. Usually fixed by 'kubectl create rolebinding -n kube-system ROLEBINDING_NAME --role=extension-apiserver-authentication-reader --serviceaccount=YOUR_NS:YOUR_SA'
I was following this article about service accounts and roles,
https://thorsten-hans.com/custom-resource-definitions-with-rbac-for-serviceaccounts#create-the-clusterrolebinding
I am not even sure I have created this correctly.
Do I even need to create the ServiceAccount_v1.yaml, PolicyRule_v1.yaml and ClusterRoleBinding.yaml files to resolve my error above?
All of my .yaml files for this are below.
CustomResourceDefinition_v1.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # name must match the spec fields below, and be in the form: <plural>.<group>
  name: webservers.stable.example.com
spec:
  # group name to use for REST API: /apis/<group>/<version>
  group: stable.example.com
  names:
    # kind is normally the CamelCased singular type. Your resource manifests use this.
    kind: WebServer
    # plural name to be used in the URL: /apis/<group>/<version>/<plural>
    plural: webservers
    # shortNames allow shorter string to match your resource on the CLI
    shortNames:
    - ws
    # singular name to be used as an alias on the CLI and for display
    singular: webserver
  # either Namespaced or Cluster
  scope: Cluster
  # list of versions supported by this CustomResourceDefinition
  versions:
  - name: v1
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              cronSpec:
                type: string
              image:
                type: string
              replicas:
                type: integer
    # Each version can be enabled/disabled by Served flag.
    served: true
    # One and only one version must be marked as the storage version.
    storage: true
Deployments_v1_apps.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  # Unique key of the Deployment instance
  name: nginx-deployment
spec:
  # 1 Pod should exist at all times.
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  strategy:
    rollingUpdate:
      maxSurge: 100
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        # Apply this label to pods and default
        # the Deployment label selector to this value
        app: nginx
    spec:
      containers:
      # Run this image
      - image: nginx:1.14
        name: nginx
        ports:
        - containerPort: 80
      hostname: nginx
      nodeName: webserver01
      securityContext:
        runAsNonRoot: True
#status:
#  availableReplicas: 1
Ingress_v1_networking.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-ingress
spec:
  rules:
  - http:
      paths:
      - path: /
        pathType: Exact
        backend:
          resource:
            kind: nginx-service
            name: nginx-deployment
          #service:
          #  name: nginx
          #  port: 80
          #serviceName: nginx
          #servicePort: 80
Service_v1_core.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
ServiceAccount_v1.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: user
  namespace: example
PolicyRule_v1.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: "example.com:webservers:reader"
rules:
- apiGroups: ["example.com"]
  resources: ["ResourceAll"]
  verbs: ["VerbAll"]
ClusterRoleBinding_v1.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: "example.com:webservers:cdreader-read"
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: "example.com:webservers:reader"
subjects:
- kind: ServiceAccount
  name: user
  namespace: example
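For what it's worth, a hedged instantiation of the rolebinding that the error message above suggests, assuming the ServiceAccount user in the example namespace from ServiceAccount_v1.yaml is the account in question (the binding name itself is arbitrary):
kubectl create rolebinding extension-apiserver-authentication-reader-binding \
  -n kube-system \
  --role=extension-apiserver-authentication-reader \
  --serviceaccount=example:user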

How to monitor external service in prometheus-operator

I am trying to monitor an external service (an exporter of Cassandra metrics) with prometheus-operator. I installed prometheus-operator using Helm 2.11.0, with this YAML:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: tiller
  namespace: kube-system
and these commands on my kubernetes cluster:
kubectl create -f rbac-config.yml
helm init --service-account tiller --history-max 200
helm install stable/prometheus-operator --name prometheus-operator --namespace monitoring
Next, based on this article:
how to monitor an external service
I tried to follow the steps described in it. As suggested, I created an Endpoints, a Service and a ServiceMonitor with the label for the existing Prometheus. Here are my YAML files:
apiVersion: v1
kind: Endpoints
metadata:
  name: cassandra-metrics80
  labels:
    app: cassandra-metrics80
subsets:
- addresses:
  - ip: 10.150.1.80
  ports:
  - name: web
    port: 7070
    protocol: TCP

apiVersion: v1
kind: Service
metadata:
  name: cassandra-metrics80
  namespace: monitoring
  labels:
    app: cassandra-metrics80
    release: prometheus-operator
spec:
  externalName: 10.150.1.80
  ports:
  - name: web
    port: 7070
    protocol: TCP
    targetPort: 7070
  type: ExternalName

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cassandra-metrics80
  labels:
    app: cassandra-metrics80
    release: prometheus-operator
spec:
  selector:
    matchLabels:
      app: cassandra-metrics80
      release: prometheus-operator
  namespaceSelector:
    matchNames:
    - monitoring
  endpoints:
  - port: web
    interval: 10s
    honorLabels: true
And on the Prometheus service discovery page I can see that this service is not active and all labels are dropped.
I did numerous things trying to fix this, like setting targetLabels and trying to relabel the targets that are discovered, as described here: prometheus relabeling
But unfortunately nothing works. What could be the issue, or how can I investigate it better?
OK, I found out that the Service should be in the same namespace as the ServiceMonitor and Endpoints; after that, Prometheus started to see some metrics from Cassandra.
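A sketch of what that looks like for the Endpoints and ServiceMonitor above, assuming everything is applied into the monitoring namespace alongside the Service (either via metadata.namespace as below or kubectl -n monitoring):
metadata:
  name: cassandra-metrics80
  namespace: monitoring
  labels:
    app: cassandra-metrics80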
When using the kube-prometheus-stack helm chart, it can be done as follows
prometheus:
  prometheusSpec:
    additionalScrapeConfigs:
    - job_name: external
      metrics_path: /metrics
      static_configs:
      - targets:
        - <IP>:<PORT>
To be strict, only the "Endpoints" and "Service" need to be in the same namespace.
Additionally, "Endpoints" and "Service" should have the same name, as Lucas mentioned before.
The ServiceMonitor can be placed anywhere; it finds and scrapes the Service/Endpoints inside the defined namespaces (namespaceSelector -> matchNames) that match all labels (selector -> matchLabels):
spec:
  selector:
    matchLabels:
      app: cassandra-metrics80
      release: prometheus-operator
  namespaceSelector:
    matchNames:
    - my-namespace
Furthermore, there is now a much easier method to define additional scrape targets:
https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/additional-scrape-config.md
The only drawback of this second approach is that it requires a pod restart after the change, whereas configuration based on Endpoints/Service/ServiceMonitor seems to be discovered and applied automatically.
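A rough sketch of that second method, per the linked doc: the extra scrape config lives in a Secret that the Prometheus custom resource references (the file and secret names below follow the doc's example and are otherwise arbitrary):
# prometheus-additional.yaml
- job_name: external
  static_configs:
  - targets:
    - <IP>:<PORT>

# create the secret and point the Prometheus custom resource at it
kubectl create secret generic additional-scrape-configs \
  --from-file=prometheus-additional.yaml -n monitoring

# in the Prometheus custom resource spec:
additionalScrapeConfigs:
  name: additional-scrape-configs
  key: prometheus-additional.yaml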

How does Kubernetes control replication?

I was curious about how Kubernetes controls replication. My config YAML file specifies that I want three pods, each with an nginx server, for instance (from here -- https://kubernetes.io/docs/concepts/workloads/controllers/replicationcontroller/#how-a-replicationcontroller-works):
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
How does Kubernetes know when to shut down pods and when to spin up more? For example, for high traffic loads I'd like to spin up another pod, but I'm not sure how to configure that in the YAML file, so I was wondering if Kubernetes has some behind-the-scenes magic that does this for you.
Kubernetes does no magic here - from your configuration, it simply does not know about, nor does it change, the number of replicas.
The concept you are looking for is called an autoscaler. It uses metrics from your cluster (which need to be enabled/installed as well) and can then decide whether Pods must be scaled up or down, and will in effect change the number of replicas in the Deployment or ReplicationController. (Please use a Deployment, not a ReplicationController; the latter does not support rolling updates of your applications.)
You can read more about the autoscaler here: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/
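For example, the walkthrough boils down to a single command (assuming the nginx workload above were converted to a Deployment, as recommended):
kubectl autoscale deployment nginx --cpu-percent=50 --min=3 --max=10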
You can use a HorizontalPodAutoscaler along with a Deployment as below. This will autoscale your pods declaratively based on target CPU utilization.
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: $DEPLOY_NAME
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: $DEPLOY_NAME
    spec:
      containers:
      - name: $DEPLOY_NAME
        image: $DEPLOY_IMAGE
        imagePullPolicy: Always
        resources:
          requests:
            cpu: "0.2"
            memory: 256Mi
          limits:
            cpu: "1"
            memory: 1024Mi
---
apiVersion: v1
kind: Service
metadata:
  name: $DEPLOY_NAME
spec:
  selector:
    app: $DEPLOY_NAME
  ports:
  - port: 8080
  type: ClusterIP
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: $DEPLOY_NAME
  namespace: $K8S_NAMESPACE
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: $DEPLOY_NAME
  minReplicas: 2
  maxReplicas: 6
  targetCPUUtilizationPercentage: 60
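Once applied, you can watch the autoscaler's current/target CPU utilization and replica count with:
kubectl get hpa $DEPLOY_NAME -n $K8S_NAMESPACE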

How to configure a non-default serviceAccount on a deployment

My understanding of this doc page is that I can configure service accounts for Pods, and hopefully also Deployments, so I can access the k8s API in Kubernetes 1.6+. In order not to alter or use the default one, I want to create a service account and mount its certificate into the pods of a deployment.
How do I achieve something similar to this example, but for a deployment?
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  serviceAccountName: build-robot
  automountServiceAccountToken: false
As you will need to specify 'podSpec' in Deployment as well, you should be able to configure the service account in the same way. Something like:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-deployment
spec:
  template:
    # Below is the podSpec.
    metadata:
      name: ...
    spec:
      serviceAccountName: build-robot
      automountServiceAccountToken: false
      ...
Kubernetes nginx-deployment.yaml where serviceAccountName: test-sa is used as the non-default service account.
Link: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/
test-sa.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: test-sa
  namespace: test-ns
nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: test-ns
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: nginx
  replicas: 1 # tells deployment to run 1 pods matching the template
  template: # create pods using pod definition in this template
    metadata:
      labels:
        app: nginx
    spec:
      serviceAccountName: test-sa
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
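To verify the pods actually picked up the non-default account, something like this should print test-sa (standard kubectl, nothing assumed beyond the manifests above):
kubectl -n test-ns get pods -l app=nginx \
  -o jsonpath='{.items[0].spec.serviceAccountName}'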