I am trying to apply these Prometheus rules from a YAML file and I am getting the error below. I have searched all over the web, but since the error is quite long and also very specific to Elasticsearch, I have not been able to find any documentation, help, or advice. I would really appreciate any guidance, or someone pointing me in the right direction.
kubectl apply -f prometheus-es-rules.yaml -n centralized-logging
The "prometheusrules" is invalid:
* : 43:11: group "elastic", rule 1, "elasticsearch_filesystem_data_used_percent": could not parse expression: 1:71: parse error: unexpected identifier "elasticsearch" in label matching, expected "," or "}"
* : 43:11: group "elastic", rule 2, "elasticsearch_filesystem_data_free_percent": could not parse expression: 1:72: parse error: unexpected identifier "elasticsearch" in label matching, expected "," or "}"
* : 43:11: group "elastic", rule 3, "ElasticsearchTooFewNodesRunning": could not parse expression: 1:68: parse error: unexpected identifier "elasticsearch" in label matching, expected "," or "}"
* : 43:11: group "elastic", rule 4, "ElasticsearchHeapTooHigh": could not parse expression: 1:59: parse error: unexpected identifier "elasticsearch" in label matching, expected "," or "}"
This is my prometheus-es-rules.yaml file:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    role: alert-rules
    app: elastic
  name: prometheus-rules
spec:
  groups:
  - name: elastic
    rules:
    - record: elasticsearch_filesystem_data_used_percent
      expr: |
        100 * (elasticsearch_filesystem_data_size_bytes{service="{{ template "elasticsearch-exporter.fullname" . }}"} - elasticsearch_filesystem_data_free_bytes{service="{{ template "elasticsearch-exporter.fullname" . }}"})
        / elasticsearch_filesystem_data_size_bytes{service="{{ template "elasticsearch-exporter.fullname" . }}"}
    - record: elasticsearch_filesystem_data_free_percent
      expr: 100 - elasticsearch_filesystem_data_used_percent{service="{{ template "elasticsearch-exporter.fullname" . }}"}
    - alert: ElasticsearchTooFewNodesRunning
      expr: elasticsearch_cluster_health_number_of_nodes{service="{{ template "elasticsearch-exporter.fullname" . }}"} < 3
      for: 5m
      labels:
        severity: critical
      annotations:
        description: There are only {{ "{{ $value }}" }} < 3 ElasticSearch nodes running
        summary: ElasticSearch running on less than 3 nodes
    - alert: ElasticsearchHeapTooHigh
      expr: |
        elasticsearch_jvm_memory_used_bytes{service="{{ template "elasticsearch-exporter.fullname" . }}", area="heap"} / elasticsearch_jvm_memory_max_bytes{service="{{ template "elasticsearch-exporter.fullname" . }}", area="heap"}
        > 0.9
      for: 15m
      labels:
        severity: critical
      annotations:
        description: The heap usage is over 90% for 15m
        summary: ElasticSearch node {{ "{{ $labels.node }}" }} heap usage is high
    - alert: ElasticNoAvailableSpace
      expr: es_fs_path_free_bytes * 100 / es_fs_path_total_bytes < 10
      for: 10m
      labels:
        severity: critical
      annotations:
        summary: Instance {{$labels.instance}}
        description: Elasticsearch reports that there are only {{ $value }}% left on {{ $labels.path }} at {{$labels.instance}}. Please check it
    - alert: NumberOfPendingTasks
      expr: es_cluster_pending_tasks_number > 0
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: Instance {{ $labels.instance }}
        description: Number of pending tasks for 10 min. Cluster works slowly
I am not sure if this is an indentation problem?
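For what it's worth, here is how I read the parse error — a hand-written sketch, assuming the {{ template ... }} expressions reach Prometheus unrendered:
# Matcher as written in the rule:
#   service="{{ template "elasticsearch-exporter.fullname" . }}"
# How the PromQL parser sees it:
#   service="{{ template "            <- the label value string ends at the second quote
#   elasticsearch-exporter.fullname   <- left over as a bare identifier, hence
#   unexpected identifier "elasticsearch" in label matching, expected "," or "}"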
I am using kube-prometheus-stack and the yaml snippets you see below are part of a PrometheusRule definition.
This is a completely hypothetical scenario, the simplest one I could think of that illustrates my point.
Given this kind of metric:
cpu_usage{job="job-1", must_be_lower_than="50"} 33.72
cpu_usage{job="job-2", must_be_lower_than="80"} 56.89
# imagine there are plenty more lines here
# with various different values for the must_be_lower_than label
# ...
I'd like to have alerts that compare each series against its own must_be_lower_than label and fire accordingly. Something like this (this doesn't work the way it's written now; it's just to demonstrate the idea):
alert: CpuUsageTooHigh
annotations:
  message: 'On job {{ $labels.job }}, the cpu usage has been above {{ $labels.must_be_lower_than }}% for 5 minutes.'
expr: cpu_usage > $must_be_lower_than
for: 5m
P.S. I already know I can define alerts like this:
alert: CpuUsageTooHigh50
annotations:
  message: 'On job {{ $labels.job }}, the cpu usage has been above 50% for 5 minutes.'
expr: cpu_usage{must_be_lower_than="50"} > 50
for: 5m
---
alert: CpuUsageTooHigh80
annotations:
  message: 'On job {{ $labels.job }}, the cpu usage has been above 80% for 5 minutes.'
expr: cpu_usage{must_be_lower_than="80"} > 80
for: 5m
This is not what I'm looking for, because I would have to manually define an alert for each of the possible values of the must_be_lower_than label.
There is currently no way in Prometheus to have this kind of "templating".
The only way to get something close would be to use recording rules that define the maximum value for the label:
rules:
  - record: max_cpu_usage
    expr: vector(50)
    labels:
      must_be_lower_than: "50"
  - record: max_cpu_usage
    expr: vector(80)
    labels:
      must_be_lower_than: "80"
  # ... other possible values
Then use it in your alerting rule:
alert: CpuUsageTooHigh
annotations:
  message: 'On job {{ $labels.job }}, the cpu usage has been above {{ $labels.must_be_lower_than }}% for 5 minutes.'
expr: cpu_usage > ON(must_be_lower_than) GROUP_LEFT max_cpu_usage
for: 5m
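To illustrate how the ON(must_be_lower_than) GROUP_LEFT join behaves, here is a sketch using the hypothetical series from above (annotated by hand, not real Prometheus output):
# cpu_usage{job="job-1", must_be_lower_than="50"} 33.72
#   joins max_cpu_usage{must_be_lower_than="50"} 50  -> 33.72 > 50 is false, no alert
# cpu_usage{job="job-2", must_be_lower_than="80"} 56.89
#   joins max_cpu_usage{must_be_lower_than="80"} 80  -> 56.89 > 80 is false, no alert
# A series whose must_be_lower_than value has no matching recording rule silently
# matches nothing, so every possible label value still needs an entry above.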
I'm trying to iterate over a map in a Helm chart to create multiple Kubernetes CronJobs. Since I had trouble generating multiple manifests from a single template, I used '---' to separate the manifests; otherwise it kept generating only one manifest.
{{- range $k, $job := .Values.Jobs }}
{{- if $job.enabled }}
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: {{ $job.name }}
  namespace: {{ $.Release.Namespace }}
spec:
  schedule: {{ $job.schedule }}
  startingDeadlineSeconds: xxx
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: x
  failedJobsHistoryLimit: x
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: {{ $job.name }}
              image: {{ $.Values.cronJobImage }}
              command:
                - /bin/sh
                - -c
                - curl {{ $.Values.schedulerBaseUrl }}/{{ $job.url }}
          restartPolicy: Never
---
{{- end }}
{{ end }}
values.yaml
Jobs:
  - name: "xxx-job"
    enabled: true
    schedule: "00 18 * * *"
    url: "jobs/xxx"
  - name: "xxx-job"
    enabled: true
    schedule: "00 18 * * *"
    url: "jobs/xxx"
This way it works and generates all the Jobs defined in values.yaml. I was wondering, is there a better way to do this?
I have the same situation and am having trouble writing tests; we are using https://github.com/quintush/helm-unittest/blob/master/DOCUMENT.md for unit tests.
The problem is: which document index do we have to use to select each separate manifest? For example, in the case above it iterates four times, and in two of the cases the test will fail!
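For illustration, this is roughly how such a test would select one of the rendered documents — a sketch only, assuming helm-unittest's per-assertion documentIndex field and placeholder template/job names:
# tests/cronjob_test.yaml (hypothetical)
suite: cronjob manifests
templates:
  - cronjob.yaml
tests:
  - it: renders the first enabled job
    asserts:
      - isKind:
          of: CronJob
        documentIndex: 0
      - equal:
          path: metadata.name
          value: xxx-job
        documentIndex: 0
# Note: documentIndex counts only the documents that actually get rendered, so
# iterations skipped by the enabled flag shift the indexes of later documents,
# which is presumably why some of the iterations fail.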
I want to create a helm chart that results in a config map that looks like this:
apiVersion: v1
kind: ConfigMap
metadata:
  name: myconfigmap
data:
  myconfigfile1.properties: |
    property11 = value11
    property12 = value12
  myconfigfile2.properties: |
    property21 = value21
    property22 = value22
where this part shall be configurable in values.yaml:
myconfig:
  myconfigfile1.properties: |
    property11 = value11
    property12 = value12
  myconfigfile2.properties: |
    property21 = value21
    property22 = value22
Now I want to iterate over all the children of myconfig in values.yaml and add them to my Helm template. My attempt so far with this template:
apiVersion: v1
kind: ConfigMap
metadata:
  name: myconfigmap
data:
  # {{- range $key, $val := .Values.myconfig}}
  # {{ $key }}: |
  # {{ $val }}
  # {{- end }}
resulted in this error message:
$ helm install --dry-run --debug ./mychart/ --generate-name
install.go:159: [debug] Original chart version: ""
install.go:176: [debug] CHART PATH: /home/my/helmcharts/mychart
Error: YAML parse error on mychart/templates/myconfig.yaml: error converting YAML to JSON: yaml: line 11: could not find expected ':'
helm.go:84: [debug] error converting YAML to JSON: yaml: line 11: could not find expected ':'
YAML parse error on mychart/templates/myconfig.yaml
I can avoid the error by removing the | after myconfigfile1.properties: in my values.yaml, but then I lose the line breaks and the result is not what I want.
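For illustration, this is roughly what happens without the block scalar, as I understand YAML plain-scalar folding (not output I captured):
# values.yaml without the |
myconfig:
  myconfigfile1.properties:
    property11 = value11
    property12 = value12
# the multi-line plain scalar is folded into a single string,
#   "property11 = value11 property12 = value12"
# so the line break between the properties is lost in the rendered ConfigMap.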
Many thanks for your help in advance.
Kind regards,
Martin
A few minutes after writing this question I stumbled upon Question #62432632 convert-a-yaml-to-string-in-helm, which does not exactly answer my question, but with its help I could find the correct syntax.
values.yaml:
myconfig:
  myconfigfile1.properties: |-
    property11 = value11
    property12 = value12
  myconfigfile2.properties: |-
    property21 = value21
    property22 = value22
template:
apiVersion: v1
kind: ConfigMap
metadata:
  name: myconfigmap
data:
{{- range $name, $config := .Values.myconfig }}
  {{ $name }}: |-
{{ tpl $config $ | indent 4 }}
{{- end }}
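Rendering this with helm template should produce roughly the desired ConfigMap (a sketch based on the values above, not captured output):
apiVersion: v1
kind: ConfigMap
metadata:
  name: myconfigmap
data:
  myconfigfile1.properties: |-
    property11 = value11
    property12 = value12
  myconfigfile2.properties: |-
    property21 = value21
    property22 = value22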
The task is to range over the workers collection and, if the current worker has autoscaling.enabled=true, create an HPA for it.
I've tried to compare .autoscaling.enabled to "true", but it returned "error calling eq: incompatible types for comparison". Here people say that this actually means .autoscaling.enabled is nil. So {{ if .autoscaling.enabled }} somehow doesn't see the value and assumes it doesn't exist.
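To make the two attempts concrete, this is roughly what I tried (a sketch; the full template is shown further below):
{{- range .Values.workers }}
{{- /* attempt 1: comparing against the string "true" fails with
       "error calling eq: incompatible types for comparison"
       because .autoscaling.enabled is not a string (boolean or nil) */}}
{{- if eq .autoscaling.enabled "true" }}
...
{{- end }}
{{- /* attempt 2: plain truthiness check, which renders nothing for me */}}
{{- if .autoscaling.enabled }}
...
{{- end }}
{{- end }}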
Values:
...
workers:
  - name: worker1
    command: somecommand1
    memoryRequest: 500Mi
    memoryLimit: 1400Mi
    cpuRequest: 50m
    cpuLimit: 150m
    autoscaling:
      enabled: false
  - name: worker2
    command: somecommand2
    memoryRequest: 512Mi
    memoryLimit: 1300Mi
    cpuRequest: 50m
    cpuLimit: 150m
    autoscaling:
      enabled: false
  - name: workerWithAutoscaling
    command: somecommand3
    memoryRequest: 600Mi
    memoryLimit: 2048Mi
    cpuRequest: 150m
    cpuLimit: 400m
    autoscaling:
      enabled: true
      minReplicas: 1
      maxReplicas: 5
      targetCPUUtilization: 50
      targetMemoryUtilization: 50
...
template:
...
{{- range .Values.workers }}
{{- if .autoscaling.enabled }}
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  labels:
    ...
  name: "hpa-{{ .name }}-{{ $.Realeas.Name }}"
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ .name }}
  minReplicas: {{ .minReplicas }}
  maxReplicas: {{ .maxReplicas }}
  metrics:
  {{- with .targetCPUUtilization}}
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: {{ . }}
  {{- end }}
  {{- with .targetMemoryUtilization}}
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: {{ . }}
  {{- end }}
---
{{- end }}
{{- end }}
I expect a manifest for one HPA that targets workerWithAutoscaling, but the actual output is completely empty.
Your use of {{- range .Values.workers }} and {{- if .autoscaling.enabled }} is fine. You are not getting any values because .minReplicas, .maxReplicas, etc., are inside the .autoscaling scope.
See Modifying scope using with
Adding {{- with .autoscaling}} will solve the issue.
{{- range .Values.workers }}
{{- if .autoscaling.enabled }}
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  labels:
  name: "hpa-{{ .name }}-{{ $.Release.Name }}"
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ .name }}
  {{- with .autoscaling}}
  minReplicas: {{ .minReplicas }}
  maxReplicas: {{ .maxReplicas }}
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: {{ .targetCPUUtilization}}
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: {{ .targetMemoryUtilization}}
  {{- end }}
{{- end }}
{{- end }}
helm template .
---
# Source: templates/hpa.yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  labels:
  name: "hpa-workerWithAutoscaling-release-name"
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: workerWithAutoscaling
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: 50
I am trying to write a CronJob that executes a shell script stored in a ConfigMap for Kafka.
My intention is to reassign partitions at specific time intervals.
However, I am facing issues with it. I am very new to this, so any help would be appreciated.
cron-job.yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: partition-cron
spec:
  schedule: "*/10 * * * *"
  startingDeadlineSeconds: 20
  successfulJobsHistoryLimit: 5
  jobTemplate:
    spec:
      completions: 2
      template:
        spec:
          containers:
            - name: partition-reassignment
              image: busybox
              command: ["/configmap/runtimeConfig.sh"]
              volumeMounts:
                - name: configmap
                  mountPath: /configmap
          restartPolicy: Never
          volumes:
            - name: configmap
              configMap:
                name: configmap-config
configmap-config.yaml
{{- if .Values.topics -}}
{{- $zk := include "zookeeper.url" . -}}
apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: {{ template "kafka.fullname" . }}
    chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    heritage: "{{ .Release.Service }}"
    release: "{{ .Release.Name }}"
  name: {{ template "kafka.fullname" . }}-config
data:
  runtimeConfig.sh: |
    #!/bin/bash
    set -e
    cd /usr/bin
    until kafka-configs --zookeeper {{ $zk }} --entity-type topics --describe || (( count++ >= 6 ))
    do
      echo "Waiting for Zookeeper..."
      sleep 20
    done
    until nc -z {{ template "kafka.fullname" . }} 9092 || (( retries++ >= 6 ))
    do
      echo "Waiting for Kafka..."
      sleep 20
    done
    echo "Applying runtime configuration using {{ .Values.image }}:{{ .Values.imageTag }}"
    {{- range $n, $topic := .Values.topics }}
    {{- if and $topic.partitions $topic.replicationFactor $topic.reassignPartitions }}
    cat << EOF > {{ $topic.name }}-increase-replication-factor.json
    {"version":1, "partitions":[
      {{- $partitions := (int $topic.partitions) }}
      {{- $replicas := (int $topic.replicationFactor) }}
      {{- range $i := until $partitions }}
      {"topic":"{{ $topic.name }}","partition":{{ $i }},"replicas":[{{- range $j := until $replicas }}{{ $j }}{{- if ne $j (sub $replicas 1) }},{{- end }}{{- end }}]}{{- if ne $i (sub $partitions 1) }},{{- end }}
      {{- end }}
    ]}
    EOF
    kafka-reassign-partitions --zookeeper {{ $zk }} --reassignment-json-file {{ $topic.name }}-increase-replication-factor.json --execute
    kafka-reassign-partitions --zookeeper {{ $zk }} --reassignment-json-file {{ $topic.name }}-increase-replication-factor.json --verify
    {{- end }}
    {{- end -}}
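For reference, assuming a hypothetical topic named my-topic with partitions: 3, replicationFactor: 3 and reassignPartitions: true in values.yaml, the generated file would look roughly like this (my reading of the nested range, not captured output):
# my-topic-increase-replication-factor.json
{"version":1, "partitions":[
  {"topic":"my-topic","partition":0,"replicas":[0,1,2]},
  {"topic":"my-topic","partition":1,"replicas":[0,1,2]},
  {"topic":"my-topic","partition":2,"replicas":[0,1,2]}
]}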
My intention is to run the runtimeConfig.sh script as a cron job at regular intervals for partition reassignment in Kafka.
I am not sure if my approach is correct.
Also, I have randomly put image: busybox in the cron-job.yaml file. I am not sure what I should be putting in there.
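My current guess (untested, so treat it as an assumption) is that the container needs an image that ships bash and the Kafka CLI tools the script calls, for example the chart's own Kafka image, roughly like this:
containers:
  - name: partition-reassignment
    # assumption: reuse the chart's Kafka image so bash, kafka-configs and
    # kafka-reassign-partitions are available; busybox has none of them
    image: "{{ .Values.image }}:{{ .Values.imageTag }}"
    # run the ConfigMap script through bash so it doesn't need the executable bit
    command: ["/bin/bash", "/configmap/runtimeConfig.sh"]
    volumeMounts:
      - name: configmap
        mountPath: /configmap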
Information Part
$ kubectl get cronjobs
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
partition-cron */10 * * * * False 1 5m 12m
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
elegant-hedgehog-metrics-server-58995fcf8b-2vzg6 1/1 Running 0 5d
my-kafka-0 1/1 Running 1 12m
my-kafka-1 1/1 Running 0 10m
my-kafka-2 1/1 Running 0 9m
my-kafka-config-644f815a-pbpl8 0/1 Completed 0 12m
my-kafka-zookeeper-0 1/1 Running 0 12m
partition-cron-1548672000-w728w 0/1 ContainerCreating 0 5m
$ kubectl logs partition-cron-1548672000-w728w
Error from server (BadRequest): container "partition-reassignment" in pod "partition-cron-1548672000-w728w" is waiting to start: ContainerCreating
Modified Cron Job YAML
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: partition-cron
spec:
  schedule: "*/5 * * * *"
  startingDeadlineSeconds: 20
  successfulJobsHistoryLimit: 5
  jobTemplate:
    spec:
      completions: 1
      template:
        spec:
          containers:
            - name: partition-reassignment
              image: busybox
              command: ["/configmap/runtimeConfig.sh"]
              volumeMounts:
                - name: configmap
                  mountPath: /configmap
          restartPolicy: Never
          volumes:
            - name: configmap
              configMap:
                name: {{ template "kafka.fullname" . }}-config
Now the status of the CronJob pods is ContainerCannotRun.
You've set the ConfigMap to name: {{ template "kafka.fullname" . }}-config but in the job you are mounting configmap-config. Unless you installed the Helm chart using configmap as the name of the release, that Job will never start.
One way to fix it would be to define the volume as:
volumes:
  - name: configmap
    configMap:
      name: {{ template "kafka.fullname" . }}-config