I'm trying to template a k8s deployment file with Helm.
I have these default values:
celery:
  name: celery
  nodeType: onDemand
  workers:
    - name: general
      replicas: 1
      threads: 4
I'm trying to override it with a values file, with no luck.
I tried:
celery:
  workers[0]:
    replicas: 770

celery:
  workers[1]:
    replicas: 770

celery:
  workers[general]:
    replicas: 770
I don't want to override the entire list.
I finally solved it by converting the list to a dict:
celery:
  name: celery
  nodeType: onDemand
  workers:
    general:
      replicas: 1
      threads: 4
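With the workers keyed by name, a single field can now be overridden from a separate values file without replacing the rest of the map. A minimal sketch, with illustrative file and release names:

# override.yaml
celery:
  workers:
    general:
      replicas: 770

helm upgrade --install myrelease ./mychart -f override.yaml

In the template, the workers map can then be iterated with something like {{- range $name, $worker := .Values.celery.workers }} ... {{- end }}.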
I am trying to execute a pre-install job using Helm charts. Can someone help me get the result of the command (a parameter in the YAML file) that I put in the file below:
apiVersion: batch/v1
kind: Job
metadata:
  name: pre-install-job
  annotations:
    "helm.sh/hook": "pre-install"
spec:
  template:
    spec:
      containers:
      - name: pre-install
        image: busybox
        imagePullPolicy: IfNotPresent
        command: ['sh', '-c', 'touch somefile.txt && echo $PWD && sleep 15']
      restartPolicy: OnFailure
      terminationGracePeriodSeconds: 0
  backoffLimit: 3
  completions: 1
  parallelism: 1
I want to know where somefile.txt is created and where the echo output is printed. The reason I know it is working is that "sleep 15" works: I see a 15-second difference between the start and end times of the pod.
Any file you create in a container environment is created inside the container filesystem. Unless you've mounted some storage into the container, the file will be lost as soon as the container exits.
Anything a Kubernetes process writes to its stdout will be captured by the Kubernetes log system. You can retrieve it using kubectl logs pre-install-job-... -c pre-install.
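If the created file itself needs to outlive the pod, some storage has to be mounted into the hook job. A minimal sketch, assuming a pre-existing PersistentVolumeClaim named pre-install-data (the claim name and mount path are illustrative):

apiVersion: batch/v1
kind: Job
metadata:
  name: pre-install-job
  annotations:
    "helm.sh/hook": "pre-install"
spec:
  template:
    spec:
      containers:
      - name: pre-install
        image: busybox
        command: ['sh', '-c', 'touch /data/somefile.txt && sleep 15']
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: pre-install-data
      restartPolicy: OnFailure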
I'm learning Helm to set up my 3 AWS EKS clusters - sandbox, staging, and production.
How can I set up my templates so that some values are derived based on which cluster the chart is being installed in? For example, in my myapp/templates/deployment.yaml I may want
apiVersion: apps/v1
kind: Deployment
metadata:
  ...
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
I may want replicas to be 1, 2, or 4 depending on whether I'm installing the chart in my sandbox, staging, or production cluster respectively. I want to do the same trick for the CPU and memory requests and limits of my pods, for example.
I was thinking of having something like this in my values.yaml file
environments:
  - sandbox
  - staging
  - production

perClusterValues:
  replicas:
    - 1
    - 2
    - 4
  cpu:
    requests:
      - 256m
      - 512m
      - 1024m
    limits:
      - 512m
      - 1024m
      - 2048m
  memory:
    requests:
      - 1024Mi
      - 1024Mi
      - 2048Mi
    limits:
      - 2048Mi
      - 2048Mi
      - 3072Mi
So if I install a helm chart in the sandbox environment, I want to be able to do
$ helm install myapp myapp --set environment=sandbox
apiVersion: apps/v1
kind: Deployment
metadata:
  ...
spec:
  {{- if not .Values.autoscaling.enabled }}
  # In pseudo-code, in my YAML files
  # Get the index value from .Values.environments list
  # based on pass-in environment parameter
  {{ $myIndex = indexOf .Values.environments .Value.environment }}
  replicas: {{ .Values.perClusterValues.replicas $myIndex }}
  {{- end }}
I hope you understand my logic, but what is the correct syntax? Or is this even a good approach?
You can use the helm install -f option to pass an extra YAML values file in, and this takes precedence over the chart's own values.yaml file. So using exactly the template structure you already have, you can provide alternate values files
# sandbox.yaml
autoscaling:
  enabled: false
replicaCount: 1

# production.yaml
autoscaling:
  enabled: true
replicaCount: 5
And then when you go to deploy the chart, run it with
helm install myapp . -f production.yaml
(You can also helm install --set replicaCount=3 to override specific values, but the --set syntax is finicky and unusual; using a separate YAML file per environment is probably easier. Some tooling might be able to take advantage of JSON files also being valid YAML to write out additional deploy-time customizations.)
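For completeness, if you do want the single-values-file lookup the question sketches, Helm templates can emulate indexOf with a range loop plus the built-in index function. A rough sketch, assuming the chart is installed with --set environment=sandbox and the values.yaml lists from the question:

{{- $envIndex := 0 }}
{{- range $i, $env := .Values.environments }}
{{- if eq $env $.Values.environment }}{{- $envIndex = $i }}{{- end }}
{{- end }}
replicas: {{ index .Values.perClusterValues.replicas $envIndex }}

That said, the per-environment values files above are usually easier to read and review.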
We are running a JMX script through Taurus using 2 containers in Kubernetes.
We are seeing only 50 users in the results instead of 100 (50 * 2 containers).
Can anyone please throw some light on whether we are missing something here?
We get two JTL files, and checking them individually or combined, the total users are still only 50. Is it related to the same thread names being generated and logged in the JTL files, or something else?
Here are the YAML details:
apiVersion: v1
kind: ConfigMap
metadata:
  name: joba
  namespace: AAA
data:
  protocol: "https"
  serverUrl: "testurl"
  users: "50"
  duration: "1m"
  nodeName: "Nodename"
---
apiVersion: batch/v1
kind: Job
metadata:
  name: perftest
  namespace: dev
spec:
  template:
    spec:
      containers:
      - args: ["split -l ${users} --numeric-suffixes Test.csv Test-; /bin/bash ./Shellscripttoread_assignvariables.sh;"]
        command: ["/bin/bash", "-c"]
        env:
        - name: JobNumber
          value: "00"
        envFrom:
        - configMapRef:
            name: job-multi
        image: imagepath
        name: ubuntu-00
        resources:
          limits:
            memory: "8000Mi"
            cpu: "2880m"
      - args: ["split -l ${users} --numeric-suffixes Test.csv Test-; /bin/bash ./Shellscripttoread_assignvariables.sh;"]
        command: ["/bin/bash", "-c"]
        env:
        - name: JobNumber
          value: "01"
        envFrom:
        - configMapRef:
            name: job-multi
        image: imagepath
        name: ubuntu-01
        resources:
          limits:
            memory: "8000Mi"
            cpu: "2880m"
Your YAML is very nice, but it doesn't tell us anything about how you launch JMeter or what the shell scripts you invoke are doing.
If you just kick off 2 separate JMeter instances by means of k8s, JMeter will determine the number of active threads from the .jtl file, and given that the Sampler/Transaction names are the same, JMeter "thinks" the tests were executed on one engine.
The workaround is to add e.g. the __machineName() or __machineIP() function to the sampler/transaction labels; this way JMeter will distinguish the results coming from different instances and you will see the real number of active threads.
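As a sketch, the function simply goes into the sampler or transaction label in the test plan (the sampler name here is illustrative):

Home Page - ${__machineName()}

so each pod's samples carry a distinct label in the .jtl files.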
The solution would be running your JMeter test in Distributed Mode: the master runs in one pod, the slaves in their own pods, and the master is responsible for transferring the .jmx script to the slaves and collecting the results from them.
I want a job to trigger every 15 minutes but it is consistently triggering every 30 minutes.
UPDATE:
I've simplified the problem by just running:
kubectl run hello --schedule="*/1 * * * *" --restart=OnFailure --image=busybox -- /bin/sh -c "date; echo Hello from the Kubernetes cluster"
As specified in the docs here: https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/
and yet the job still refuses to run on time.
$ kubectl get cronjobs
NAME     SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
hello    */1 * * * *   False     1        5m              30m
hello2   */1 * * * *   False     1        5m              12m
It took 25 minutes for the command line created cronjob to run and 7 minutes for the cronjob created from yaml. They were both finally scheduled at the same time so it's almost like etcd finally woke up and did something?
ORIGINAL ISSUE:
When I drill into an active job I see Status: Terminated: Completed, but
Age: 25 minutes, or something greater than 15.
In the logs I see that the Python script meant to run has completed its final print statement. The script takes about ~2 minutes to complete, based on its output file in S3. Then no new job is scheduled for 28 more minutes.
I have tried with different configurations:
Schedule: */15 * * * * AND Schedule: 0,15,30,45 * * * *
As well as
Concurrency Policy: Forbid AND Concurrency Policy: Replace
What else could be going wrong here?
Full config with identifying lines modified:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  labels:
    type: f-c
  name: f-c-p
  namespace: extract
spec:
  concurrencyPolicy: Forbid
  failedJobsHistoryLimit: 1
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      template:
        metadata:
          creationTimestamp: null
          labels:
            type: f-c
        spec:
          containers:
          - args:
            - /f_c.sh
            image: identifier.amazonaws.com/extract_transform:latest
            imagePullPolicy: Always
            env:
            - name: ENV
              value: prod
            - name: SLACK_TOKEN
              valueFrom:
                secretKeyRef:
                  key: slack_token
                  name: api-tokens
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  key: aws_access_key_id
                  name: api-tokens
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  key: aws_secret_access_key
                  name: api-tokens
            - name: F_ACCESS_TOKEN
              valueFrom:
                secretKeyRef:
                  key: f_access_token
                  name: api-tokens
            name: s-f-c
            resources: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          restartPolicy: Never
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30
  schedule: '*/15 * * * *'
  successfulJobsHistoryLimit: 1
  suspend: false
status: {}
After running these jobs in a test cluster I discovered that external circumstances prevented them from running as intended.
On the original cluster there were ~20k scheduled jobs. The built-in scheduler for Kubernetes is not yet capable of handling this volume consistently.
The maximum number of jobs that can be reliably run (within a minute of the time intended) may depend on the size of your master nodes.
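If runs are being skipped because the controller falls behind, one mitigation sometimes suggested is setting startingDeadlineSeconds so missed runs are only counted over a bounded window and a late run can still be started; a minimal sketch (the 200-second value is illustrative):

spec:
  schedule: '*/15 * * * *'
  startingDeadlineSeconds: 200
  concurrencyPolicy: Forbid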
Isn't that by design?
A cron job creates a job object about once per execution time of its schedule. We say “about” because there are certain circumstances where two jobs might be created, or no job might be created. We attempt to make these rare, but do not completely prevent them. Therefore, jobs should be idempotent.
Ref. https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/#cron-job-limitations
I'm trying to run the Confluent Kafka image in a Kubernetes environment and am facing:
FATAL [KafkaServer id=0] Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka.common.KafkaException: Found directory /var/lib/kafka/data, 'data' is not in the form of topic-partition or topic-partition.uniqueId-delete (if marked for deletion).
Kafka's log directories (and children) should only contain Kafka topic data.
My deployment config:
apiVersion: apps/v1beta2 # use apps/v1 on Kubernetes 1.9.0 and later
kind: Deployment
metadata:
  name: kafka-confluent
  labels:
    app: kafka-confluent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kafka-confluent
  template:
    metadata:
      labels:
        app: kafka-confluent
    spec:
      containers:
      - name: zookeeper-kafka
        image: zookeeper:3.5
        ports:
        - containerPort: 2181
      - name: kafka-confluent
        image: confluentinc/cp-kafka:4.0.0
        ports:
        - containerPort: 9092
        command:
        - sh
        - -c
        - "exec kafka-server-start /etc/kafka/server.properties \
          --override reserved.broker.max.id=2147483647 \
          --override zookeeper.connect=localhost:2181 \
          --override listeners=PLAINTEXT://:9092 \
          "
To solve this I've tried to mount an ephemeral volume like this:
volumes:
- name: kafka-data
  emptyDir: {}
...
volumeMounts:
- mountPath: /var/lib/kafka/data
  name: kafka-data
And to clear the data dir with an init container:
containers:
- name: cleaner
  image: busybox
  command: ['rm', '-rf', '/var/lib/kafka/data/*']
Both tries failed with the same result.
Also, if I run the image and list /var/lib/kafka/data/, the directory looks empty:
docker run --rm -it confluentinc/cp-kafka:4.0.0 bash
root@35087653f43a:/# ls /var/lib/kafka/data/ -al
total 8
drwxrwxrwx 2 root root 4096 Apr 16 09:59 .
drwxr-xr-x 3 root root 4096 Jan 3 19:20 ..
Is it your error?
Fix: add this to server.properties:
log.dirs=/var/lib/kafka/data
I have this working configuration:
volumeMounts:
- name: data
  mountPath: /var/lib/kafka
and in the command, override the log.dirs
kafka-server-start /etc/kafka/server.properties \
--override log.dirs=/var/lib/kafka
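For that volumeMounts entry to resolve, the pod spec also needs a matching volume; a minimal sketch using an ephemeral emptyDir (swap in a PersistentVolumeClaim if the topic data must survive pod restarts):

volumes:
- name: data
  emptyDir: {}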
On my Windows system I added this to server.properties:
log.dirs=F:\softwares\kafka_2.12-2.5.1\data
where
F:\softwares\kafka_2.12-2.5.1
is my Kafka directory, and the issue was resolved.
I created the data folder inside the Kafka directory myself; it was not there when I extracted Kafka with the 7-Zip tool.