I am trying to develop a Helm chart for an application to ease release management and deployment of the application to Kubernetes. To do so, I have written a pre-install hook in the Helm chart.
apiVersion: batch/v1
kind: Job
metadata:
  name: px-etcd-preinstall-hook
  labels:
    heritage: {{ .Release.Service | quote }}
    release: {{ .Release.Name | quote }}
    chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
  annotations:
    "helm.sh/hook": pre-install
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": hook-succeeded,hook-failed
spec:
  backoffLimit: 2
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: pre-install-job
          imagePullPolicy: Always
          image: "hrishi/px-etcd-preinstall-hook:v1"
          command: ['/bin/sh']
          args: ['/usr/bin/etcdStatus.sh', "{{ .Values.etcdEndPoint }}"]
This Docker container just checks whether an etcd endpoint is accessible. The idea is for it to wait a few seconds, retry a few times, and then exit.
Here is the initial shell script which runs as part of this container.
set -x
echo "Initializing..."

# First argument is the etcd endpoint passed in via the hook's args
svcname=$1
echo "$svcname"

# Strip everything up to and including the first ":" to get the URL part
etcdURL=$(echo "$svcname" | awk -F: '{ st = index($0,":"); print substr($0,st+1) }')
echo "$etcdURL"

# Query the etcd version endpoint and capture only the HTTP status code
response=$(curl --write-out '%{http_code}' --silent --output /dev/null "$etcdURL/version")
echo "$response"

if [[ "$response" != 200 ]]
then
    echo "Provided etcd url is not reachable. Exiting.."
    exit 1
fi
All is well and fine if the etcd URL is accessible, but if it is inaccessible I get an error stating Error: Job failed: BackoffLimitExceeded.
I want to check if there is a way of setting a user-friendly error message stating that the URL isn't accessible, or something like that.
It seems there isn't a way to do it right now, not that I know of. I tried making this just a Pod instead of a Job and that doesn't work either.
I looked at the Helm docs but couldn't find any information regarding this.
I don't think this is possible, but I'd take a different approach.
If your application requires etcd, why don't you check whether etcd is accessible in one of your Pod probes, such as a liveness or readiness probe? That way, if there is no connectivity between your application and etcd, your application won't start, and you'll see that the probe failed when describing your Pod, in a more Kubernetes-native way.
Furthermore, you can even make helm install wait until all the Pods are Ready, meaning that the command helm install would fail if your application couldn't connect to etcd.
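As a rough sketch of that approach (the container name, etcd URL and port below are assumptions, not values taken from the chart above), the application's Pod could expose a readiness probe that calls the same etcd version endpoint:

  containers:
    - name: myapp                      # hypothetical application container
      image: myapp:latest              # hypothetical image
      readinessProbe:
        exec:
          command:
            - /bin/sh
            - -c
            # probe fails (non-zero exit) when etcd does not answer on /version
            - curl -sf http://my-etcd:2379/version
        initialDelaySeconds: 5
        periodSeconds: 10
        failureThreshold: 3

Combined with helm install --wait (and --timeout if needed), the release then fails when the Pods never become Ready.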
Related
I have a k8s CronJob that runs my Docker image transaction-service.
It starts and gets its job done successfully. When it's over, I expect the pod to terminate, but the istio-proxy sidecar still lingers there, and that results in the Job's pod never completing.
Nothing too crazy, but I'd like to fix it.
I know I should call curl -X POST http://localhost:15000/quitquitquit
But I don't know where and how. I need to call that quitquitquit URL only when transaction-service is in a completed state. I read about the preStop lifecycle hook, but I think I need more of a "postStop" one. Any suggestions?
You have a few options here:
On your Job/CronJob spec, add the following lines and put your job command immediately after them:
command: ["/bin/bash", "-c"]
args:
- |
trap "curl --max-time 2 -s -f -XPOST http://127.0.0.1:15020/quitquitquit" EXIT
while ! curl -s -f http://127.0.0.1:15020/healthz/ready; do sleep 1; done
echo "Ready!"
< your job >
Disable Istio injection at the Pod level in your Job/Cronjob definition:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  ...
spec:
  ...
  jobTemplate:
    spec:
      template:
        metadata:
          annotations:
            # disable istio on the pod due to this issue:
            # https://github.com/istio/istio/issues/11659
            sidecar.istio.io/inject: "false"
Note: The annotation should be on the Pod's template, not on the Job's template.
You can also use the TTL mechanism for finished Jobs mentioned in the Kubernetes docs, which helps remove the whole pod.
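A minimal sketch of that mechanism (the field comes from the standard Jobs API; the Job name, image and TTL value here are placeholder assumptions):

  apiVersion: batch/v1
  kind: Job
  metadata:
    name: transaction-service-job          # hypothetical name
  spec:
    ttlSecondsAfterFinished: 120           # delete the Job and its pod ~2 minutes after it finishes
    template:
      spec:
        restartPolicy: Never
        containers:
          - name: transaction-service
            image: transaction-service:latest   # image name assumed from the question

Note that the TTL only kicks in once the Job actually finishes, so it complements rather than replaces the quitquitquit/injection options above.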
In my Dockerfile I put
ADD ./entrypoint.sh /entrypoint.sh
RUN ["chmod", "+x", "/entrypoint.sh"]
RUN apk --no-cache add curl
ENTRYPOINT ["/entrypoint.sh"]
My entrypoint.sh looks like this:
#!/bin/sh
/app/myapp && curl -X POST http://localhost:15000/quitquitquit
It works.
I'm running a pod with 3 containers (telegraf, fluentd and an in-house agent) that makes use of shareProcessNamespace: true.
I've written a Python script to fetch the initial config for telegraf and fluentd from a central controller API endpoint. Since this is a one-time operation, I plan to use a Helm post-install hook.
apiVersion: batch/v1
kind: Job
metadata:
  name: agent-postinstall
  annotations:
    "helm.sh/hook-weight": "3"
    "helm.sh/hook": "post-install"
spec:
  template:
    spec:
      containers:
        - name: agent-postinstall
          image: "{{ .Values.image.agent.repository }}:{{ .Values.image.agent.tag | default .Chart.AppVersion }}"
          imagePullPolicy: IfNotPresent
          command: ['python3', 'getBaseCfg.py']
          volumeMounts:
            - name: config-agent-volume
              mountPath: /etc/config
      volumes:
        - name: config-agent-volume
          configMap:
            name: agent-cm
      restartPolicy: Never
  backoffLimit: 1
It is required for the python script to check if telegraf/fluentd/agent processes are up, before getting the config. I intend to wait (with a timeout) until pgrep <telegraf/fluentd/agent> returns true and then fire APIs. Is there a way to enable shareProcessNamespace for the post-install hook as well? Thanks.
PS: Currently, the agent calls the Python script along with its own startup script. It works, but it is kludgy. I'd like to move it out of the agent container.
shareProcessNamespace
The most important aspect of this flag is that it only works within a single pod: all containers in that one pod share processes with each other.
In the described approach a Job is used. A Job creates a separate pod, so it won't work this way; the container would have to be part of the "main" pod, alongside all the other containers, to have access to that pod's running processes.
More details about process sharing.
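For reference, a minimal sketch of how the flag behaves inside a single pod (the pod name, container names and commands are illustrative only):

  apiVersion: v1
  kind: Pod
  metadata:
    name: shared-ns-example              # hypothetical pod name
  spec:
    shareProcessNamespace: true          # all containers below see each other's processes
    containers:
      - name: app
        image: nginx
      - name: sidecar
        image: busybox
        # pgrep works here because the nginx process is visible from this container
        command: ["sh", "-c", "sleep 5; pgrep nginx && echo nginx is visible; sleep 3600"]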
A possible way to solve it
It's possible to get processes from the containers directly using the kubectl command.
Below is an example of how to check the state of the processes using the pgrep command. The pgrepContainer container needs to have the pgrep command already installed.
job.yaml:
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ .Release.Name }}-postinstall-hook"
  annotations:
    "helm.sh/hook": post-install
spec:
  template:
    spec:
      serviceAccountName: config-user # service account with appropriate permissions is required using this approach
      volumes:
        - name: check-script
          configMap:
            name: check-script
      restartPolicy: Never
      containers:
        - name: post-install-job
          image: "bitnami/kubectl" # using this image with kubectl so we can connect to the cluster
          command: ["bash", "/mnt/script/checkScript.sh"]
          volumeMounts:
            - name: check-script
              mountPath: /mnt/script
And configmap.yaml, which contains the script and the logic that checks the three processes in a loop, for up to 60 iterations of 10 seconds each:
apiVersion: v1
kind: ConfigMap
metadata:
  name: check-script
data:
  checkScript.sh: |
    #!/bin/bash
    podName=test
    pgrepContainer=app-1
    process1=sleep
    process2=pause
    process3=postgres
    attempts=0
    until [ $attempts -eq 60 ]; do
      kubectl exec ${podName} -c ${pgrepContainer} -- pgrep ${process1} 1>/dev/null 2>&1 \
        && kubectl exec ${podName} -c ${pgrepContainer} -- pgrep ${process2} 1>/dev/null 2>&1 \
        && kubectl exec ${podName} -c ${pgrepContainer} -- pgrep ${process3} 1>/dev/null 2>&1
      if [ $? -eq 0 ]; then
        break
      fi
      attempts=$((attempts + 1))
      sleep 10
      echo "Waiting for all containers to be ready...$[ ${attempts}*10 ] s"
    done
    if [ $attempts -eq 60 ]; then
      echo "ERROR: Timeout"
      exit 1
    fi
    echo "All containers are ready !"
    echo "Configuring telegraf and fluentd services"
Final result will look like:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
test 2/2 Running 0 20m
test-postinstall-hook-dgrc9 0/1 Completed 0 20m
$ kubectl logs test-postinstall-hook-dgrc9
Waiting for all containers to be ready...10 s
All containers are ready !
Configuring telegraf and fluentd services
The above is another approach; you can use its logic as a base to achieve your end goal.
postStart
A postStart hook can also be considered as a place for this logic. It runs right after the container is created. Since the main application takes time to start and there's already logic which waits for it, it's not an issue that:
there is no guarantee that the hook will execute before the container ENTRYPOINT
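A rough sketch of that option (the image and script path are assumptions based on the question):

  containers:
    - name: agent
      image: my-agent:latest                     # hypothetical agent image
      lifecycle:
        postStart:
          exec:
            # runs right after the container is created, alongside the main process
            command: ["/bin/sh", "-c", "python3 getBaseCfg.py"]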
I have a .NET Core Web API deployed on Kubernetes. Every night at midnight I need to call an endpoint to do some operations on every pod, so I have deployed a CronJob that calls the API with curl, and the method does the required operations.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: test-cronjob
spec:
  schedule: "0 0 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: test-cronjob
              image: curlimages/curl:7.74.0
              imagePullPolicy: IfNotPresent
              command:
                - "/bin/sh"
                - "-ec"
                - |
                  date;
                  echo "doingOperation"
                  curl -X POST "serviceName/DailyTask"
          restartPolicy: OnFailure
But this only calls one pod, the one assigned by my ingress.
Is there a way to call every pod contained in a Service?
That is expected behavior: when we curl a Kubernetes Service object, the request is passed to only one of the endpoints (the IP of one of the pods). To achieve what you need, you have to write a custom script that first gets the endpoints associated with the service and then iteratively calls curl over them, one by one.
Note: The pod IPs can change due to pod re-creation, so you should fetch the endpoints associated with the service in each run of the cronjob.
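A minimal sketch of such a script (the service name, port and path are placeholders loosely taken from the question, and the CronJob's service account would need permission to read Endpoints):

  #!/bin/sh
  # fetch the pod IPs currently backing the service, then POST to each of them
  for ip in $(kubectl get endpoints serviceName -o jsonpath='{.subsets[*].addresses[*].ip}'); do
    curl -X POST "http://${ip}:8080/DailyTask"   # port 8080 is an assumption
  done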
You could run kubectl inside your job:
kubectl get pods -l mylabel=mylabelvalue \
-o go-template='{{range .items}}{{.status.podIP}}{{"\n"}}{{end}}'
This will return the internal IPs of all the pods with the specified label.
You can then loop over the addresses and run your command.
Since giving Pods rights on the API server, e.g. to look at the actual service endpoints, is cumbersome and poses a security risk, I recommend a simple scripting solution here.
First up, install a headless service for the deployment in question (a service with clusterIP=None), as sketched below. This will make your internal DNS server create several A records, each pointing at one of your Pod IPs.
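A sketch of such a headless service (the name, selector and port are placeholders that would need to match your deployment):

  apiVersion: v1
  kind: Service
  metadata:
    name: myapp-headless           # hypothetical name, used in the dig lookup below
  spec:
    clusterIP: None                # headless: DNS returns one A record per backing pod
    selector:
      app: myapp                   # must match the deployment's pod labels
    ports:
      - port: 80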
Secondly, in order to ping each Pod in a round-robin fashion from your CronJob, employ a little shell script along the lines below (you will need to run this from a container with dig and curl installed on it):
dig +noall +answer <name-of-headless-service>.<namespace>.svc.cluster.local | awk -F$'\t' '{curl="curl <your-protocol>://"$2":<your-port>/<your-endpoint>"; print curl}' | source /dev/stdin
I am facing a weird behaviour with kubectl and --dry-run.
To simplify, let's say that I have the following YAML file:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    run: nginx
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      run: nginx
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        run: nginx
    spec:
      containers:
        - image: nginxsdf
          imagePullPolicy: Always
          name: nginx
Modifying, for example, the image or the number of replicas:
kubectl apply -f Deployment.yaml -o yaml --dry-run outputs the resource with the OLD specification
kubectl apply -f Deployment.yaml -o yaml outputs the resource with the NEW specification
According to the documentation:
--dry-run=false: If true, only print the object that would be sent, without sending it.
However, the object printed is the old one and not the one that will be sent to the API server.
Tested on minikube, gke v1.10.0
In the meantime I opened a new GitHub issue for it:
https://github.com/kubernetes/kubernetes/issues/72644
I got the following answer in the kubernetes issue page:
When updating existing objects, kubectl apply doesn't send an entire object, just a patch. It is not exactly correct to print either the existing object or the new object in dry-run mode... the outcome of the merge is what should be printed.
For kubectl to be able to accurately reflect the result of the apply, it would need to have the server-side apply logic clientside, which is a non-goal.
Current efforts are directed at moving apply logic to the server. As part of that, the ability to dry-run server-side has been added. kubectl apply --server-dry-run will do what you want, printing the result of the apply merge, without actually persisting it.
#apelisse we should probably update the flag help for apply and possibly print a warning when using --dry-run when updating an object via apply to document the limitations of --dry-run and direct people to use --server-dry-run
The latest versions of the client use:
kubectl apply -f Deployment.yaml --dry-run=server
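For completeness, both dry-run modes exist in current kubectl releases; a quick sketch of the difference:

  # client-side: print the object locally, without the server-side apply merge
  kubectl apply -f Deployment.yaml -o yaml --dry-run=client

  # server-side: the API server performs the apply merge and returns the result,
  # without persisting it
  kubectl apply -f Deployment.yaml -o yaml --dry-run=server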
When I try to create a Deployment of type Job, it's not pulling any image.
Below is the .yaml:
apiVersion: batch/v1
kind: Job
metadata:
  name: copyartifacts
spec:
  backoffLimit: 1
  template:
    metadata:
      name: copyartifacts
    spec:
      restartPolicy: "Never"
      volumes:
        - name: sharedvolume
          persistentVolumeClaim:
            claimName: shared-pvc
        - name: dockersocket
          hostPath:
            path: /var/run/docker.sock
      containers:
        - name: copyartifacts
          image: alpine:3.7
          imagePullPolicy: Always
          command: ["sh", "-c", "ls -l /shared; rm -rf /shared/*; ls -l /shared; while [ ! -d /shared/artifacts ]; do echo Waiting for artifacts to be copied; sleep 2; done; sleep 10; ls -l /shared/artifacts; "]
          volumeMounts:
            - mountPath: /shared
              name: sharedvolume
Can you please guide here?
Regards,
Vikas
There could be two possible reasons for not seeing the pod.
The pod hasn't been created yet.
The pod has completed its task and terminated before you noticed.
1. Pod hasn't been created:
If the pod hasn't been created yet, you have to find out why the Job failed to create it. You can view the Job's events to see if there are any failure events. Use the following command to describe a Job.
kubectl describe job <job-name> -n <namespace>
Then check the Events: field. There might be events showing a pod creation failure with the respective reason.
2. Pod has completed and terminated:
Jobs are used to perform a one-time task rather than to serve an application that needs to maintain a desired state. When the task is complete, the pod goes to the Completed state and then terminates (but is not deleted). If your Job is intended for a task that does not take much time, the pod may terminate after completing the task before you have noticed.
As the pod is terminated, kubectl get pods will not show that pod. However, you will be able to see the pod using the kubectl get pods -a command, as it hasn't been deleted.
You can also describe the Job and check for a completion or success event.
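For instance (using the Job name from the question), these commands would confirm whether the Job's pod was created and ran to completion:

  kubectl describe job copyartifacts            # check Events and the "Pods Statuses" counts
  kubectl get pods -l job-name=copyartifacts    # Jobs label their pods with job-name=<job-name>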
If you used kind to create the K8s cluster, all of the cluster nodes run as Docker containers. If you rebooted your computer or VM, the cluster (pod) IP addresses may have changed, leading to failed communication between the cluster nodes. In this case, check the cluster manager logs; they contain the error message. The Job is created, but the pod is not.
Try re-creating the cluster, or adjust the node configuration for the IP addresses.