How to run a command after initialization - Kubernetes

I would like to run a specific command after the deployment has been initialized successfully.
This is my YAML file:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: auth
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: auth
    spec:
      containers:
      - name: auth
        image: {{my-service-image}}
        env:
        - name: NODE_ENV
          value: "docker-dev"
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        ports:
        - containerPort: 3000
However, I would like to run a command for a db migration after (not before) the deployment has initialized successfully and the pods are running.
I can do it manually for every pod (with kubectl exec), but this is not very scalable.
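For reference, the manual approach would look something like the following; the migration command is only a placeholder for whatever my service actually runs:
kubectl get pods -l app=auth -o name | while read -r pod; do
  # run the (placeholder) migration command inside each pod
  kubectl exec "$pod" -- npm run db:migrate
done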

I resolved it using a lifecycle hook:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: auth
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: auth
    spec:
      containers:
      - name: auth
        image: {{my-service-image}}
        env:
        - name: NODE_ENV
          value: "docker-dev"
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        ports:
        - containerPort: 3000
        lifecycle:
          postStart:
            exec:
              command: ["/bin/sh", "-c", {{cmd}}]

You can use Helm to deploy a set of Kubernetes resources, and then use a Helm hook, e.g. post-install or post-upgrade, to run a Job in a separate Docker container. Set the Job to invoke the db migration. A Job runs one or more Pods to completion, so it fits here quite well.
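A minimal sketch of such a hook Job, assuming the migration is packaged in its own image (the image and command below are placeholders):
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
  annotations:
    # Helm runs this Job after install/upgrade and deletes it once it succeeds
    "helm.sh/hook": post-install,post-upgrade
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  backoffLimit: 2
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: db-migrate
        image: {{my-migration-image}}                      # placeholder image
        command: ["/bin/sh", "-c", "npm run db:migrate"]   # placeholder command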

I chose to use a readinessProbe instead.
My application requires configuration after the process has completely started, and the postStart command was running before the app was ready.
readinessProbe:
  exec:
    command: [healthcheck]
  initialDelaySeconds: 30
  periodSeconds: 2
  timeoutSeconds: 1
  successThreshold: 3
  failureThreshold: 10

Related

GCP Firestore: Server request fails with Missing or insufficient permissions from GKE

I am trying to connect to Firestore from code running in a GKE container. A simple REST GET API works fine, but when I read/write Firestore, I get Missing or insufficient permissions.
An unhandled exception was thrown by the application.
Info
2021-06-06 21:21:20.283 EDT
Grpc.Core.RpcException: Status(StatusCode="PermissionDenied", Detail="Missing or insufficient permissions.", DebugException="Grpc.Core.Internal.CoreErrorDetailException: {"created":"#1623028880.278990566","description":"Error received from peer ipv4:172.217.193.95:443","file":"/var/local/git/grpc/src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Missing or insufficient permissions.","grpc_status":7}")
at Google.Api.Gax.Grpc.ApiCallRetryExtensions.<>c__DisplayClass0_0`2.<<WithRetry>b__0>d.MoveNext()
Update: I am trying to provide the secret with the service account credentials to the pod.
Here is the k8s file, which deploys a pod to the cluster with no issues when no secrets are provided; GET operations that don't hit Firestore work fine.
kind: Deployment
apiVersion: apps/v1
metadata:
  name: foo-worldmanagement-production
spec:
  replicas: 1
  selector:
    matchLabels:
      app: foo
      role: worldmanagement
      env: production
  template:
    metadata:
      name: worldmanagement
      labels:
        app: foo
        role: worldmanagement
        env: production
    spec:
      containers:
      - name: worldmanagement
        image: gcr.io/foodev/foo/master/worldmanagement.21
        resources:
          limits:
            memory: "500Mi"
            cpu: "300m"
        imagePullPolicy: Always
        readinessProbe:
          httpGet:
            path: /api/worldManagement/policies
            port: 80
        ports:
        - name: worldmgmt
          containerPort: 80
Now, if I try to mount the secret, the pod never gets fully created, and it eventually fails:
kind: Deployment
apiVersion: apps/v1
metadata:
  name: foo-worldmanagement-production
spec:
  replicas: 1
  selector:
    matchLabels:
      app: foo
      role: worldmanagement
      env: production
  template:
    metadata:
      name: worldmanagement
      labels:
        app: foo
        role: worldmanagement
        env: production
    spec:
      volumes:
      - name: google-cloud-key
        secret:
          secretName: firestore-key
      containers:
      - name: worldmanagement
        image: gcr.io/foodev/foo/master/worldmanagement.21
        volumeMounts:
        - name: google-cloud-key
          mountPath: /var/
        env:
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: /var/key.json
        resources:
          limits:
            memory: "500Mi"
            cpu: "300m"
        imagePullPolicy: Always
        readinessProbe:
          httpGet:
            path: /api/worldManagement/earth
            port: 80
        ports:
        - name: worldmgmt
          containerPort: 80
I tried to deploy the sample application and it works fine.
If I keep only the following in the YAML file, the container gets deployed properly:
- name: google-cloud-key
  secret:
    secretName: firestore-key
But once I add the following to the YAML, it fails:
volumeMounts:
- name: google-cloud-key
  mountPath: /var/
env:
- name: GOOGLE_APPLICATION_CREDENTIALS
  value: /var/key.json
And I can see in the GCP events that the container is not able to find the google-cloud-key. Any idea how to troubleshoot this issue, i.e. why I am not able to mount the secret? I can bash into the pod if needed.
I am using a multi-stage Dockerfile made of:
FROM mcr.microsoft.com/dotnet/sdk:5.0 AS build
FROM mcr.microsoft.com/dotnet/aspnet:5.0 AS runtime
Thanks
Looks like the key itself might not be correctly visible to the pod. I would start by getting into the pod with kubectl exec --stdin --tty <podname> -- /bin/bash and ensuring that /var/key.json (per your config) is accessible and has the correct credentials.
The following would be a good way to mount the secret:
volumeMounts:
- name: google-cloud-key
  mountPath: /var/run/secret/cloud.google.com
env:
- name: GOOGLE_APPLICATION_CREDENTIALS
  value: /var/run/secret/cloud.google.com/key.json
The above assumes your secret was created with a command like:
kubectl --namespace <namespace> create secret generic firestore-key --from-file key.json
It is also important to check your Workload Identity setup; the Workload Identity page in the Kubernetes Engine documentation has a good section on this.
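If you go the Workload Identity route instead of mounting a key file, the binding usually looks like the following (project, namespace, and service account names are placeholders):
# Allow the Kubernetes service account to impersonate the Google service account
gcloud iam service-accounts add-iam-policy-binding firestore-gsa@my-project.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:my-project.svc.id.goog[my-namespace/my-ksa]"
# Annotate the Kubernetes service account so GKE knows which Google service account to use
kubectl annotate serviceaccount my-ksa --namespace my-namespace \
  iam.gke.io/gcp-service-account=firestore-gsa@my-project.iam.gserviceaccount.com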

Is it possible to use a bash script to do the liveness test in pod?

I'm currently setting up a Kubernetes cluster with 3 nodes on 3 different VMs, and each node runs 1 pod with the following Docker image: ethereum/client-go:stable.
The problem is that I want to do the health check using a bash script (because I have to test a lot of things), but I don't understand how I can get this file into each container deployed with my YAML deployment file.
I've tried adding a wget command in the YAML file to download my health check script from my GitHub repo, but it wasn't very clean from my point of view. Maybe there is another way?
My current deployment file:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: goerli
  name: goerli-deploy
spec:
  replicas: 3
  selector:
    matchLabels:
      app: goerli
  template:
    metadata:
      labels:
        app: goerli
    spec:
      containers:
      - image: ethereum/client-go:stable
        name: goerli-geth
        args: ["--goerli", "--datadir", "/test2"]
        env:
        - name: LASTBLOCK
          value: "0"
        - name: FAILCOUNTER
          value: "0"
        ports:
        - containerPort: 30303
          name: geth
        livenessProbe:
          exec:
            command:
            - /bin/sh
            - /test/health.sh
          initialDelaySeconds: 60
          periodSeconds: 100
        volumeMounts:
        - name: test
          mountPath: /test
      restartPolicy: Always
      volumes:
      - name: test
        hostPath:
          path: /test
I expect to put the health check script in /test/health.sh.
Any ideas?
This could be a perfect use case for an init container. Since the init container and the application container can use different images, they have different file systems inside the pod, so we need an emptyDir volume to share state between them.
For further detail, follow the link: init-containers
Thanks to Suresh Vishnoi:
A way to resolve my problem is to use an init container this way:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: goerli
  name: goerli-deploy
spec:
  replicas: 3
  selector:
    matchLabels:
      app: goerli
  template:
    metadata:
      labels:
        app: goerli
    spec:
      containers:
      - image: ethereum/client-go:stable
        name: goerli-geth
        args: ["--goerli", "--datadir", "/test2"]
        env:
        - name: LASTBLOCK
          value: "0"
        - name: FAILCOUNTER
          value: "0"
        ports:
        - containerPort: 30303
          name: geth
        livenessProbe:
          exec:
            command:
            - /bin/sh
            - /test/health.sh
          initialDelaySeconds: 60
          periodSeconds: 100
        volumeMounts:
        - name: test
          mountPath: /test
      initContainers:
      - name: healthcheck
        image: ethereum/client-go:stable
        # download into the shared volume mounted at /test
        command: ["wget", "-O", "/test/health.sh", "https://My-script-bash"]
        volumeMounts:
        - name: test
          mountPath: "/test"
      restartPolicy: Always
      volumes:
      - name: test
        emptyDir: {}
The downloaded file will be visible in /test/health.sh
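What goes into health.sh depends on what you want to test; as a rough sketch (the IPC path is assumed from the --datadir /test2 argument above, and the check itself is only an example):
#!/bin/sh
# Ask the local geth node for its current block number over IPC and
# fail the probe if it does not answer.
BLOCK=$(geth attach --exec 'eth.blockNumber' /test2/geth.ipc) || exit 1
echo "current block: ${BLOCK}"
exit 0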
If you're using Helm, look at chart tests: https://github.com/helm/helm/blob/master/docs/chart_tests.md. This covers the readinessProbe side though, not liveness.
For an advanced liveness probe, I'd run some kind of healthcheck sidecar which does all the advanced tests continuously via localhost and exposes a single /healthcheck endpoint, then use that endpoint in a liveness probe.
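A rough sketch of that layout, assuming a hypothetical sidecar image that runs the checks and serves /healthcheck on port 8080:
spec:
  containers:
  - name: goerli-geth
    image: ethereum/client-go:stable
    args: ["--goerli", "--datadir", "/test2"]
    livenessProbe:
      httpGet:
        path: /healthcheck   # served by the sidecar; containers share the pod network
        port: 8080
      initialDelaySeconds: 60
      periodSeconds: 30
  - name: healthcheck-sidecar
    image: my-registry/geth-healthcheck:latest   # hypothetical image running the checks
    ports:
    - containerPort: 8080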

Running mongodb stateful set on Kubernetes with istio

I am trying to set up MongoDB on Kubernetes with Istio. My StatefulSet is as follows:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: treeservice
  namespace: staging
spec:
  serviceName: tree-service-service
  replicas: 1
  selector:
    matchLabels:
      app: treeservice
  template:
    metadata:
      labels:
        app: treeservice
    spec:
      containers:
      - name: mongodb-cache
        image: mongo:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 30010
        volumeMounts:
        - name: mongodb-cache-data
          mountPath: /data/db
        resources:
          requests:
            memory: "4Gi" # 4 GB
            cpu: "1000m" # 1 CPU
          limits:
            memory: "4Gi" # 4 GB
            cpu: "1000m" # 1 CPU
        readinessProbe:
          exec:
            command:
            - mongo
            - --port
            - "30010"
            - --eval
            - db.stats()
          initialDelaySeconds: 60 # wait this long after starting the first time
          periodSeconds: 30 # polling interval
          timeoutSeconds: 60
        livenessProbe:
          exec:
            command:
            - mongo
            - --port
            - "30010"
            - --eval
            - db.stats()
          initialDelaySeconds: 60 # wait this long after starting the first time
          periodSeconds: 30 # polling interval
          timeoutSeconds: 60
        command: ["/bin/bash"]
        args: ["-c", "mongod --port 30010 --replSet test"] # bind to localhost
  volumeClaimTemplates:
  - metadata:
      name: mongodb-cache-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: fast
      resources:
        requests:
          storage: 300Gi
However, the pod is not created and I see the following error:
kubectl describe statefulset treeservice -n staging
Warning FailedCreate 1m (x159 over 1h) statefulset-controller create Pod treeservice-0 in StatefulSet treeservice failed error: Pod "treeservice-0" is invalid: spec.containers[1].env[7].name: Invalid value: "ISTIO_META_statefulset.kubernetes.io/pod-name": a valid environment variable name must consist of alphabetic characters, digits, '_', '-', or '.', and must not start with a digit (e.g. 'my.env-name', or 'MY_ENV.NAME', or 'MyEnvName1', regex used for validation is '[-._a-zA-Z][-._a-zA-Z0-9]*')
I assume treeservice is a valid pod name. Am I missing something?
I guess it's due to this issue, https://github.com/istio/istio/issues/9571, which is still open.
I made it work temporarily using the following annotation:
annotations:
  sidecar.istio.io/inject: "false"
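The annotation goes on the pod template of the StatefulSet (not on the StatefulSet's own metadata), e.g.:
  template:
    metadata:
      labels:
        app: treeservice
      annotations:
        sidecar.istio.io/inject: "false"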

PostStart hook seems to not work even though there is no failure

I have a pod running a Cassandra container. I want to create a keyspace once the container starts, so I tried using the postStart hook. For some reason it does not fail, but the keyspace does not get created. I tried the same command in the readinessProbe as a hack and it worked fine. Can someone help me understand what's wrong with my configuration? Thanks in advance.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-cass-volume-1
  labels:
    type: local
    app: test
spec:
  capacity:
    storage: 2Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /tmp/data/cass-volume-1
---
apiVersion: v1
kind: Service
metadata:
  name: test-cassandra
  labels:
    app: test
spec:
  ports:
  - port: 9042
  selector:
    app: test
    tier: cass
  clusterIP: None
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cass-pv-claim
  labels:
    app: test
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: test-cassandra
  labels:
    app: test
spec:
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: test
        tier: cass
    spec:
      containers:
      - image: cassandra:latest
        name: cassandra
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9042
          name: cass-port
        volumeMounts:
        - name: cass-persistent-storage
          mountPath: /var/lib/cassandra
        readinessProbe:
          exec:
            command: ["cqlsh", "-e", "CREATE KEYSPACE IF NOT EXISTS test1234 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;"]
          initialDelaySeconds: 30
          periodSeconds: 30
          timeoutSeconds: 10
          failureThreshold: 5
        lifecycle:
          postStart:
            exec:
              command: ["/bin/bash", "-c", "until echo $'CREATE KEYSPACE IF NOT EXISTS test908 WITH replication = {\'class\': \'SimpleStrategy\', \'replication_factor\': \'1\'} AND durable_writes = true;' | cqlsh ; do echo boo; sleep 2; done"]
      volumes:
      - name: cass-persistent-storage
        persistentVolumeClaim:
          claimName: cass-pv-claim
I had the same issue of postStart hook commands not executing. After going through this, I tried the readinessProbe trick and it worked for me as well. The issue was that I was executing curl commands to which the same service should reply, and even though it is a postStart hook, the application needed a few more seconds before it could respond. Adding a sleep command before executing the curl commands solved my issue.
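As a rough sketch of that workaround (the port, path, and delay are assumptions, not from my setup):
lifecycle:
  postStart:
    exec:
      # give the application a few seconds to start listening before calling it
      command: ["/bin/sh", "-c", "sleep 10 && curl -sf http://localhost:8080/internal/setup"]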

Failed to attach volume ... already being used by

I am running Kubernetes in a GKE cluster using version 1.6.6 and another cluster with 1.6.4. Both are experiencing issues with failing over GCE compute disks.
I have been simulating failures using kill 1 inside the container or killing the GCE node directly. Sometimes I get lucky and the pod will get created on the same node again. But most of the time this isn't the case.
Looking at the event log, it shows the mount error three times and then does nothing more. Without human intervention it never corrects itself; I am forced to kill the pod multiple times until it works. During maintenance this is a giant pain.
How do I get Kubernetes to fail over with volumes properly? Is there a way to tell the deployment to try a new node on failure? Is there a way to remove the 3-retry limit?
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: dev-postgres
  namespace: jolene
spec:
  revisionHistoryLimit: 0
  template:
    metadata:
      labels:
        app: dev-postgres
        namespace: jolene
    spec:
      containers:
      - image: postgres:9.6-alpine
        imagePullPolicy: IfNotPresent
        name: dev-postgres
        volumeMounts:
        - mountPath: /var/lib/postgresql/data
          name: postgres-data
        env:
          [ ** Removed, irrelevant environment variables ** ]
        ports:
        - containerPort: 5432
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - exec pg_isready
          initialDelaySeconds: 30
          timeoutSeconds: 5
          failureThreshold: 6
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - exec pg_isready --host $POD_IP
          initialDelaySeconds: 5
          timeoutSeconds: 3
          periodSeconds: 5
      volumes:
      - name: postgres-data
        persistentVolumeClaim:
          claimName: dev-jolene-postgres
I have tried this with and without PersistentVolume / PersistentVolumeClaim.
apiVersion: "v1"
kind: "PersistentVolume"
metadata:
name: dev-jolene-postgres
spec:
capacity:
storage: "1Gi"
accessModes:
- "ReadWriteOnce"
claimRef:
namespace: jolene
name: dev-jolene-postgres
gcePersistentDisk:
fsType: "ext4"
pdName: "dev-jolene-postgres"
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: dev-jolene-postgres
namespace: jolene
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
By default, every node is schedulable, so there is no need to mention it explicitly in the deployment. The feature for configuring retry limits is still in progress and can be tracked here: https://github.com/kubernetes/kubernetes/issues/16652