ArangoDB init container fails on minikube - kubernetes

I'm working on a NodeJS service which uses ArangoDB as datastore, and deployed on minikube. I use an initContainer directive in the kubernetes deployment manifest to ensure that the database is ready to receive connections before the application attempts to connect. The relevant portion of the kubernetes YAML is shown below:
apiVersion: apps/v1
kind: Deployment
metadata:
name: carservice
template:
spec:
initContainers:
- name: init-carservice
image: arangodb/arangodb:3.5.1
command: ['sh', 'c', 'arangosh --server.endpoint="https://${CARSERVICE_CARSERVICEDB_SERVICE_HOST}:${CARSERVICE_CARSERVICEDB_SERVICE_PORT}" --server.password=""; do echo waiting for database to be up; sleep 2; done;']
containers:
- name: carservice
image: carservice
imagePullPolicy: IfNotPresent
The challenge has been that sometimes the initContainer is able to wait for the database connection to be established successfully. Most of the other times, it randomly fails with the error:
ERROR caught exception: invalid endpoint spec: https://
Out of desperation, I changed the scheme to http, and it fails with a corresponding error:
ERROR caught exception: invalid endpoint spec: http://
My understanding of these errors is that the database is not able to recognize https and http in these instances, which is strange. The few times the initContainer bit worked successfully, I used https in the related command in the kubernetes spec.
I must add that the actual database (https://${CARSERVICE_CARSERVICEDB_SERVICE_HOST}:${CARSERVICE_CARSERVICEDB_SERVICE_PORT}) has been successfully deployed to minikube using kube-arangodb, and can be accessed through the web UI, so that bit is sorted.
What I'd like to know:
Is this the recommended way to wait for ArangoDB to connect using the initContainer directive, or do I have to use an entirely different approach?
What could be causing the error I'm getting? Am I missing something fundamental here?
Would be glad for any help.

The issue was that for those times the init container failed to connect to ArangoDB, the env variables were not correctly set. Therefore, I added another init container before that (since init containers are executed in sequence), that'd wait for the corresponding kubernetes "service" resource of the ArangoDB deployment to come up. That way, by the time the second init container would run, the env variables would be available.
The corresponding portion of kubernetes deployment YAML is shown as:
apiVersion: apps/v1
kind: Deployment
metadata:
name: carservice
template:
spec:
initContainers:
- name:init-db-service
image: busybox:1.28
command: ['sh', '-c', 'until nslookup carservice-carservicedb; do echo waiting for kubernetes service resource for db; sleep 2; done;']
- name: init-carservice
image: arangodb/arangodb:3.5.1
command: ['sh', 'c', 'arangosh --server.endpoint="https://${CARSERVICE_CARSERVICEDB_SERVICE_HOST}:${CARSERVICE_CARSERVICEDB_SERVICE_PORT}" --server.password=""; do echo waiting for database to be up; sleep 2; done;']
containers:
- name: carservice
image: carservice
imagePullPolicy: IfNotPresent

Related

Ensure pod deletion when one of container terminates [duplicate]

I have a Kubernetes JOB that does database migrations on a CloudSQL database.
One way to access the CloudSQL database from GKE is to use the CloudSQL-proxy container and then connect via localhost. Great - that's working so far. But because I'm doing this inside a K8s JOB the job is not marked as successfully finished because the proxy keeps on running.
$ kubectrl get po
NAME READY STATUS RESTARTS AGE
db-migrations-c1a547 1/2 Completed 0 1m
Even though the output says 'completed' one of the initially two containers is still running - the proxy.
How can I make the proxy exit on completing the migrations inside container 1?
The best way I have found is to share the process namespace between containers and use the SYS_PTRACE securityContext capability to allow you to kill the sidecar.
apiVersion: batch/v1
kind: Job
metadata:
name: my-db-job
spec:
template:
spec:
restartPolicy: OnFailure
shareProcessNamespace: true
containers:
- name: my-db-job-migrations
command: ["/bin/sh", "-c"]
args:
- |
<your migration commands>;
sql_proxy_pid=$(pgrep cloud_sql_proxy) && kill -INT $sql_proxy_pid;
securityContext:
capabilities:
add:
- SYS_PTRACE
- name: cloudsql-proxy
image: gcr.io/cloudsql-docker/gce-proxy:1.17
command:
- "/cloud_sql_proxy"
args:
- "-instances=$(DB_CONNECTION_NAME)=tcp:5432"
One possible solution would be a separate cloudsql-proxy deployment with a matching service. You would then only need your migration container inside the job that connects to your proxy service.
This comes with some downsides:
higher network latency, no pod local mysql communication
possible security issue if you provide the sql port to your whole kubernetes cluster
If you want to open cloudsql-proxy to the whole cluster you have to replace tcp:3306 with tcp:0.0.0.0:3306 in the -instance parameter on the cloudsql-proxy.
There are 3 ways of doing this.
1- Use private IP to connect your K8s job to Cloud SQL, as described by #newoxo in one of the answers. To do that, your cluster needs to be a VPC-native cluster. Mine wasn't and I was not whiling to move all my stuff to a new cluster. So I wasn't able to do this.
2- Put the Cloud SQL Proxy container in a separate deployment with a service, as described by #Christian Kohler. This looks like a good approach, but it is not recommended by Google Cloud Support.
I was about to head in this direction (solution #2) but I decided to try something else.
And here is the solution that worked for me:
3- You can communicate between different containers in the same Pod/Job using the file system. The idea is to tell the Cloud SQL Proxy container when the main job is done, and then kill the cloud sql proxy. Here is how to do it:
In the yaml file (my-job.yaml)
apiVersion: v1
kind: Pod
metadata:
name: my-job-pod
labels:
app: my-job-app
spec:
restartPolicy: OnFailure
containers:
- name: my-job-app-container
image: my-job-image:0.1
command: ["/bin/bash", "-c"]
args:
- |
trap "touch /lifecycle/main-terminated" EXIT
{ your job commands here }
volumeMounts:
- name: lifecycle
mountPath: /lifecycle
- name: cloudsql-proxy-container
image: gcr.io/cloudsql-docker/gce-proxy:1.11
command: ["/bin/sh", "-c"]
args:
- |
/cloud_sql_proxy -instances={ your instance name }=tcp:3306 -credential_file=/secrets/cloudsql/credentials.json &
PID=$!
while true
do
if [[ -f "/lifecycle/main-terminated" ]]
then
kill $PID
exit 0
fi
sleep 1
done
securityContext:
runAsUser: 2 # non-root user
allowPrivilegeEscalation: false
volumeMounts:
- name: cloudsql-instance-credentials
mountPath: /secrets/cloudsql
readOnly: true
- name: lifecycle
mountPath: /lifecycle
volumes:
- name: cloudsql-instance-credentials
secret:
secretName: cloudsql-instance-credentials
- name: lifecycle
emptyDir:
Basically, when your main job is done, it will create a file in /lifecycle that will be identified by the watcher added to the cloud-sql-proxy container, which will kill the proxy and terminate the container.
I hope it helps! Let me know if you have any questions.
Based on: https://stackoverflow.com/a/52156131/7747292
Doesn't look like Kubernetes can do this alone, you would need to manually kill the proxy once the migration exits. Similar question asked here: Sidecar containers in Kubernetes Jobs?
Google cloud sql has recently launched private ip address connectivity for cloudsql. If the cloud sql instance and kubernetes cluster is in same region you can connect to cloudsql without using cloud sql proxy.
https://cloud.google.com/sql/docs/mysql/connect-kubernetes-engine#private-ip
A possible solution would be to set the concurrencyPolicy: Replace in the job spec ... this will agnostically replace the current pod with the new instance whenever it needs to run again. But, you have to make sure that the subsequent cron runs are separated enough.
Unfortunately the other answers weren't working for me because of CloudSQLProxy running in a distroless environment where there is no shell.
I managed to get around this by bundling a CloudSQLProxy binary with my deployment and running a bash script to start up CloudSQLProxy followed by my app.
Dockerfile:
FROM golang:1.19.4
RUN apt update
COPY . /etc/mycode/
WORKDIR /etc/mycode
RUN chmod u+x ./scripts/run_migrations.sh
RUN chmod u+x ./bin/cloud_sql_proxy.linux-amd64
RUN go install
ENTRYPOINT ["./scripts/run_migrations.sh"]
Shell Script (run_migrations.sh):
#!/bin/sh
# This script is run from the parent directory
dbConnectionString=$1
cloudSQLProxyPort=$2
echo "Starting Cloud SQL Proxy"
./bin/cloud_sql_proxy.linux-amd64 -instances=${dbConnectionString}=tcp:5432 -enable_iam_login -structured_logs &
CHILD_PID=$!
echo "CloudSQLProxy PID: $CHILD_PID"
echo "Migrating DB..."
go run ./db/migrations/main.go
MAIN_EXIT_CODE=$?
kill $CHILD_PID;
echo "Migrations complete.";
exit $MAIN_EXIT_CODE
K8s (via Pulumi):
import * as k8s from '#pulumi/kubernetes'
const jobDBMigrations = new k8s.batch.v1.Job("job-db-migrations", {
metadata: {
namespace: namespaceName,
labels: appLabels,
},
spec: {
backoffLimit: 4,
template: {
spec: {
containers: [
{
image: pulumi.interpolate`gcr.io/${gcpProject}/${migrationsId}:${migrationsVersion}`,
name: "server-db-migration",
args: [
dbConnectionString,
],
},
],
restartPolicy: "Never",
serviceAccount: k8sSAMigration.metadata.name,
},
},
},
},
{
provider: clusterProvider,
});

Handling cronjobs in a Pod with multiple containers

I have a requirement in which I need to create a cronjob in kubernetes but the pod is having multiple containers (with single container its working fine).
Is it possible?
The requirement is something like this:
1. First container: Run the shell script to do a job.
2. Second container: run fluentbit conf to parse the log and send it.
Previously I thought to have a deployment in place and that is working fine but since that deployment was used just for 10 mins jobs I thought to make it a cron job.
Any help is really appreciated.
Also about the cronjob I am not sure if a pod can support multiple containers to do that same.
Thank you,
Sunny
Yes you can create a cronjob with multiple containers. CronJob is an abstraction on top of pod. So in the pod spec you can have multiple containers just like you can have in a normal pod. As an example
apiVersion: batch/v1beta1
kind: CronJob
metadata:
name: hello
namespace: default
spec:
schedule: "*/1 * * * *"
jobTemplate:
spec:
template:
spec:
containers:
- name: hello
image: busybox
args:
- /bin/sh
- -c
- date; echo Hello from the Kubernetes cluster
- name: app
image: alpine
command:
- echo
- Hello World!
restartPolicy: OnFailure
I need to agree with the answer provided by #Arghya Sadhu. It shows how you can run multi container Pod with a CronJob. Before the answer I would like to give more attention to the comment provided by #Chris Stryczynski:
It's not clear whether the containers are run in parallel or sequentially
It is not entirely clear if the workload that you are trying to run:
The requirement is something like this:
First container: Run the shell script to do a job.
Second container: run fluentbit conf to parse the log and send it.
could be used in parallel (both running at the same time) or require sequential approach (after X completed successfully, run Y).
If the workload could be run in parallel the answer provided by #Arghya Sadhu is correct, however if one workload is depending on another, I'd reckon you should be using initContainers instead of multi container Pods.
The example of a CronJob that implements the initContainer could be following:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
name: hello
spec:
schedule: "*/1 * * * *"
jobTemplate:
spec:
template:
spec:
restartPolicy: Never
containers:
- name: ubuntu
image: ubuntu
command: [/bin/bash]
args: ["-c","cat /data/hello_there.txt"]
volumeMounts:
- name: data-dir
mountPath: /data
initContainers:
- name: echo
image: busybox
command: ["bin/sh"]
args: ["-c", "echo 'General Kenobi!' > /data/hello_there.txt"]
volumeMounts:
- name: data-dir
mountPath: "/data"
volumes:
- name: data-dir
emptyDir: {}
This CronJob will write a specific text to a file with an initContainer and then a "main" container will display its result. It's worth to mention that the main container will not start if the initContainer won't succeed with its operations.
$ kubectl logs hello-1234567890-abcde
General Kenobi!
Additional resources:
Linchpiner.github.io: K8S multi container pods
Whats about sidecar container for logging as second container which keep running without exit code. Even the job might run the state of the job still failed.

Not able to start apache-nifi in aks

Hi all I am working on Nifi and I am trying to install it in AKS (Azure kubernetes service).
Using nifi 1.9.2 version. While installing it in AKS gives me an error
replacing target file /opt/nifi/nifi-current/conf/nifi.properties
sed: preserving permissions for ‘/opt/nifi/nifi-current/conf/sedSFiVwC’: Operation not permitted
replacing target file /opt/nifi/nifi-current/conf/nifi.properties
sed: preserving permissions for ‘/opt/nifi/nifi-current/conf/sedK3S1JJ’: Operation not permitted
replacing target file /opt/nifi/nifi-current/conf/nifi.properties
sed: preserving permissions for ‘/opt/nifi/nifi-current/conf/sedbcm91T’: Operation not permitted
replacing target file /opt/nifi/nifi-current/conf/nifi.properties
sed: preserving permissions for ‘/opt/nifi/nifi-current/conf/sedIuYSe1’: Operation not permitted
NiFi running with PID 28.
The specified run.as user nifi
does not exist. Exiting.
Received trapped signal, beginning shutdown...
Below is my nifi.yml file
apiVersion: apps/v1
kind: Deployment
metadata:
name: nifi-core
spec:
replicas: 1
selector:
matchLabels:
app: nifi-core
template:
metadata:
labels:
app: nifi-core
spec:
containers:
- name: nifi-core
image: my-azurecr.io/nifi-core-prod:1.9.2
env:
- name: NIFI_WEB_HTTP_PORT
value: "8080"
- name: NIFI_VARIABLE_REGISTRY_PROPERTIES
value: "./conf/custom.properties"
resources:
requests:
cpu: "6"
memory: 12Gi
limits:
cpu: "6"
memory: 12Gi
ports:
- containerPort: 8080
volumeMounts:
- name: my-nifi-core-conf
mountPath: /opt/nifi/nifi-current/conf
volumes:
- name: my-nifi-core-conf
azureFile:
shareName: my-file-nifi-core/nifi/conf
secretName: my-nifi-secret
readOnly: false
I have some customization in nifi Dockerfile, which copies some config files related to my configuration. When I ran my-azurecr.io/nifi-core-prod:1.9.2 docker image on my local it works as expected
But when I try to run it on AKS its giving above error. since its related to permissions I have tried with both user nifi and root in Dockerfile.
All the required configuration files are provided in volume my-nifi-core-conf running in same resourse group.
Since I am starting nifi with docker my exception is, it will behave same regardless of environment. Either on my local or in AKS.
But error also say user nifi does not exist. The official nifi-image setup the user requirement.
Can anyone help, I cant event start container in interaction mode as pods in not in running mode. Thanks in advance.
I think your missing the Security Context definition for your Kubernetes Pod. The user that Nifi runs under within a Docker has a specific UID and GID, and with the error message you getting, I would suspect that because that user is not defined in the Pod's security context it's not launching as expected.
Have a look at section on the Kubernetes documentation about security contexts, and that should be enough get you started.
I would also have a look at using something like Minikube when testing Kubernetes deployments as Kubernetes adds a large number of controls around a container engine like Docker.
Security Contexts Docs: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
Minikube: https://kubernetes.io/docs/setup/learning-environment/minikube/
If you never figured this out, I was able to do this by running an initContainer before the main container, and changing the directory perms there.
initContainers:
- name: init1
image: busybox:1.28
volumeMounts:
- name: nifi-pvc
mountPath: "/opt/nifi/nifi-current"
command: ["sh", "-c", "chown -R 1000:1000 /opt/nifi/nifi-current"] #or whatever you want to do as root
update: does not work with nifi 1.14.0 - works with 1.13.2

How to create a mongo database per service with Docker

I am working towards having multiple services (NodeJS, Spring-boot) that each have their own MongoDB Database-server-per-service (eventually targeting GCP & K8s) so that I can keep the data separate. I will be using Docker compose to launch both the service and database together. However, when I run multiple services, naturally I get port collision. Here is a typical docker-compose file:
version: '3'
# Define the services/containers to be run
services:
myapp: #name of your service
build: ./ # specify the directory of the Dockerfile
ports:
- "3000:3000" #specify ports forwarding
links:
- database # link this service to the database service
volumes:
- .:/usr/src/app
depends_on:
- database
database: # name of the service
image: mongo # specify image to build container from
volumes:
- ./data:/data/db
ports:
- "27017:27017"
I am looking for an example of how to do this. My thinking is that each compose file will have it's own ports and each service will map to those ports internally?
You can make yaml for deployment and explain all your containers in one pod (a Pod is a group of containers). Your deployment may look like this:
apiVersion: apps/v1
kind: Deployment
metadata:
name: application-deployment
labels:
app: application
spec:
selector:
matchLabels:
app: application
template:
metadata:
labels:
app: application
spec:
containers:
- name: application
image: application:version
ports:
- containerPort: 3000
name: database
image: database:version
ports:
- containerPort: 27017
It is just deployment inside your cluster. You need to expose it outside the cluster. I recommend you to use Ingress for that.
Here you will have the database inside the pod. Also you can create 2 deployments for database and your app in the same namespace.
Also, you need to build Docker images manually or use the CI tool for that. You can manage environments ( prod, pre-prod, dev, test ) by namespaces. One namespace for one environment will give you full isolation. Also, to manage all this, I recommend you to use tools like Helm or kops.
There are a lot of differences between Kubernetes and Docker-compose, but the main difference is design. In Kubernetes, you have more entities for each level of application, and you can manage them. In Docker-compose, you configure all as one service in one place and usually it is hard to manage some specific things.

What is the equivalent for depends_on in kubernetes

I have a docker compose file with the following entries
version: '2.1'
services:
mysql:
container_name: mysql
image: mysql:latest
volumes:
- ./mysqldata:/var/lib/mysql
environment:
MYSQL_ROOT_PASSWORD: 'password'
ports:
- '3306:3306'
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:3306"]
interval: 30s
timeout: 10s
retries: 5
test1:
container_name: test1
image: test1:latest
ports:
- '4884:4884'
- '8443'
depends_on:
mysql:
condition: service_healthy
links:
- mysql
The Test-1 container is dependent on mysql and it needs to be up and running.
In docker this can be controlled using health check and depends_on attributes.
The health check equivalent in kubernetes is readinessprobe which i have already created but how do we control the container startup in the pod's?????
Any directions on this is greatly appreciated.
My Kubernetes file:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: deployment
spec:
replicas: 1
template:
metadata:
labels:
app: deployment
spec:
containers:
- name: mysqldb
image: "dockerregistry:mysqldatabase"
imagePullPolicy: Always
ports:
- containerPort: 3306
readinessProbe:
tcpSocket:
port: 3306
initialDelaySeconds: 15
periodSeconds: 10
- name: test1
image: "dockerregistry::test1"
imagePullPolicy: Always
ports:
- containerPort: 3000
That's the beauty of Docker Compose and Docker Swarm... Their simplicity.
We came across this same Kubernetes shortcoming when deploying the ELK stack.
We solved it by using a side-car (initContainer), which is just another container in the same pod thats run first, and when it's complete, kubernetes automatically starts the [main] container. We made it a simple shell script that is in loop until Elasticsearch is up and running, then it exits and Kibana's container starts.
Below is an example of a side-car that waits until Grafana is ready.
Add this 'initContainer' block just above your other containers in the Pod:
spec:
initContainers:
- name: wait-for-grafana
image: darthcabs/tiny-tools:1
args:
- /bin/bash
- -c
- >
set -x;
while [[ "$(curl -s -o /dev/null -w ''%{http_code}'' http://grafana:3000/login)" != "200" ]]; do
echo '.'
sleep 15;
done
containers:
.
.
(your other containers)
.
.
This was purposefully left out. The reason being is that applications should be responsible for their connect/re-connect logic for connecting to service(s) such as a database. This is outside the scope of Kubernetes.
While I don't know the direct answer to your question except this link (k8s-AppController), I don't think it's wise to use same deployment for DB and app. Because you are tightly coupling your db with app and loosing awesome k8s option to scale any one of them as needed. Further more if your db pod dies you loose your data as well.
Personally what I would do is to have a separate StatefulSet with Persistent Volume for database and Deployment for app and use Service to make sure their communication.
Yes I have to run few different commands and may need at least two separate deployment files but this way I am decoupling them and can scale them as needed. And my data is being persistent as well!
As mentioned, you should run the database and the application containers in separate pods and connect them with a service.
Unfortunately, both Kubernetes and Helm don't provide a functionality similar to what you've described. We had many issues with that and tried a few approaches until we have decided to develop a smallish utility that solved this problem for us.
Here's the link to the tool we've developed: https://github.com/Opsfleet/depends-on
You can make pods wait until other pods become ready according to their readinessProbe configuration. It's very close to Docker's depends_on functionality.
In Kubernetes terminology one your docker-compose set is a Pod.
So, there is no depends_on equivalent there. Kubernetes will check all containers in a pod and they all have to be alive for a mark that pod as Healthy and will always run them together.
In your case, you need to prepare configuration of Deployment like that:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: my-app
spec:
replicas: 1
template:
metadata:
labels:
app: app-and-db
spec:
containers:
- name: app
image: nginx
ports:
- containerPort: 80
- name: db
image: mysql
ports:
- containerPort: 3306
After pod will be started, your database will be available on localhost interface for your application, because of network conception:
Containers within a pod share an IP address and port space, and can find each other via localhost. They can also communicate with each other using standard inter-process communications like SystemV semaphores or POSIX shared memory.
But, as #leninhasda mentioned, it is not a good idea to run database and application in your pod and without Persistent Volume. Here is a good tutorial on how to run a stateful application in the Kubernetes.
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
what about liveness and readiness ??? supports commands, http requests and more
apiVersion: v1
kind: Pod
metadata:
labels:
test: liveness
name: liveness-exec
spec:
containers:
- name: liveness
image: k8s.gcr.io/busybox
args:
- /bin/sh
- -c
- touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
livenessProbe:
exec:
command:
- cat
- /tmp/healthy
initialDelaySeconds: 5
periodSeconds: 5