Running a pod/container in Kubernetes that applies maintenance to a DB

I have found several people asking about how to start a container running a DB, then run a different container that runs maintenance/migration on the DB which then exits. Here are all of the solutions I've examined and what I think are the problems with each:
Init Containers - This won't work because these run before the main container is up, and they block the main container from starting until they complete successfully.
Post Start Hook - If the postStart hook could start containers rather than simply exec a command inside the container then this would work. Unfortunately, the container with the DB does not (and should not) contain the rather large maintenance application required to run it this way. This would be a violation of the principle that each component should do one thing and do it well.
Sidecar Pattern - This WOULD work if the restartPolicy were assignable or overridable at the container level rather than the pod level. In my case the maintenance container should terminate successfully before the pod is considered Running (just like would be the case if the postStart hook could run a container) while the DB container should Always restart.
Separate Pod - Running the maintenance as a separate pod can work, but the DB shouldn't be considered up until the maintenance runs. That means managing the Running state has to be done completely independently of Kubernetes. Every other container/pod in the system will have to do a custom check that the maintenance has run rather than a simple check that the DB is up.
Using a Job - Unless I misunderstand how these work, this would be equivalent to the above ("Separate Pod").
OnFailure restart policy with a Sidecar - This means using a restartPolicy of OnFailure for the pod but then hacking the DB container so that it always exits with an error. This is doable but obviously just a hacked workaround. EDIT: This also causes problems with the state of the pod. While the maintenance container is running and both containers are up, the pod's state is Ready, but once the maintenance container exits, even with a success (exit code 0), the pod's state goes to NotReady 1/2.
Is there an option I've overlooked or something I'm missing about the above solutions? Thanks.

One option would be to use the Sidecar pattern with 2 slight changes to the approach you described:
after the maintenance command is executed, you keep the container running with a while : ; do sleep 86400; done command or something similar.
You set an appropriate startupProbe in place that resolves successfully only when your maintenance command is executed successfully. You could for example create a file /maintenance-done and use a startupProbe like this:
startupProbe:
  exec:
    command:
      - cat
      - /maintenance-done
  initialDelaySeconds: 5
  periodSeconds: 5
With this approach you have the following outcome:
Having the same restartPolicy for both your database and sidecar containers works fine thanks to the sleep hack.
Your Pod only becomes ready when both containers are ready. In the sidecar container's case this happens when the startupProbe succeeds.
Furthermore, there will be no noticeable overhead in your pod: even if the sidecar container keeps running, it will consume close to zero resources since it is only running the sleep command.
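Putting the two changes together, the pod spec could look roughly like this. This is only a minimal sketch: the image names, the run-maintenance.sh script and the /maintenance-done marker file are placeholders for whatever your actual DB and maintenance tooling look like.

apiVersion: v1
kind: Pod
metadata:
  name: db-with-maintenance
spec:
  restartPolicy: Always
  containers:
    - name: db
      image: postgres:15                 # placeholder DB image
    - name: maintenance
      image: my-maintenance:latest       # placeholder maintenance image
      # run the maintenance, drop a marker file on success, then sleep forever
      command: ["/bin/sh", "-c"]
      args:
        - ./run-maintenance.sh && touch /maintenance-done; while :; do sleep 86400; done
      startupProbe:
        exec:
          command:
            - cat
            - /maintenance-done
        initialDelaySeconds: 5
        periodSeconds: 5
        failureThreshold: 60             # give the maintenance enough time before the kubelet restarts the container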

Related

Use docker-compose to execute a script right before the container is stopped

Let's say I want to execute a cleanup script whenever container termination is triggered. How do I go about this using docker-compose?
This could be handy to automatically back up the files, databases, etc for the dev container.
docker containers are meant to be ephemeral:
By "ephemeral", we mean that the container can be stopped and destroyed, then rebuilt and replaced with an absolute minimum set up and configuration.
Building upon this concept, docker itself does not offer anything to hook into the shutdown process. docker-compose is built on top of docker and also does not add such functionality.
Maybe you can rethink your problem in a more docker-native way to better fit the intended use of docker. Without further context it is hard to say what a good solution would be, but maybe one of the following approaches helps you out (see the compose sketch after this list):
docker stop sends a SIGTERM signal to the main process in the container. You could use a custom entrypoint or supervisor process that triggers the appropriate actions on SIGTERM. This approach requires custom containers. With the stop_signal attribute you can also configure a custom signal to be sent in your docker-compose.yml
if you just want to persist data files from the containers, configuring the right volumes might be enough
you could use docker events to listen and act upon any type of event emitted by the docker daemon
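For the first approach, a compose file along these lines could work. This is a rough sketch under the assumption that your image has /bin/sh available; my-app:latest, /opt/start-app.sh and /opt/cleanup.sh are hypothetical names.

services:
  app:
    image: my-app:latest          # hypothetical image
    stop_grace_period: 30s        # give the cleanup time to finish before docker sends SIGKILL
    # wrap the real entrypoint so the cleanup script fires on SIGTERM
    # (the $$ is needed so compose does not try to interpolate $CHILD itself)
    command:
      - /bin/sh
      - -c
      - |
        trap '/opt/cleanup.sh; kill "$$CHILD"' TERM
        /opt/start-app.sh &
        CHILD=$$!
        wait "$$CHILD"
        wait "$$CHILD"            # re-wait so the app can finish shutting down after the trap fired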

Kubernetes - run job after pod status is ready

I am basically looking for mechanics similar to init containers, with the caveat that I want it to run after the pod is ready (responds to a readinessProbe, for instance). Are there any hooks that can be applied to the readinessProbe, so that it can fire a job after the first successful probe?
thanks in advance
You can use a lifecycle hook on the pod, or rather on the container.
For example:
lifecycle:
  postStart:
    exec:
      command: ["/bin/sh", "-c", "echo In postStart > /dev/termination-log"]
It's a postStart hook, so I thought it would work.
But the postStart hook is asynchronous: it is triggered as soon as the container is started, possibly even before the container's entrypoint has run.
Update
The postStart hook above runs as soon as the container is created, not when it is Ready.
So if you are looking for the moment the container becomes ready, you have to use a startup probe instead.
A startup probe is like a readiness or liveness probe, but it only runs until it first succeeds. The startup probe checks for application readiness; once the application is ready, the liveness probe takes its place.
Read more about startup probes.
So from the startup probe you can invoke the job or run any kind of shell script. The probe stops running once it succeeds, which here is after your application returns 200 on its /healthz endpoint.
startupProbe:
  exec:
    command:
      - /bin/bash
      - -c
      - ./run-after-ready.sh
  failureThreshold: 30
  periodSeconds: 10
The file run-after-ready.sh inside the container:
#!/bin/sh
curl -f -s -I "http://localhost/healthz" > /dev/null 2>&1 && echo OK || echo FAIL
# ...your extra code or logic (wait, sleep, etc.) can go here
You can add more checks or conditions to the shell script, for example verifying that the application is ready or calling some API, as needed.
I don't think there is anything in vanilla k8s that can achieve this right now. However, there are 2 options to go about this:
If it is fine to retry the initialization task multiple times until it succeeds, then I would just start the task as a Job at the same time as the pod you want to initialize. This is the easiest option, but it might be a bit slow because of exponential backoff.
If it is critical that the initialization task only runs after the pod is ready, or if you don't want the task to waste time failing and backing off a few times, then you should still run that task as a Job, but this time have it watch the pod in question using the k8s API and execute the task as soon as the pod becomes ready (see the sketch below).
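A rough sketch of that second option: run the initialization as a Job whose only extra step is to block on kubectl wait until the target pod reports Ready. The label selector app=my-db, the run-init-task.sh script and the service account name are placeholders, and the service account needs RBAC permission to get/list/watch pods.

apiVersion: batch/v1
kind: Job
metadata:
  name: run-after-ready
spec:
  backoffLimit: 3
  template:
    spec:
      restartPolicy: Never
      serviceAccountName: pod-watcher        # hypothetical SA with get/list/watch on pods
      containers:
        - name: init-task
          image: bitnami/kubectl:latest      # any image that ships kubectl works
          command:
            - /bin/sh
            - -c
            - |
              # block until the target pod is Ready, then run the actual task
              kubectl wait --for=condition=Ready pod -l app=my-db --timeout=600s
              ./run-init-task.sh              # placeholder for the real initialization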

wait for other deployments to start running before other can be created?

I am creating the deployments/services using REST APIs. I send POST request with bodies which contain the JSON objects which create the applications on Openshift. After I call all the APIs, these objects get instantiated.
I have 2 deployments which depend on a mongodb deployment, but mongodb takes a little longer to start running, while the two dependent deployments start running earlier. This breaks the code inside the 2 deployments, as the mongodb connection fails (since it is not up yet).
There could be 2 possible ways I can fix this problem:
I put a delay after I create the mongodb deployment and repeatedly call the API to check whether it is running or not.
Just like in docker-compose, where the depends_on key tells docker-compose that all dependencies should be started first and only then the dependent container.
Is there any way this could be achieved in openshift?
Instead of implementing complex logic for dependency handling, use the health-checking mechanism of Kubernetes. If your application starts and cannot reach Mongo DB, let it crash. Kubernetes will keep restarting it until Mongo DB comes online and your application becomes healthy and starts serving. Kubernetes won't send traffic to instances that aren't yet healthy.
Docs: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
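As an illustration, the dependent deployment could get probes roughly like this (a minimal sketch; the port and the /healthz path are assumptions about your application, which should only answer 200 once its Mongo DB connection works). With the default restartPolicy of Always, a crashing app is simply restarted until Mongo DB is reachable, and the readiness probe keeps it out of the Service until then.

containers:
  - name: my-app                   # hypothetical app that needs mongodb
    image: my-app:latest
    readinessProbe:
      httpGet:
        path: /healthz             # should return 200 only once the Mongo DB connection works
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 30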
Just like in docker-compose, where the depends_on key tells docker-compose that all dependencies should be started first and only then the dependent container.
You might want to look into Init Containers for the dependent containers. They run to completion before the app containers are actually started. The excerpt below is taken from the reference documentation (linked below) and covers use cases that might be applicable to your issue:
They run to completion before any app Containers start, whereas app Containers run in parallel, so Init Containers provide an easy way to block or delay the startup of app Containers until some set of preconditions are met.
Examples
Here are some ideas for how to use Init Containers:
Wait for a service to be created with a shell command like:
for i in {1..100}; do sleep 1; if dig myservice; then exit 0; fi; done; exit 1
Register this Pod with a remote server from the downward API with a command like:
curl -X POST http://$MANAGEMENT_SERVICE_HOST:$MANAGEMENT_SERVICE_PORT/register -d 'instance=$()&ip=$()'
Wait for some time before starting the app Container with a command like sleep 60.
Reference documentation:
https://kubernetes.io/docs/concepts/workloads/pods/init-containers/
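Applied to the mongodb case, the first idea above could look something like this inside the dependent deployment's pod template. This is a sketch only: it assumes mongodb is exposed through a Service named mongodb in the same namespace, and my-app is a placeholder for the dependent container. A stricter variant could probe the mongodb port instead of just checking DNS.

initContainers:
  - name: wait-for-mongodb
    image: busybox:1.36
    command:
      - sh
      - -c
      - |
        # block until the mongodb service resolves in cluster DNS
        until nslookup mongodb; do
          echo "waiting for mongodb"; sleep 2
        done
containers:
  - name: my-app
    image: my-app:latest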
Alex has pointed out the correct practice to follow with Kubernetes. But if you still want to depend directly on another pod's phase, you can use this pod-dependency-init-container that I have built. It checks whether any pod with the given labels is running before starting your pod.

sbt docker:Publish - app crashes but container doesn't

I'm building docker images for my Scala applications using the sbt-native-packager plugin. I noticed that when the process inside a container crashes (log shows Exception in thread "main"... and the process is definitely dead), the container is still "alive":
me#my-laptop$ docker exec 5cca ps
PID TTY TIME CMD
1 ? 00:00:08 java
152 ? 00:00:00 ps
The generated Dockerfile is:
FROM java:openjdk-8-jre
WORKDIR /opt/docker
ADD opt /opt
RUN ["chown", "-R", "daemon:daemon", "."]
USER daemon
ENTRYPOINT ["bin/the-app-name"]
CMD []
where bin/the-app-name is a pretty big auto-generated bash script that gathers all the necessary parameters (classpath, main class name, etc.) and runs the app using the java command. So my guess is that something about this setup makes docker consider the container to be "running" as long as the JVM is running, regardless of my code crashing...
Any idea how I can cause my container to exit when the app crashes?
When running naked pods this behavior is expected, because naked pods are not rescheduled in the event of node failure.
When you deploy the pod, do you set the restartPolicy to "Always", "OnFailure" or "Never"?
The current status of the pod might be "Ok" right now, but this does not necessarily mean that the pod was not restarted before.
Can you run kubectl get po and print the output to check if the pod was restarted or not?
Info on naked pods here: https://kubernetes.io/docs/concepts/configuration/overview/#naked-pods-vs-replication-controllers-and-jobs
More info on restart policy: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle
After some experimenting it looks like there's a thread-leak somewhere that prevents the application from exiting. I'm suspecting it may be coming from the akka ActorSystem but did not find it yet.
Either way, catching the exception on the main thread and calling System.exit(1) causes the java process to die and the container stops.

Amazon ECS two services, one exits, second one never started

this is my compose.yml:
exp_db:
  image: <img>
  cpu_shares: 100
  mem_limit: 362144000
  volumes_from:
    - exp_db_data
exp_db_data:
  image: <img>
  cpu_shares: 100
  mem_limit: 362144000
exp_db is supposed to start up postgres, and exp_db_data is the data-volume container for postgres.
When I want to run the task with:
ecs-cli compose --file compose.yml up
The task is stopped (exit 0). When I inspect the reason why it stopped, it says Essential container in task exited. I'm not sure whether the volume container is supposed to keep running. When using docker-compose on my local machine, everything works as expected. So what am I doing wrong?
I'm fairly new to docker, so I'm probably missing something or misunderstanding some fundamentals.
Thanks
I think this is what was happening here: if the essential parameter of a container is marked as true in the task definition, and that container fails or stops for any reason, all other containers that are part of the task are stopped. If the essential parameter of a container is marked as false, then its failure does not affect the rest of the containers in a task. If this parameter is omitted, a container is assumed to be essential. [1]
All tasks must have at least one essential container.
If you have an application that is composed of multiple containers, you should group containers that are used for a common purpose into components, and separate the different components into multiple task definitions. [2]
You should put multiple containers in the same task definition if:
Containers share a common lifecycle (that is, they should be launched and terminated together).
Containers are required to be run on the same underlying host (that is, one container references the other on a localhost port).
You want your containers to share resources.
Your containers share data volumes.
[1] https://docs.aws.amazon.com/AmazonECS/latest/userguide/task_definition_parameters.html
[2] https://docs.aws.amazon.com/AmazonECS/latest/userguide/application_architecture.html
This happens when you run two or more services in the same task definition. You can change that behaviour, but at least one container should keep your service up and running.
For example, suppose you have two containers:
Container A (must be up and running)
Container B (ignore if it's down)
If Container B is not required by A, and you do not want the task to stop when B goes down, all you need is to set
"essential": false,
in Container B's definition within the task definition.
This worked for me with ECS agent 1.36.2.
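If you are driving this with ecs-cli compose as in the question, one way to set that flag (assuming your ecs-cli version supports the services section of ecs-params.yml) is a sketch like the following, which marks the data-volume container as non-essential so its exit does not stop the task:

version: 1
task_definition:
  services:
    exp_db_data:
      essential: false    # the data container may exit without stopping the task
    exp_db:
      essential: true     # the postgres container keeps the task running

ecs-cli compose picks up ecs-params.yml from the working directory by default, or you can point to it with the --ecs-params flag.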