Recreate container on log entry - docker-compose

I have a container that goes into a crash-restart loop, and it does so reliably after the log entry Adapter disconnected, stopping.
Currently the container restarts automatically, but restarting is not enough to break the loop.
If I recreate the container (i.e. docker-compose down && docker-compose up), the problem is resolved.
Is there a recommended way to recreate a container on failure instead of restarting it, particularly after a specific log event?
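For reference, the automatic restart mentioned above is just a restart policy on the service, roughly as in this sketch (the service name and image are made up); a restart policy restarts the existing container but never recreates it:

services:
  adapter:
    image: my-adapter-image   # hypothetical image name
    restart: always           # restarts the same container, never recreates it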

How to start a POD in Kubernetes when another blocks an important resource?

I'm stuck on the configuration of a deployment. The problem is the following.
The application in the deployment uses a database that is stored in a file. While this database is open, it is locked (there is no way for multiple processes to get read/write access).
If I delete the running pod, the new one can't reach the ready state because the database is still locked. I read about the preStop hook and tried to use it, without success.
I could delete the lock file, which seems pretty harsh. What's the right way to solve this in Kubernetes?
This really isn't different from running the process outside of Kubernetes. When the pod is killed, it is given a chance to shut down cleanly, so the lock should be cleaned up. If the lock isn't cleaned up, there is no reliable way to determine whether it remains because of an unclean shutdown, an unhealthy node, or a network partition, so deleting the lock at pod startup seems unwise.
I think the first step for you should be to determine why this lock file isn't getting cleaned up correctly, rather than trying to address the symptom.
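For context, the preStop hook mentioned in the question is a per-container lifecycle handler. A rough sketch (container name, image, and the graceful-shutdown command are hypothetical) that asks the application to shut down cleanly, so it can release its database lock before the container is terminated, could look like this:

spec:
  terminationGracePeriodSeconds: 60     # give the app time to close the database
  containers:
    - name: app
      image: my-app-image               # hypothetical
      lifecycle:
        preStop:
          exec:
            # Ask the application to shut down cleanly so the lock file is
            # released before SIGTERM/SIGKILL arrive.
            command: ["sh", "-c", "/app/shutdown-gracefully.sh"]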

Running a pod/container in Kubernetes that applies maintenance to a DB

I have found several people asking about how to start a container running a DB, then run a different container that runs maintenance/migration on the DB which then exits. Here are all of the solutions I've examined and what I think are the problems with each:
Init Containers - This won't work because these run before the main container is up, and they block the main container from starting until they complete successfully.
Post Start Hook - If the postStart hook could start containers rather than simply exec a command inside the container, then this would work. Unfortunately, the container with the DB does not (and should not) contain the rather large maintenance application required to run it this way; that would violate the principle that each component should do one thing and do it well.
Sidecar Pattern - This WOULD work if the restartPolicy were assignable or overridable at the container level rather than the pod level. In my case the maintenance container should terminate successfully before the pod is considered Running (just like would be the case if the postStart hook could run a container) while the DB container should Always restart.
Separate Pod - Running the maintenance as a separate pod can work, but the DB shouldn't be considered up until the maintenance runs. That means managing the Running state has to be done completely independently of Kubernetes. Every other container/pod in the system will have to do a custom check that the maintenance has run rather than a simple check that the DB is up.
Using a Job - Unless I misunderstand how these work, this would be equivalent to the above ("Separate Pod").
OnFailure restart policy with a Sidecar - This means using a restartPolicy of OnFailure for the POD but then hacking the DB container so that it always exits with an error. This is doable but obviously just a hacked workaround. EDIT: This also causes problems with the state of the POD. When the maintenance runs and stays up and both containers are running, the state of the POD is Ready, but once the maintenance container exits, even with a SUCCESS (0 exit code), the state of the POD goes to NotReady 1/2.
Is there an option I've overlooked or something I'm missing about the above solutions? Thanks.
One option would be to use the Sidecar pattern with two slight changes to the approach you described:
After the maintenance command is executed, you keep the container running with a while : ; do sleep 86400; done command or something similar.
You set an appropriate startupProbe in place that succeeds only once your maintenance command has completed successfully. You could, for example, create a file /maintenance-done and use a startupProbe like this:
startupProbe:
  exec:
    command:
      - cat
      - /maintenance-done
  initialDelaySeconds: 5
  periodSeconds: 5
With this approach you have the following outcome:
Having the same restartPolicy for both your database and sidecar containers works fine thanks to the sleep hack.
Your Pod only becomes ready when both containers are ready. In the sidecar container's case this happens when the startupProbe succeeds.
Furthermore, there will be no noticeable overhead in your pod: even if the sidecar container keeps running, it will consume close to zero resources since it is only running the sleep command.
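Putting both changes together, a minimal sketch of such a pod could look like the following (image names and the migration command are placeholders; in practice this would usually live in a Deployment template):

apiVersion: v1
kind: Pod
metadata:
  name: db-with-maintenance
spec:
  restartPolicy: Always
  containers:
    - name: db
      image: my-db-image                # placeholder
    - name: maintenance
      image: my-maintenance-image       # placeholder
      # Run the migrations, record success, then idle so the container stays up.
      command:
        - sh
        - -c
        - "./run-migrations && touch /maintenance-done && while : ; do sleep 86400; done"
      # The pod only becomes Ready once the migrations have completed.
      startupProbe:
        exec:
          command:
            - cat
            - /maintenance-done
        initialDelaySeconds: 5
        periodSeconds: 5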

Use docker-compose to execute a script right before the container is stopped

Let's say I want to execute a cleanup script whenever container termination is triggered. How do I go about this using docker-compose?
This could be handy for automatically backing up files, databases, etc. for the dev container.
Docker containers are meant to be ephemeral:
By "ephemeral", we mean that the container can be stopped and destroyed, then rebuilt and replaced with an absolute minimum set up and configuration.
Building on this concept, Docker itself does not offer anything to hook into the shutdown process, and docker-compose, which is built on top of Docker, does not add such functionality either.
Maybe you can rethink your problem the Docker way to better fit its intended use. Without further context it is hard to say what a good solution would be, but maybe one of the following approaches helps you out:
docker stop sends a SIGTERM signal to the main process in the container. You could use a custom entrypoint or supervisor process that triggers the appropriate actions on SIGTERM; this approach requires custom containers. With the stop_signal attribute you can also configure a custom signal to be sent in your docker-compose.yml (see the sketch after this list)
if you just want to persist data files from the containers, configuring the right volumes might be enough
you could use docker events to listen for and act upon events emitted by the Docker daemon
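As a rough illustration of the first option, this compose sketch (service name, image, main process, and /backup.sh are all hypothetical) traps the stop signal in a small wrapper so a cleanup script runs before the main process exits; the doubled $$ is compose's escape for a literal $:

services:
  dev:
    image: my-dev-image            # hypothetical
    stop_signal: SIGTERM           # the default; shown here because it can be customised
    stop_grace_period: 30s         # time allowed for the cleanup before the kill
    # Wrapper that runs /backup.sh when the stop signal arrives.
    command: >
      sh -c 'trap "/backup.sh; exit 0" TERM;
      my-main-process & wait $$!'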

sbt docker:Publish - app crashes but container doesn't

I'm building docker images for my Scala applications using the sbt-native-packager plugin. I noticed that when the process inside a container crashes (log shows Exception in thread "main"... and the process is definitely dead), the container is still "alive":
me#my-laptop$ docker exec 5cca ps
PID TTY TIME CMD
1 ? 00:00:08 java
152 ? 00:00:00 ps
The generated Dockerfile is:
FROM java:openjdk-8-jre
WORKDIR /opt/docker
ADD opt /opt
RUN ["chown", "-R", "daemon:daemon", "."]
USER daemon
ENTRYPOINT ["bin/the-app-name"]
CMD []
where bin/the-app-name is a pretty big auto-generated bash script that gathers all the necessary parameters (classpath, main class name, etc.) and runs the app using the java command. So my guess is that something about this setup makes docker consider the container to be "running" as long as the JVM is running, regardless of my code crashing...
Any idea how I can cause my container to exit when the app crashes?
When running naked pods this behavior is expected, because naked pods are not rescheduled in the event of node failure.
When you deploy the pod, do you set the restartPolicy to "Always", "OnFailure" or "Never"?
The current status of the pod might be "Ok" right now, but this does not necessarily mean that the pod was not restarted before.
Can you run kubectl get po and print the output to check if the pod was restarted or not?
Info on naked pods here: https://kubernetes.io/docs/concepts/configuration/overview/#naked-pods-vs-replication-controllers-and-jobs
More info on restart policy: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle
After some experimenting, it looks like there's a thread leak somewhere that prevents the application from exiting. I suspect it may be coming from the Akka ActorSystem, but I haven't found it yet.
Either way, catching the exception on the main thread and calling System.exit(1) causes the Java process to die and the container to stop.

Is there a way to cancel a handler after it was notified?

I've got an environment consisting of a stack of dockerized microapps, where some are dependent on others; they are linked to each other and communicate over HTTP on the Docker interface. My problem was that docker-compose tracked only the docker-compose.yml file and recreated containers only when that file had changed.
With Ansible I can finally track config files that get mounted as volumes inside the containers, so they can be deployed from templates, which works fantastically.
Before Ansible I used to run:
docker-compose stop <app> && docker-compose rm -f <app> && docker-compose up -d
to refresh a single app when I knew a mounted file had changed and the volumes needed to be refreshed.
I've defined multiple roles using the docker_service module, one for each app, each with its own handler that, when notified, runs the command above to refresh that particular app.
The problem is that when multiple apps have their mounted files changed, Ansible notifies every handler and each one gets executed. That is not what I need: when the primary container (on which the others depend) gets recreated, the others don't need their handlers to run because they have already been recreated along with it, yet their handlers still execute. So my question is: is there a way to cancel a handler that has been notified? I know about flush_handlers, but that just executes the notified handlers, which is not what I need.
You can use conditionals in handlers.
Use a flag variable to indicate that some handlers shouldn't execute.
- name: restart myapp1
  shell: docker ...
  when: not block_apps_restart
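A slightly fuller sketch of that idea, assuming the handlers live in one shared file so their definition order is predictable (handler names, app names, and the block_apps_restart variable are all hypothetical): since handlers run in the order they are defined, a small set_fact handler defined first can raise the flag whenever the primary app's config changes, and the dependent handlers then skip themselves.

# handlers/main.yml (sketch)
- name: block dependent restarts        # notify this together with "restart primary"
  set_fact:
    block_apps_restart: true

- name: restart primary
  shell: docker-compose stop primary && docker-compose rm -f primary && docker-compose up -d

- name: restart myapp1
  shell: docker-compose stop myapp1 && docker-compose rm -f myapp1 && docker-compose up -d
  # default(false) keeps the condition valid when the primary was not recreated.
  when: not (block_apps_restart | default(false))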