Flyway - Springboot - Kubernetes - kubernetes

Do, still, I need to run the Flywydb image as Kubernetes Job to run database migrations from Springboot deployed in Kubernetes (EKS & Openshift)?
I see some references of Flyway being configured that way. However, in case of Springboot, Flyway migrations are run as part of start cycle.
Am I missing something?

You have multiple options: You can either use flyway as part of your application or you can use a separate container, for example as an initContainer in Kubernetes.
If you feel more comfortable with having flyway as part of your application and run in similarly to your local development environment, just go ahead.

Related

InitContainer or Helm Hook Database Migrations Rollback

I read a lot about best practices applying DataBase Migrations for Kubernetes Apps. Actually there are three common solutions:
Separate CI/CD stage for DB Migrations (usually we run it before deploy new App version but in some cases we can deploy first, it doesn't matter for current question). Just decouple app and migrations, and run them in different stages one by one.
InitContainer: we can run DB migraitions before main app container starts. It is a good solution, but we have to be careful to tune it to run migrations once for all pods and replicas, do not run on pod restarts and check probes (kubelet can kill container on timeout while migration still running)
Separate Kubernetes Job or Helm Hook. Is close to initContainer, cannot share data with app container (but it is not really necessary for DB Migrations, so it's ok). Must be careful with timeout too - Helm can kill Job before migration is completed.
But the question - how to apply rollbacks for these solutions???
Lets try to form my ideas:
Separate stage in CI/CD: we can save previous migration name, and then rollback to it in another stage. Pipeline: Migrations -> Deploy -> Test -> Rollback DB and ReDeploy
But for InitContainer and HelmHook I have to idea how to realise Rollback! Do we need additional containers for that? Does helm rollback affect DB too (don't think so). What are the best practices for that?
I will be very glad to any suggestions!
I started investigating this issue too, and it looks like all manuals, tutorials and practices are meant to be forward-only. Should you ever need to rollback a database migration, you face with the limitations of Helm rollback. There is an open issue in Helm (https://github.com/helm/helm/issues/5825) that addresses this very problem, but it looks like no solution so far.
So probably Helm rollback is not a suitable mechanism for databases as you end up with creating a batch/v1.Job with container image that knows nothing about the changes you need to rollback. I use Liquibase and it requires that the changesets you need to rollback are present in the changelog.
I think for now this problem can be solved with forward-only approach:
Build the container image that introduces a new changeset into your database.
In release values for Helm specify the container image that should be run to perform a migration, and a command (like liquibase update).
If the migration fails and you see that rollback is required, you do not invoke helm rollback. You deploy a new release, and in its values you specify the container image that should perform the database migration rollback, and the command (like liquibase rollback <tag>). I am myself going this way now and it looks like the best solution I could came up with.
Hope that helps.

Mongodb replicaset with init scripts in docker-entrypoint-initdb.d

I'm working on trying to get a MongoDB replicaset deployed into Kubernetes with a default set of collections and data. The Kubernetes piece isn't too pertinent but I wanted to provide that for background.
Essentially in our environment we have a set of collections and data in the form of .js scripts that we currently build into our MongoDB image by copying them into /docker-entrypoint-initdb.d/. This works well in our current use case where we're only deploying MongoDB as a single container using Docker. Along with revamping our entire deployment process to deploy our application into Kubernetes, I need to get MongoDB deployed in a replicaset (with persistent storage) for obvious reasons such as failover.
The issue I've run into and found recognized elsewhere such as this issue https://github.com/docker-library/mongo/issues/339 is that scripts in /docker-entrypoint-initdb.d/ do not run in the same manner when configuring a replicaset. I've attempted a few other things such as running a seed container after the mongo replicaset is initialized, building our image with the collections and data on a different volume (such as /data/db2) so that it persists once the build is finished, and a variety of scripts such as those in the github link above. All of these either don't work or feel very "hacky" and I don't particularly feel comfortable deploying these to customer environments.
Unfortunately I'm a bit limited with toolsets and have not been approved to use a cloud offering like MongoDB Atlas or tooling such as the Enterprise Kubernetes Operator. Is there any real supported method for this use case or is the supported method to use a cloud offering or one of the MondoDB operators?
Thanks in advance!

How to use Docker in the development/deployment workflow?

I'm not sure I completely understand the role of Docker in the process of development and deployment.
Say, I create a Dockerfile with nginx, some database and something else which creates a container and runs fine.
I drop it somewhere in the cloud and execute it to install and configure all the dependencies and environment settings.
Next, I have a repository with a web application which I want to run inside the container I created and deployed in the first 2 steps. I regularly work on it and push the changes.
Now, how do I integrate the web application into the container?
Do I put it as a dependency inside the Dockerfile I create in the 1st step and recreate the container each time from scratch?
Or, do I deploy the container once but have procedures inside Dockerfile that install utils that pull the code from repo by command or via hooks?
What if a container is running but I want to change some settings of, say, nginx? Do I add these changes into Dockerfile and recreate the image?
In general, what's the role of Docker in the daily app development routine? Is it used often if the infrastructure is running fine and only code is changing?
I think there is no singl "use only this" answer - as you already outlined, there are different viable concepts available.
Deployment to staging/production/pre-production
a)
Do I put it as a dependency inside the Dockerfile I create in the 1st step and recreate the container each time from scratch?
This is for sure the most docker`ish way and aligns fully with he docker-philosophy. It is highly portable, reproducible and suites anything, from one container to "swarm" thousands of. E.g. this concept has no issue suddenly scaling horizontally when you need more containers, lets say due to heavy traffic / load.
It also aligns with the idea that only the configuration/data should be dynamic in a docker container, not code / binaries /artifacts
This strategy should be chosen for production use, so when not as frequent deployments happen. If you care about downtimes during container-rebuilds (on upgrade), there are good concepts to deal with that too.
We use this for production and pre-production intances.
b)
Or, do I deploy the container once but have procedures inside
Dockerfile that install utils that pull the code from repo by command
or via hooks?
This is a more common practice for very frequent deployment. You can go the pull ( what you said ) or the push (docker cp / ssh scp) concept, while i guess the latter is preferred in this kind of environment.
We use this for any kind strategy for staging instances, which basically should reflect the current "codebase" and its status. We also use this for smoke-tests and CI, but depending on the application. If the app actually changes its dependencies a lot and a clean build requires a rebuild with those to really ensure stuff is tested as it is supposed to, we actually rebuild the image during CI.
Configuration management
1.
What if a container is running but I want to change some settings of,
say, nginx? Do I add these changes into Dockerfile and recreate the
image?
I am not using this as c) since this is configuration management, not applications deployment and the answer to this can be very complicated, depending on your case. In general, if redeployment needs configuration changes, it depends on your configuration management, if you can go with b) or always have to go a).
E.g. if you use https://github.com/markround/tiller with consul as the backend, you can push the configuration changes into consul, regenerating the configuration with tiller, while using consul watch -prefix /configuration tiller as a watch-task to react on those value changes.
This enables you to go b) and fix the configuration
You can also use https://github.com/markround/tiller and on deployment, e.g. change ENV vars or some kind of yml file ( tiller supports different backends ), and call tiller during deployment yourself. This most probably needs you to have ssh or you ssh on the host and use docker cp and docker exec
Development
In development, you generally reuse your docker-compose.yml file you use for production, but overload it with docker-compose-dev.yml to e.g. mount your code-folder, set RAILS_ENV=development, reconfigurat / mount some other configurations like xdebug or more verbose nginx loggin, whatever you need. You can also add some fake MTA-services like fermata and so on
docker-compose -f docker-compose.yml -f docker-compose-dev.yml up
docker-compose-dev.yml only overloads some values, it does not redefine it or duplicate it.
Depending on how powerful your configuration management is, you can also do a pre-installation during development stack up.
We actually use scaffolding for that, we use https://github.com/xeger/docker-compose and after running it, we use docker exec and docker cp to preinstall a instance or stage something. Some examples are here https://github.com/EugenMayer/docker-sync/wiki/7.-Scripting-with-docker-sync
If you are developing under OSX and you face performance issues due to OSXFS / code shares, you probably want to have a look at http://docker-sync.io ( i am biased though )

Managing resources (database, elasticsearch, redis, etc) for tests using Docker and Jenkins

We need to use Jenkins to test some web apps that each need:
a database (postgres in our case)
a search service (ElasticSearch in our case, but only sometimes)
a cache server, such as redis
So far, we've just had these services running on the Jenkins master, but this causes problems when we want to upgrade Postgres, ES or Redis versions. Not all apps can move in lock step, and we want to run the tests on new versions before committing to move an app in production.
What we'd like to do is have these services provided on a per-job-run basis, each one running in its own container.
What's the best way to orchestrate these containers?
How do you start up these ancillary containers and tear them down, regardless of whether to job succeeds or not?
how do you prevent port collisions between, say, the database in a run of a job for one web app and the database in the job for another web app?
Check docker-compose and write a docker-compose file for your tests.
The latest network features of Docker (private network) will help you to isolate builds running in parallel.
However, start learning docker-compose as if you only had one build at the same time. When confident with this, look further for advanced docker documentation around networking.

Cyclic backups of a docker postgresql container

I would like to deploy an application using docker and would like to use a postgresql container to hold my data.
However I am worried about losing data, so I need back-ups.
I know I could run a cron job on the host to dump the data out from the container, however this approach is not containerized and when I deploy to a new location, I have to remember to add the cronjob.
What is a good , preferably containerized, approach to implement rotating data backups from a postgresql docker container?
Why not deploy a second container that is linked to the PostgreSQL one that does the backups?
It can contain a crontab within, together with instructions on how to upload the backup to Amazon S3, or some other secure storage in the cloud that will not fail even in case of an atomic war :)
Here's some basic information on linking containers: https://docs.docker.com/userguide/dockerlinks/
You can also use Docker Compose in order to deploy a fleet of containers (at least 2, in your case). If your "backup container" uploads stuff to the cloud, make sure you don't put your secrets (such as AWS keys) into the image at build time. Put them into the container at run-time. Here's more information on managing secrets using Docker.