How does a Kubernetes replication controller handle data?

If I set up a replication controller for something like a database, how does it keep the data in the replicas in sync? If one of the replicas goes down, how does it bring it back up with the latest data?

A replication controller ensures that the desired number of pods with the same template are kept running in the system. The replication controller itself does not know anything about what it is running, and doesn't have any special hooks for containers running databases. This means that if you want to run a container with a database with more than one replica, then it is easiest to run a database that can natively do replication and discovery (possibly with the injection of some environment variables).
An alternative is to run a pod with two containers, where one container is a vanilla database, and the second "side-car" container is used to implement the necessary replication / synchronization / master election or whatever extra functionality you need to provide to make the database run in a clustered environment. This is more flexible (you can run a database that wasn't initially designed to run in a clustered environment) but also requires more custom work to make it scale.
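As a rough sketch of that side-car layout (the database image and the replication agent below are placeholders, not a real implementation), the replication controller template might look something like this:

```yaml
# Sketch only: a pod template with a vanilla database container plus a side-car
# container that would handle replication / synchronization / master election.
apiVersion: v1
kind: ReplicationController
metadata:
  name: mydb
spec:
  replicas: 3
  selector:
    app: mydb
  template:
    metadata:
      labels:
        app: mydb
    spec:
      containers:
      - name: db
        image: mydb:latest              # placeholder for a vanilla database image
        ports:
        - containerPort: 5432
      - name: replication-sidecar
        image: my-replication-agent     # hypothetical side-car providing clustering logic
```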

Related

Expressing that a service requires another

I'm new to k8s, so this question might be kind of weird, please correct me as necessary.
I have an application which requires a redis database. I know that I should configure it to connect to <redis service name>.<namespace> and the cluster DNS will get me to the right place, if it exists.
It feels to me like I want to express the relationship between the application and the database. Like I want to say that the application shouldn't be deployable until the database is there and working, and maybe that it's in an error state if the DB goes away. Is that something you'd normally do, and if so - how? I can think of other instances: like with an SQL database you might need to create the tables your app wants to use at init time.
Is the alternative to try to connect early and exit 1, so that the cluster keeps on retrying? Feels like that would work but it's not very declarative.
Design for resiliency
Modern applications and Kubernetes are (or should be) designed for resiliency. The applications should be designed without a single point of failure and be resilient to changes in e.g. network topology. Also see the Twelve-Factor App: IV. Backing services.
This means that your Redis typically should be a cluster of e.g. 3 instances. It also means that your app should retry failed connections - not only at startup, but at any time while running - since upgrades of a cluster (or a rolling upgrade of an app) are done by terminating one instance at a time while a new instance is launched in its place. E.g. the instance (of a cluster) that your app is currently connected to might go away, and your app needs to reconnect, perhaps establishing a connection to a different instance in the same cluster.
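To make that last point concrete, here is roughly what a rolling upgrade looks like in a Deployment manifest (the name and image are illustrative); pods are replaced one at a time, so clients must expect their current connection to go away:

```yaml
# Illustrative Deployment: during an upgrade, at most one pod is terminated while at
# most one new pod is launched, which is exactly when clients need to reconnect.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # terminate at most one instance at a time
      maxSurge: 1         # launch at most one new instance at a time
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: myapp:2.0   # placeholder image/tag
        ports:
        - containerPort: 8080
```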
SQL Databases and schemas
I can think of other instances: like with an SQL database you might need to create the tables your app wants to use at init time.
Yes, this is a different case. On Kubernetes your app is typically deployed with at least 2 replicas, or more (for high-availability reasons). You need to consider that when managing schema changes for your app. Common tools to manage the schema are Flyway or Liquibase, and they can be run as Jobs. E.g. first launch a Job to create your DB tables and after that deploy your app. After some weeks you might want to change some tables and launch a new Job for this schema migration.
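A hedged sketch of such a migration Job using the official Flyway image (the image tag, JDBC URL and the db-credentials secret are placeholders for your own setup):

```yaml
# Sketch: run schema migrations as a Job before deploying the app.
# The migration SQL is assumed to be baked into a derived image or mounted at /flyway/sql.
apiVersion: batch/v1
kind: Job
metadata:
  name: schema-migration
spec:
  backoffLimit: 3
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: flyway
        image: flyway/flyway:9            # official Flyway image; version is an example
        args: ["migrate"]
        env:
        - name: FLYWAY_URL
          value: jdbc:postgresql://mydb:5432/app   # placeholder JDBC URL
        - name: FLYWAY_USER
          valueFrom:
            secretKeyRef:
              name: db-credentials        # hypothetical secret holding DB credentials
              key: username
        - name: FLYWAY_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
```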
As you've seen, YAML objects cannot express such dependencies. As suggested by #fabian-lopez, your application pod may include an initContainer that waits for dependencies to be available before the main container starts.
Now, if you want a state machine capable of provisioning a database, initializing its schema, maybe importing some records, and only then creating your application: you're looking for an operator. Then you may use the operator-sdk ( https://github.com/operator-framework/operator-sdk ), or pretty much anything that integrates with the Kubernetes cluster API.
I think Init Containers is something you could leverage for this use case
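For example, a minimal sketch of that idea (the service name redis, the namespace, and the app image are assumptions): an initContainer blocks until the Redis service name resolves, and only then does the main container start. A stricter check could probe the actual port instead of just DNS.

```yaml
# Sketch: the initContainer loops until the redis Service is resolvable in cluster DNS,
# so the application container only starts once its dependency exists.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      initContainers:
      - name: wait-for-redis
        image: busybox:1.28
        command: ['sh', '-c',
          'until nslookup redis.default.svc.cluster.local; do echo waiting for redis; sleep 2; done']
      containers:
      - name: app
        image: myapp:latest   # placeholder application image
```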
This is up to your application code; it's not something Kubernetes helps or hinders.

Pgbouncer: how to run within a kubernetes cluster properly

The background: I currently run some Kubernetes pods with a pgbouncer sidecar container. I've been running into annoying behavior with sidecars (that will be addressed in k8s 1.18) that has workarounds, but it has brought up a more basic question about running pgbouncer inside k8s.
Many folks recommend the sidecar approach for pgbouncer, but I wonder why running one pgbouncer per, say, machine in the k8s cluster wouldn't be better? I admit I don't have a deep enough understanding of either pgbouncer or k8s networking to understand the implications of either approach.
EDIT:
Adding context, as it seems like my question wasn't clear enough.
I'm trying to decide between two approaches of running pgbouncer in a kubernetes cluster. The PostgreSQL server is not running in this cluster. The two approaches are:
Running pgbouncer as a sidecar container in all of my pods. I have a number of pods: some replicas on a webserver deployment, an async job deployment, and a couple cron jobs.
Running pgbouncer as a separate deployment. I'd plan on running 1 pgbouncer instance per node on the k8s cluster.
I worry that (1) will not scale well. If my PostgreSQL master has a max of 100 connections, and each pool has a max of 20 connections, I potentially risk saturating connections pretty early. Additionally, I risk saturating connections on the master during pushes, as new pgbouncer sidecars come up alongside the old ones that are being removed.
I, however, almost never see (2) recommended. It seems like everyone recommends (1), but the drawbacks seem quite obvious to me. Would the networking penalty I'd incur by connecting to pgbouncer outside of my pod be large enough to notice? Is pgbouncer perhaps smart enough to deal with many other pgbouncer instances that could potentially saturate connections?
We run pgbouncer in production on Kubernetes. I expect the best way to do it is use-case dependent. We do not take the sidecar approach, but instead run pgbouncer as a separate "deployment", and it's accessed by the application via a "service". This is because for our use case, we have 1 postgres instance (i.e. one physical DB machine) and many copies of the same application accessing that same instance (but using different databases within that instance).
Pgbouncer is used to manage the active connections resource. We are pooling connections independently for each application because the nature of our application is to have many concurrent connections and not too many transactions. We are currently running with 1 pod (no replicas) because that is acceptable for our use case if pgbouncer restarts quickly.
Many applications all run their own pgbouncers, and each application has multiple components that need to access the DB (so each pgbouncer is pooling connections of one instance of the application). It is done like this: https://github.com/astronomer/airflow-chart/tree/master/templates/pgbouncer
The above does not include getting the credentials set up right for accessing the database; the linked template expects a secret to already exist. I expect you will need to adapt the template to your use case, but it should help you get the idea.
We have had some production concerns. Primarily, we still need to do more investigation on how to replace or move pgbouncer without interrupting existing connections. We have found that the application's connection to pgbouncer is stateful (of course, because it's pooling the transactions), so if the pgbouncer container (pod) is swapped out behind the service for a new one, existing connections are dropped from the application's perspective. This should be fine even when running pgbouncer replicas, provided your application retries the occasional dropped connection and you make use of Kubernetes sticky sessions on the "service". More investigation is still required by our organization to make it work perfectly.
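For orientation, here is a rough sketch of the "separate deployment plus service" layout described above (the image, the pgbouncer-config secret, and the ports are placeholders; the linked chart is the more complete reference). The sessionAffinity setting is the "sticky sessions" mentioned above.

```yaml
# Sketch: pgbouncer as its own Deployment, reached via a Service with client-IP affinity.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pgbouncer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pgbouncer
  template:
    metadata:
      labels:
        app: pgbouncer
    spec:
      containers:
      - name: pgbouncer
        image: edoburu/pgbouncer:latest   # a community pgbouncer image, used as an example
        ports:
        - containerPort: 5432
        volumeMounts:
        - name: config
          mountPath: /etc/pgbouncer       # assumed to contain pgbouncer.ini and userlist.txt
      volumes:
      - name: config
        secret:
          secretName: pgbouncer-config    # assumed to already exist
---
apiVersion: v1
kind: Service
metadata:
  name: pgbouncer
spec:
  selector:
    app: pgbouncer
  sessionAffinity: ClientIP               # keep a given client pinned to one pgbouncer pod
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800               # example affinity timeout
  ports:
  - port: 5432
    targetPort: 5432
```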

MongoDB data replication in Kubernetes

I've been configuring pods in Kubernetes to hold a mongodb and golang image each, with a service to load-balance. The major issue I am facing is data replication between databases. Replication controllers/ReplicaSets do not seem to do what the name implies; they create blank-slate copies rather than replicas of existing/currently running pods. I cannot seem to find any examples or clear answers on how Kubernetes addresses this, or whether it even does.
For example, data insertions being sent by the Go program are going to automatically load balance to one of X replicated instances of mongodb by the service. This poses problems since they will all be maintaining separate documents without any relation to one another once Kubernetes begins to balance the connections among other pods. Is there a way to address this in Kubernetes, or does it require a complete re-write of the Go code to expect data replication among numerous available databases?
Sorry, I'm relatively new to Kubernetes and couldn't seem to find much information regarding this.
You're right, a Kubernetes ReplicaSet is not a replica of another container; it's just a container with the same configuration spun up within the same logical unit.
A ReplicaSet (or Deployment, which is the resource you should be using now) will have multiple pods, and it's up to you, the operator, to configure the mongodb part.
I would recommend reading this example of how to set up a replica set with multiple mongodb containers:
https://medium.com/google-cloud/mongodb-replica-sets-with-kubernetes-d96606bd9474#.e8y706grr
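As a hedged sketch of one common way to do this today (not necessarily what the linked article does; the image, names, and replica set name are illustrative): a headless Service plus a StatefulSet gives each mongod pod a stable DNS name (mongo-0.mongo, mongo-1.mongo, ...) that you can then use when initiating the MongoDB replica set yourself. Persistent volumes are omitted for brevity.

```yaml
# Sketch: headless Service + StatefulSet so each mongod gets a stable hostname
# that can be added to the MongoDB replica set configuration.
apiVersion: v1
kind: Service
metadata:
  name: mongo
spec:
  clusterIP: None          # headless: per-pod DNS records instead of load-balancing
  selector:
    app: mongo
  ports:
  - port: 27017
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo
  replicas: 3
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
      - name: mongod
        image: mongo:4.4                               # example version
        command: ["mongod", "--replSet", "rs0", "--bind_ip_all"]
        ports:
        - containerPort: 27017
```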

In a containerized cluster, should mongodb servers be running on a worker or a core service?

I'm trying to implement an architecture that's similar to CoreOS's production architecture (shown below).
Should I run the database as a central service or on one or more of the workers?
I figured the database needs some kind of replication, which makes me think that putting it in the worker cluster makes more sense, but I'm just not sure.
This should be run as a worker. The central services are the basic things that come with CoreOS (mainly etcd). The workers host your applications, the database being one of them. You do have a persistence issue, because your database will have state to remember between restarts. So there is a bigger question: how do you provide that persistence? One way to do it is to use a host volume: give the database an affinity to that host and mount the host volume. Another thing you might consider is running more than one database (if your db technology supports that) and replicating that database so you have two (or more) copies in different workers (anti-affinity). If your database creates transaction logs that can be applied to a backup, you can manage those transaction logs in a worker.
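If the workers are running Kubernetes, the "pin the database to one host and mount a host directory" idea expressed as a pod spec looks roughly like this (the hostname, image, and paths are placeholders):

```yaml
# Sketch: keep the database on the worker that holds its data directory.
apiVersion: v1
kind: Pod
metadata:
  name: postgres
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-1      # affinity to the node that holds the data
  containers:
  - name: postgres
    image: postgres:13                    # example database image
    volumeMounts:
    - name: data
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: data
    hostPath:
      path: /srv/postgres-data            # directory on the worker's filesystem
      type: DirectoryOrCreate
```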
Another thing to consider is not using a container for your database. The database is a weird animal, its care and feeding is not like the rest of the applications. So it is reasonable (in my opinion) to have your database managed and maintained outside the scope of your cluster (but still reachable by the cluster).

Migrating MongoDB instances with no down-time

We are using MongoDB in a production environment and now, due to some issues with the current servers, I'm going to change the server and start a new MongoDB instance.
We have a replica set and a single mongod instance (two different MongoDB networks for different purposes). Now, first I should migrate the single mongod instance and then the whole replica set to the new server.
What I want to know is, how can I migrate both instances with no down-time? I don't want to shutdown the server or stop write operations.
Thanks in advance.
So first of all you should never run mongodb as a single instance for production. At a minimum you should have 1 primary, 1 secondary and 1 arbiter.
Second, even with a replica set you will always have a bit of write downtime when you switch primaries, as writes are not possible during the election process. From the docs:
IMPORTANT: Elections are essential for independent operation of a replica set; however, elections take time to complete. While an election is in process, the replica set has no primary and cannot accept writes. MongoDB avoids elections unless necessary.
Elections are going to occur when for example you bring down the primary to move it to a new server or virtual instance, or upgrade the database version (like going from 2.4 to 2.6).
You can keep downtime to a minimum with an existing replica set by setting the appropriate options to allow queries to run against secondaries. Again from the docs:
Maintaining availability during a failover. Use primaryPreferred if you want an application to read from the primary under normal circumstances, but to allow stale reads from secondaries in an emergency. This provides a "read-only mode" for your application during a failover.
This takes care of reads at least. Writes are best dealt with by having your application retry failed writes, or queue them up.
Regarding your standalone instance, the documented procedure for converting it to a replica set is well tested and can be completed very quickly with minimal downtime:
http://docs.mongodb.org/manual/tutorial/convert-standalone-to-replica-set/
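The core of that procedure is restarting the standalone mongod with a replica set name and then initiating the set. The relevant mongod.conf fragment (YAML format; the set name is an example) is roughly:

```yaml
# Fragment of mongod.conf. Restart the standalone with this setting, then run
# rs.initiate() from the mongo shell and rs.add() the new members.
replication:
  replSetName: rs0   # example name; must be identical on every member you add
```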
You cannot have zero downtime (a new mongod will run on a new IP, so you need to at least reconnect to it). But you can minimize downtime by creating a geographically distributed replica set.
Please read: http://docs.mongodb.org/manual/tutorial/deploy-geographically-distributed-replica-set/
Use the given process, but please note:
Do not set priority 0 on the instances at the New Location, so that they can become primary when the old ones at the Old Location step down.
You still need to restart mongod in replica set mode at the Old Location.
You need 3 instances, including an arbiter, at the New Location if you want it to be a replica set.
When the complete data is in sync with the instances at the New Location, step down the instances at the Old Location (one by one). Now everything will go to the New Location, but the problem is that traffic is directed through a distant mongod.
So stop the mongod at the Old Location and start a new one at the New Location, then connect your applications to the New Location mongod.
Note: I have not done this myself so far. I had planned it once, but then I ran into a problem (not with the hosting provider) and didn't go through with it. In practice you may hit some issues.
A replica set is the feature provided by the MongoDB database to achieve high availability and automatic failover.
It is kind of a traditional master-slave configuration, but with the capability of automatic failover.
It is basically a group/cluster of mongod instances which communicate and replicate to each other to provide high availability and automatic failover.
Basically, a replica set can contain a minimum of 2 and a maximum of 12 mongod instances.
In a replica set several types of servers exist; out of all of them, one server is always the primary.
http://blog.ajduke.in/2013/05/31/setup-mongodb-replica-set-in-4-steps/
John's answer is right. Btw, in your case you have no way to avoid downtime entirely; you can just try to make it as short as possible.
You can prepare the new replica set and save its configuration.
Same for the single mongod instance: prepare a js file with its specific configuration (i.e. stuff going on in the admin database).
Disable client connections on the production servers.
Copy the datafiles from the old servers to the new ones (http://docs.mongodb.org/manual/core/backups/#backup-with-file-copies).
Apply your previously saved replica set config and the configuration prepared above.
Done.
You can use different approaches, such as adding a hidden secondary member to the replica set if you have a lot of data, so you can wait until it is up to date before stopping the production server. Basically, for the replica set you have many ways to handle a migration; with the single instance, instead, you don't have such options.