Kubernetes - application-consistent backup - kubernetes

Is it possible to perform a backup on Kubernetes in application-consistent manner?
Some of the backup solutions I found mainly base on freezing the pod and then launching the backup to maintain consistency (Heptio's Ark for example.)

The idea of application-consistent backups is to capture all data in memory and all transactions in process. This is performed by using some type of client software co-resident with the database application to quiesce the database application, flush its memory cache, complete all its writes in order and then perform the backup.
In its turn, Kubernetes operates with specifications of resources (e.g., Deployments, Services, etc.) and their statuses, and in any given time the resource status must be the same as defined in the specification. For storing any important data in Kubernetes, persistent volumes are used. In other words, you cannot perform a backup in an application-consistent manner on Kubernetes, because the main idea of it is different.
It is possible that a specific application for a specific database exists and allows implementing such type of backup. But it is related to that application but not to Kubernetes itself.

Related

Do I gain anything by using "proper" replicas for a read-only MongoDB database?

I have a web-app that depends on a read-only MongoDB database. Through trial and error, I discovered that by far the fastest way to run the ETL pipeline that populates the database is to run a local copy of MongoDB, populate the database, stop the database, and tarball the state directory.
To deploy a high-availability "cluster," I create multiple instances (or containers) running the app, each with access to a copy of the state in locally mounted storage. Putting these behind a load balancer with regular health checks and autoscaling (or in a Kubernetes cluster as a ReplicaSet), I get isolation, redundancy, easy rollbacks (using versioned storage), and easy setup in virtually any environment.
The key idea here is that because the database is read-only, it is in a sense a "stateless" application. Thus, I can treat it like any other static provider of information
There are many apparent advantages to this setup. Nevertheless, I have always had a nagging feeling that I was missing something. Given a read-only context, is there still some reason why it might be better to run a "proper" MongoDB cluster?
If you don't mind outages when the single node goes down and you don't mind taking the system down during upgrades then this is probably an ok deployment. You might get a safer dump and restore using mongodump and mongorestore rather than tar but apart from that this setup should work for a read-only deployment.

Microservice Application ... Docker Volume for Databases or no Docker Volume?

I have an application (JHipster Gateway, UAA, Registry, 5 microservices) and each application source builds a Docker image and pushes to GitLab registry. Currently I'm running everything on Rancher using a Docker-Compose file. My volumes for Mongo databases are currently in each container.
I need advice about volume mounts. Here are my options as I see them.
Leave data in containers and monitor and backup
Use external mounts and monitor volumes on host.
If I leave Mongo data in the containers, do I just set up to just cluster and when the internal volumes fill, the database just scales? I am looking for some explanation to help my choice with Mongo database mounts, internal or external (on host)?
Thanks in advance,
David L. Whitehurst
Never store any data you care about directly in containers. There are good arguments in favor of both named volumes (native to Docker, some support in a multi-host Swarm environment, fewer host-specific dependencies) and host bind mounts (much easier to back up and maintain, possible to examine directly if needed) but use some sort of mounted storage.
The most important note here is that it's fairly routine to delete and recreate containers. If the software you're running or its underlying library stack has a security issue, you generally need to get (or build) an updated image, delete your existing container, and rebuild it against the new image. If data is stored only inside a container, then during this very routine delete-and-recreate operation, there's significant risk of losing data.
In principle, if you're really careful, and you have a replicated data store, you can roll this over without external volumes and not lose data. It's tricky, and takes a lot of patience; you'll be forced to take down one replica, wait for its data to be rebalanced across the other replicas, start up a new replica, wait for it to accept some of the data, and so on. If you can take a point release by stopping a container, deleting it, starting a new one with the same data store, and have it come up instantly with populated data, that's much easier to manage.
(The other corollary here is that you don't "back up containers", since they don't have any data you care about. You do back up the data stored on the host or in Docker named volumes, and you can always recreate the container from its image plus the external data.)

What are the challenges or problems in Backing up and restoring Open source softwares like openstack and kubernetes?

I would like to understand the existing problems in the field of openstack and kubernetes with respect to backup and restore. Any link or reference to any research related matter would also be helpful.
my USD$0.02: There are several projects that I know of which attempt to backup k8s clusters at the metadata level
We focus on etcd backups, since our risk is wider than just the k8s descriptors, but as the adage goes: "they're not backups until you've tested them," which I (unfortunately) have not made the time to try
One also needs to exercise caution because any cluster state backups will contain cleartext Secret descriptors; so any backup solution needs to be aware of encryption or to skip over those descriptors

In a containerized cluster, should mongodb servers be running on a worker or a core service?

I'm trying to implement an architecture that's similar to the coreos's production architecture (shown below)
Should I run the database as a central service or one or more of the workers?
I figured the database needs some kind of replication, which makes me think that putting it in the worker cluster makes more sense, but I'm just not sure.
This should be run as a worker. The central services are the basic things that come with CoreOS (mainly etcd). The workers host your applications, the database being one of them. You do have a persistence issue because your database will have state to remember between restarts. So, there is a bigger issue of how do you make that persistence? One was to do it is use a host file and give the database an affinity to that host and mount the host file. Another thing you might consider is running more than one database (if your db technology supports that) and replicate that database so you have two (or more) copies in different workers. (non-affinity). If your database creates transaction logs that can be applied to a backup, you can manage those transaction logs in a worker.
Another thing to consider is not using a container for your database. The database is a weird animal, its care and feeding is not like the rest of the applications. So it is reasonable (in my opinion) to have your database managed and maintained outside the scope of your cluster (but still reachable by the cluster).

Persistent storage for Apache Mesos

Recently I've discovered such a thing as a Apache Mesos.
It all looks amazingly in all that demos and examples. I could easily imagine how one would run for stateless jobs - that fits to the whole idea naturally.
Bot how to deal with long running jobs that are stateful?
Say, I have a cluster that consists of N machines (and that is scheduled via Marathon). And I want to run a postgresql server there.
That's it - at first I don't even want it to be highly available, but just simply a single job (actually Dockerized) that hosts a postgresql server.
1- How would one organize it? Constraint a server to a particular cluster node? Use some distributed FS?
2- DRBD, MooseFS, GlusterFS, NFS, CephFS, which one of those play well with Mesos and services like postgres? (I'm thinking here on the possibility that Mesos/marathon could relocate the service if goes down)
3- Please tell if my approach is wrong in terms of philosophy (DFS for data servers and some kind of switchover for servers like postgres on the top of Mesos)
Question largely copied from Persistent storage for Apache Mesos, asked by zerkms on Programmers Stack Exchange.
Excellent question. Here are a few upcoming features in Mesos to improve support for stateful services, and corresponding current workarounds.
Persistent volumes (0.23): When launching a task, you can create a volume that exists outside of the task's sandbox and will persist on the node even after the task dies/completes. When the task exits, its resources -- including the persistent volume -- can be offered back to the framework, so that the framework can launch the same task again, launch a recovery task, or launch a new task that consumes the previous task's output as its input.
Current workaround: Persist your state in some known location outside the sandbox, and have your tasks try to recover it manually. Maybe persist it in a distributed filesystem/database, so that it can be accessed from any node.
Disk Isolation (0.22): Enforce disk quota limits on sandboxes as well as persistent volumes. This ensures that your storage-heavy framework won't be able to clog up the disk and prevent other tasks from running.
Current workaround: Monitor disk usage out of band, and run periodic cleanup jobs.
Dynamic Reservations (0.23): Upon launching a task, you can reserve the resources your task uses (including persistent volumes) to guarantee that they are offered back to you upon task exit, instead of going to whichever framework is furthest below its fair share.
Current workaround: Use the slave's --resources flag to statically reserve resources for your framework upon slave startup.
As for your specific use case and questions:
1a) How would one organize it? You could do this with Marathon, perhaps creating a separate Marathon instance for your stateful services, so that you can create static reservations for the 'stateful' role, such that only the stateful Marathon will be guaranteed those resources.
1b) Constraint a server to a particular cluster node? You can do this easily in Marathon, constraining an application to a specific hostname, or any node with a specific attribute value (e.g. NFS_Access=true). See Marathon Constraints. If you only wanted to run your tasks on a specific set of nodes, you would only need to create the static reservations on those nodes. And if you need discoverability of those nodes, you should check out Mesos-DNS and/or Marathon's HAProxy integration.
1c) Use some distributed FS? The data replication provided by many distributed filesystems would guarantee that your data can survive the failure of any single node. Persisting to a DFS would also provide more flexibility in where you can schedule your tasks, although at the cost of the difference in latency between network and local disk. Mesos has built-in support for fetching binaries from HDFS uris, and many customers use HDFS for passing executor binaries, config files, and input data to the slaves where their tasks will run.
2) DRBD, MooseFS, GlusterFS, NFS, CephFS? I've heard of customers using CephFS, HDFS, and MapRFS with Mesos. NFS would seem an easy fit too. It really doesn't matter to Mesos what you use as long as your task knows how to access it from whatever node where it's placed.
Hope that helps!