Standalone MongoDB installation for Production - mongodb

I want to deploy MongoDB to a Kubernetes cluster with 2 nodes; there is no chance of adding another node in the future.
I want to deploy MongoDB as a standalone instance because both nodes will be able to access the same disk space via NFS, and I have no requirements for replication or high availability. However, the MongoDB docs clearly state that a standalone deployment is not suitable for a production environment.
MongoDB Deploy Standalone
You can deploy a standalone MongoDB instance for Cloud Manager to manage. Use standalone instances for testing and development. Do not use these deployments for production systems as they lack replication and high availability.
What kind of drawbacks can I face? Should I deploy a replica set with an arbiter instance instead? If yes, why?

Of course you can deploy a standalone MongoDB for production. But if this node fails, your application is no longer available. If you don't have any availability requirement, go for a standalone MongoDB.
However, running 2 MongoDB services that access the same physical disk (i.e. the same dbPath) will not work. Each MongoDB instance needs a dedicated data folder.
In your case, I would suggest a replica set. All data from one node is replicated to the other one. With only two members, if one node fails the remaining member no longer holds a majority of votes, so it cannot act as primary and the application goes into "read-only" mode.
You can deploy an arbiter instance on the primary node. If the secondary node then goes down, the primary and the arbiter still hold a majority, so the application remains fully available.
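The majority rule behind this advice can be checked with a small calculation. The sketch below is only an illustration in plain Python, not MongoDB code: the node names and the placement of members are assumptions taken from the suggestion above (primary and arbiter on one node, secondary on the other).

```python
def has_majority(total_voters: int, reachable_voters: int) -> bool:
    """A replica set can keep or elect a primary only with a strict majority of votes."""
    return reachable_voters > total_voters // 2


# Hypothetical layout: node-a hosts the primary and the arbiter, node-b hosts the secondary.
VOTERS_BY_NODE = {"node-a": ["primary", "arbiter"], "node-b": ["secondary"]}


def writable_after_failure(failed_node: str) -> bool:
    """Return True if the surviving members still hold a majority of all votes."""
    total = sum(len(members) for members in VOTERS_BY_NODE.values())
    reachable = sum(
        len(members) for node, members in VOTERS_BY_NODE.items() if node != failed_node
    )
    return has_majority(total, reachable)


print(writable_after_failure("node-b"))  # secondary lost: 2 of 3 votes remain
print(writable_after_failure("node-a"))  # primary and arbiter lost: 1 of 3 votes remain
```

With this placement, losing the secondary's node keeps writes available, while losing the node that hosts both the primary and the arbiter leaves the secondary without a majority, so it stays read-only.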

Deploying a replica set is always recommended for production. However, if you deploy a standalone instance and you have 2 Kubernetes nodes, Kubernetes can ensure that there is always 1 running instance attached to the NFS storage on whichever node is available. The risk is that if the data on the storage gets corrupted, you will have nothing to replicate from, unless you take frequent backups and don't mind losing some recently inserted data.

Related

MariaDB Server vs MariaDB Galera Cluster HA Replication

I am planning to deploy an HA database cluster on my Kubernetes cluster. I am new to databases and I am confused by the various database terms. I have decided on MariaDB and I have found two charts, MariaDB and MariaDB Galera Cluster.
I understand that both can achieve the same goal, but what are the main differences between the two? In which scenarios should I use one or the other?
Thanks in advance!
I'm not an expert, so take my explanation with caution (and double check it).
The main difference between the MariaDB Chart and the MariaDB Galera Cluster Chart is that the first one deploys a standard master-slave (or primary-secondary) database, while the second one deploys a resilient master-master (or primary-primary) database cluster.
In more detail, this means the following:
The MariaDB Chart deploys a Master StatefulSet and a Slave StatefulSet which (with default values) spawn one master Pod and 2 slave Pods. Once your database is up and running, you can connect to the master and write or read data, which is then replicated to the slaves, so that you have safe copies of your data available.
The copies can be used to read data, but only the master Pod can write new data to the database. Should that Pod crash, or the Kubernetes cluster node where it is running malfunction, you will not be able to write new data until the master Pod is up and running again (which may require manual intervention) or until you perform a failover, promoting one of the other Pods to be the new temporary master (which also requires manual intervention, or some setup with proxies, virtual IPs and so on).
The Galera Cluster Chart, instead, deploys something more resilient. With default values, it creates a single StatefulSet with 3 Pods, and each of these Pods can both read and write data, acting virtually as a master.
This means that if one of the Pods stops working for whatever reason, the other 2 continue serving the database as if nothing happened, making the whole thing far more resilient. When the Pod that stopped working comes back up, it obtains the new or changed data from the other Pods and gets back in sync.
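Why 3 Pods survive one failure comes down to quorum: the surviving members must still form a strict majority of the cluster. The following is a rough sketch of that arithmetic only, assuming equally weighted members that crash rather than leave gracefully; it is not Galera code and the function name is made up for illustration.

```python
def tolerated_failures(cluster_size: int) -> int:
    """Largest number of crashed members that still leaves a strict majority running."""
    return (cluster_size - 1) // 2


for size in (2, 3, 5):
    print(f"{size} members tolerate {tolerated_failures(size)} failure(s)")
```

Under these assumptions a 2-member cluster tolerates no failures, which is why the chart defaults to 3.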
The resilience comes at a price (it would be too easy if Galera Cluster offered extreme resilience with no drawbacks). A multi-master setup has some cons, the most common being added latency in operations, which is required to keep everything in sync and consistent, and added complexity, which often brings headaches.
There are several other limitations with Galera Cluster, such as explicit table LOCKS not working, or the requirement that all tables declare a primary key. You can find the full list here (https://mariadb.com/kb/en/mariadb-galera-cluster-known-limitations/)
Deciding between the two solutions mostly depends on the following question:
Do you need the database to keep working (and stay usable by your apps) as if nothing happened when one of your Kubernetes cluster nodes fails, even if one of its Pods was running on that particular node?

HA postgresql on kubernetes

I want to deploy PostgreSQL as the database in my Kubernetes cluster. So far I have followed this tutorial.
From reading the whole thing, I understood that we claim static storage before starting PostgreSQL so that we still have the data if the pod fails. We can also do replication by pointing to the same storage space to get our data back.
What happens if we use two worker nodes and the pods containing the database migrate to another node? I don't think local storage will work.
A hostPath volume is not recommended for production usage because the data is tied to a single node: if the pod is rescheduled to another node, the storage does not move with it, and if the node is lost or rebuilt, the data is lost too.
For durable storage, use external block or file storage systems mounted on the nodes using a supported CSI driver.
For HA Postgres, I suggest you explore Postgres Operator, which delivers easy-to-run, highly available PostgreSQL clusters on Kubernetes (K8s), powered by Patroni. It is configured only through Postgres manifests (CRDs) to ease integration into automated CI/CD pipelines with no direct access to the Kubernetes API, promoting infrastructure as code over manual operations.
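To make the durable-storage advice concrete, here is a minimal sketch that builds a PersistentVolumeClaim manifest as a Python dictionary and prints it as JSON, which kubectl accepts as well as YAML. The claim name, size, and storage class name are placeholders, not values from the question; the storage class is assumed to be backed by whatever CSI driver your cluster provides.

```python
import json


def build_pvc(name: str, size: str, storage_class: str) -> dict:
    """Build a PersistentVolumeClaim manifest for a single-writer database volume."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteOnce: the volume is mounted read-write by one node at a time.
            "accessModes": ["ReadWriteOnce"],
            # Placeholder class name; use a class served by your CSI driver.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


pvc = build_pvc("postgres-data", "10Gi", "example-csi-block")
print(json.dumps(pvc, indent=2))
```

The printed JSON could be saved to a file and applied with kubectl apply -f; the pod then references the claim by name instead of a hostPath.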

Is there any way to deploy multi-container application in K8S single node for production?

What I want to do is deploy a multi-container application:
On RHEL
As a Red Hat supportable product (if possible)
On a single-node K8S cluster (bare-metal machine)
I found several options, but I have concerns about them.
minikube, minishift, OKD, CodeReady Containers
First, they run in a VM, but I want to run on the host.
Second, their docs say they are not for production environments.
So, is there any PaaS for a single-node cluster as a production environment?
Docker, Docker-compose
The deployment target OS will probably be RHEL8. I guess it is not a good idea to use Docker because Red Hat products are moving away from it. There is not even a docker rpm for el8 in the RHEL8 repository yet.
My questions are:
Is there any PaaS for a single-node cluster as a production environment?
If not, is docker-compose the best option?
As already mentioned, you should not use a single-node setup in a production environment.
The reason is that if your server goes down, your service is offline. There is nothing to switch to, and nothing that can continue the work that was in progress.
If you still want to set up a single-node Kubernetes cluster, you can do that using kubeadm. I think this is as close to production grade as you can get.
Other than that, as an alternative you can try Installing Kubernetes with Minikube or Install a local Kubernetes with MicroK8s.
It's up to you which one you choose, but remember that this should not run as production. It should be a lab or test environment which, if it works as expected, is then migrated to a production-grade cluster with several nodes.
As for a single-node PaaS, there is Dokku.
Docker powered mini-Heroku. The smallest PaaS implementation you've ever seen.
And if you would consider using a cloud PaaS, you can choose from AWS Cloud9, Azure App Service or Google App Engine.
A single-node cluster is not recommended for production applications. Production apps need scalability, high availability and fault tolerance, and you must have more than one node to get these features.

GitLab HA with Kubernetes and Gluster

I currently have GitLab omnibus set up on Docker. I plan to make it HA by moving it to Kubernetes, with persistence provided by Gluster. I have played around with configuring Kubernetes with Gluster. Now it's time to bring GitLab into Kubernetes. GitLab uses PostgreSQL as the default db.
My question is: to implement HA, should I
a) split GitLab into GitLab application and PostgreSQL container, and then run both (Application and DB) in their own cluster of pods i.e., separate deployments of replicas of GitLab app and PostgreSQL?
OR
b) keep using the omnibus installer and just have replicas of this single, standalone container?
Does it really make any difference whether
1) writes happen to a db cluster exposed via service to the GitLab app
OR
2) writes happen directly to the omnibus GitLab container (which has the db within itself)
I just want to make sure that I don't end up making the setup unnecessarily complex. Having GitLab in Kubernetes along with Gluster already makes things a little complex. So does splitting app and db make sense, or will the omnibus setup suffice? I am concerned about concurrent writes to the db.
According to http://docs.gitlab.com/ce/install/kubernetes/gitlab_omnibus.html#introduction you should use dedicated Redis and PostgreSQL HA clusters. That is, options b) and 1).
For less downtime, it is better to use a PostgreSQL master-slave cluster (https://www.postgresql.org/docs/10/static/different-replication-solutions.html) and a Redis Cluster with master-slave replication (https://redis.io/topics/cluster-tutorial). "Note that the minimal (Redis) cluster that works as expected requires to contain at least three master nodes".
If you rely only on GlusterFS to provide failover for PostgreSQL, you can get errors that require manual repair when one DB instance crashes and another one comes up. For example: How do I fix Postgres so it will start after an abrupt shutdown?

Kubernetes - Persistent storage for PostgreSQL

We currently have a 2-node Kubernetes environment running on bare-metal machines (no GCE) and now we wish to set up a PostgreSQL instance on top of this.
Our plan was to map a data volume for the PostgreSQL Data Directory to the node using the volumeMounts option in Kubernetes. However this would be a problem because if the Pod ever gets stopped, Kubernetes will re-launch it at random on one of the other nodes. Thus we have no guarantee that it will use the correct data directory on re-launch...
So what is the best approach for maintaining a consistent and persistent PostgreSQL Data Directory across a Kubernetes cluster?
One solution is to deploy HA PostgreSQL, for example https://github.com/sorintlab/stolon
Another is to attach some network storage (NFS, GlusterFS) to all nodes and use volumeMounts in the pods.
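For the network-storage option, a rough sketch of an NFS-backed PersistentVolume manifest is shown below, again built as a Python dictionary and printed as JSON for kubectl. The volume name, NFS server address, export path, and size are all placeholders that do not come from the question; a PersistentVolumeClaim would then bind to this volume and be mounted by the PostgreSQL pod on whichever node it lands.

```python
import json


def build_nfs_pv(name: str, server: str, path: str, size: str) -> dict:
    """Build a PersistentVolume manifest backed by an NFS export."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            # One PostgreSQL instance should own the data directory at a time.
            "accessModes": ["ReadWriteOnce"],
            # Keep the data if the claim is deleted.
            "persistentVolumeReclaimPolicy": "Retain",
            # Placeholder server and export path.
            "nfs": {"server": server, "path": path},
        },
    }


pv = build_nfs_pv("postgres-nfs", "nfs.example.internal", "/exports/postgres", "20Gi")
print(json.dumps(pv, indent=2))
```

Because the export is reachable from both nodes, the data directory follows the pod when Kubernetes reschedules it, as long as only one PostgreSQL instance uses it at any time.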