MongoDB instance alone vs MongoDB on Kubernetes

What is the better way to run a MongoDB instance? Let's say we have a running Kubernetes cluster. MongoDB has its own clustering/sharding solution. We expect this database to grow quite large, so we definitely need to use its sharding.
How does this fit into Kubernetes? It seems to me they don't really work well together. Kubernetes "clones" pods across nodes, while the point of MongoDB sharding is to split data across a cluster (not to clone it). Am I wrong about something here?
Thank you for your input.

We have been running a sharded cluster of Mongo and ES in production with TBs of data for 3 years and have never faced replication or other scaling issues.
I would suggest checking out this official link from MongoDB: https://docs.mongodb.com/kubernetes-operator/master/tutorial/deploy-sharded-cluster/
Using this you can deploy a sharded MongoDB cluster on Kubernetes.
Basically, MongoDB has its own operator, which operates and manages the pods for you whenever data replication, scaling, etc. is required.
You need to install the operator first, before you set up the cluster: https://docs.mongodb.com/kubernetes-operator/master/tutorial/install-k8s-operator/
If you just want to use a Helm chart, check out this one: https://hub.kubeapps.com/charts/bitnami/mongodb-sharded
If you want to set up everything on your own, you can refer to: https://medium.com/google-cloud/sharded-mongodb-in-kubernetes-statefulsets-on-gke-ba08c7c0c0b0
One of my favorite channels: https://www.youtube.com/watch?v=7Lp6R4CmuKE
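The distinction the question draws between "cloning" and sharding can be illustrated with a toy sketch in plain Python. It is not how MongoDB implements hashed sharding; `shard_for`, the MD5-based hash, the `user_id` key and the three-shard layout are all assumptions made for illustration. Each document is routed to exactly one shard by a hash of its shard key, while replication (which MongoDB applies within each shard) copies that shard's whole data set to every member.

```python
import hashlib

NUM_SHARDS = 3          # hypothetical shard count
REPLICAS_PER_SHARD = 3  # hypothetical replica set size per shard


def shard_for(shard_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key to one shard index using a stable hash (toy routing)."""
    digest = hashlib.md5(shard_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribute(docs, num_shards=NUM_SHARDS):
    """Sharding: split the documents so each lives on exactly one shard."""
    shards = [[] for _ in range(num_shards)]
    for doc in docs:
        shards[shard_for(doc["user_id"], num_shards)].append(doc)
    return shards


def replicate(shard_docs, replicas=REPLICAS_PER_SHARD):
    """Replication: every member of a shard's replica set holds a full copy."""
    return [list(shard_docs) for _ in range(replicas)]


docs = [{"user_id": f"user-{i}"} for i in range(12)]
shards = distribute(docs)
cluster = [replicate(shard_docs) for shard_docs in shards]

# Data is partitioned across shards, then cloned within each shard.
for index, members in enumerate(cluster):
    print(f"shard {index}: {len(members[0])} docs x {len(members)} copies")
```

In this picture the two mechanisms are complementary: sharding decides which shard owns a document, and replication keeps several copies of each shard.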

Related

MongoDB Atlas Replica Set

I am using MongoDB as my primary database with the Mongoose ODM. I need database transactions, and the only way to achieve transactions in MongoDB is to use a replica set. I was able to do this in development using the run-rs package. However, in production, according to the MongoDB docs, using replica sets requires that I set up Kubernetes objects with Atlas. I am just a beginner with containers and orchestration. I tried learning about Docker; is it possible to set up a service that runs a replica set from Docker? And is there an easier way to set up a replica set with MongoDB Atlas?
Thanks.
You can set up a one node replica set for development and testing.
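A one-node replica set is initiated with a configuration document that lists a single member. The sketch below builds such a document in Python; the set name `rs0` and the address `localhost:27017` are assumptions, and the `mongod` process is assumed to have been started with a matching `--replSet rs0` option. The `initiate` helper assumes the `pymongo` driver is installed and is deliberately not called here.

```python
def single_node_config(set_name="rs0", host="localhost:27017"):
    """Build a replica set configuration with exactly one member."""
    return {"_id": set_name, "members": [{"_id": 0, "host": host}]}


def initiate(config, uri="mongodb://localhost:27017/?directConnection=true"):
    """Send replSetInitiate to a running mongod (requires pymongo)."""
    from pymongo import MongoClient  # imported lazily; assumed to be installed

    client = MongoClient(uri)
    return client.admin.command("replSetInitiate", config)


# Only the configuration is built here; no server is contacted.
config = single_node_config()
print(config)
```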

What is the difference between a cluster and a replica set in mongodb atlas?

I'm taking the MongoDB University M103 course, where they gave a brief overview of what a cluster and a replica set are.
From my understanding, a cluster is a set of servers or nodes, while a replica set is a set of servers or nodes that all have a replication mechanism built in, for access during downtime and faster read operations.
From that it seems that a replica set is a specific type of cluster, but my confusion arises from MongoDB Atlas. In MongoDB Atlas one has to create a cluster; is that a replica set as well?
Are those terms interchangeable in all scenarios?
Replica Set
In MongoDB, a replica set is a group of MongoDB servers that work together to provide redundancy (all the servers in the replica set hold the same data) and high availability (if a server goes down, the remaining servers can still fulfil requests). For a production replica set, the recommended minimum is 3 servers. There is one primary (reads and writes), and the remaining members are called secondaries (reads only).
MongoDB Atlas Cluster
Atlas is a DBaaS, meaning a database-as-a-service offering. It removes the burden of maintaining, scaling and monitoring MongoDB servers yourself, so that you can focus on your applications.
An Atlas MongoDB cluster is a set of configuration options you give to Atlas so it can set up the MongoDB servers for you. Hence, a MongoDB replica set is one of the deployment types an Atlas cluster can be.
For example, while creating an Atlas cluster, you are asked whether you want a replica set, a sharded cluster, etc., as well as which cloud provider to deploy to, your backup policy, the specs of your MongoDB hardware, and more.
The keyword here is configuration. At the end of the day, you will have your MongoDB servers (replica set or not) up and ready.
Summary
MongoDB Cluster
A specific configuration of a set of MongoDB servers to provide specific features, e.g. replication and sharding.
MongoDB Replica Set
A MongoDB cluster set up to provide redundancy and high availability, typically with an odd number of servers, 3 or more (3, 5, 7, etc.).
MongoDB Atlas Cluster
A high-level MongoDB cluster configuration that lets you set up a replica set or another type of MongoDB cluster, along with its location and performance range.
I would suggest you play with their web console. You will definitely see the difference.
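The preference for an odd number of servers follows from how a primary is elected: a candidate needs votes from a majority of the voting members. A minimal sketch of that arithmetic (the function names are hypothetical, and every member is assumed to vote):

```python
def majority(voting_members: int) -> int:
    """Votes needed to elect a primary: more than half of the voting members."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members: int) -> int:
    """Members that can be lost while a majority can still be formed."""
    return voting_members - majority(voting_members)


for n in (3, 4, 5, 6, 7):
    print(f"{n} members: majority {majority(n)}, "
          f"tolerates {fault_tolerance(n)} failure(s)")
```

Going from 3 to 4 members raises the required majority from 2 to 3 but still tolerates only one failure, so an even-sized set adds cost without adding availability.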

mongoDB cluster structure and costs

I am new to MongoDB and I use Kubernetes to provision it.
I understand that a MongoDB deployment is divided into config servers, mongos routers and data nodes.
The config servers and the data nodes each require 3 replicas.
The data nodes are divided into shards.
This gives me a very large number of running MongoDB instances, and a big portion of them will not be doing useful work, as they are just replicas.
Is there a way to run multiple replicas on the same instance?
Is it recommended to run multiple replicas on the same instance?
How do you manage your MongoDB cluster to avoid ending up with dozens of MongoDB instances where only a portion of them are actually usable?
Any help would be appreciated.
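For a sense of scale, the number of processes in the topology described above can be estimated with simple arithmetic. This is only a sketch: the function name is hypothetical, the 3 replicas per shard and 3 config servers come from the question, and the count of 2 mongos routers is an assumption.

```python
def mongo_process_count(shards, members_per_shard=3, config_servers=3,
                        mongos_routers=2):
    """Total MongoDB processes in a sharded cluster with replicated shards."""
    data_nodes = shards * members_per_shard
    return data_nodes + config_servers + mongos_routers


for shards in (1, 2, 4, 8):
    total = mongo_process_count(shards)
    print(f"{shards} shard(s): {shards * 3} data nodes, {total} processes in total")
```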

Can I do multi-master replication for MongoDB? A reference architecture with Kubernetes is preferred

I have a use case where a write- and read-intensive application uses MongoDB as its backend. We are planning to implement a federated Kubernetes deployment of MongoDB with a multi-master architecture (how can this be done?). I am looking for suggestions on reference architectures or solutions, if any, that REALLY worked with Federation and active DB replication.
This doesn't directly answer your question, but I know KubeDB provides extensive database deployments within Kubernetes: https://kubedb.com/docs/0.9.0/guides/mongodb/

MongoDB data replication in Kubernetes

I've been configuring pods in Kubernetes to each hold a MongoDB and a Golang image, with a Service to load-balance. The major issue I am facing is data replication between the databases. Replication controllers/ReplicaSets do not seem to do what the name implies; rather than replicating existing, currently running pods, they create blank-slate copies. I cannot find any examples or clear answers on how Kubernetes addresses this, or whether it does at all.
For example, data insertions sent by the Go program are automatically load-balanced by the Service to one of X replicated instances of MongoDB. This poses a problem, since once Kubernetes begins to balance the connections among the pods, they will all maintain separate documents without any relation to one another. Is there a way to address this in Kubernetes, or does it require a complete rewrite of the Go code to expect data replication among numerous available databases?
Sorry, I'm relatively new to Kubernetes and couldn't find much information about this.
You're right, a Kubernetes ReplicaSet does not replicate another container's data; it just spins up containers with the same configuration within the same logical unit.
A ReplicaSet (or a Deployment, which is the resource you should be using now) will have multiple pods, and it's up to you, the operator, to configure the MongoDB part.
I would recommend reading this example of how to set up a replica set with multiple MongoDB containers:
https://medium.com/google-cloud/mongodb-replica-sets-with-kubernetes-d96606bd9474#.e8y706grr
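Once the MongoDB replica set itself is configured, one way to avoid spreading writes across unrelated databases is to have the application connect to the replica set members directly rather than through a load-balanced Service, so the driver can find the primary. The sketch below only builds such a connection string. It assumes a StatefulSet named `mongo` behind a headless Service named `mongo` in the `default` namespace, a replica set called `rs0` and a database called `test`; none of these names come from the question or the linked article.

```python
def member_hosts(statefulset="mongo", service="mongo", namespace="default",
                 replicas=3, port=27017):
    """Stable per-pod DNS names that a headless Service gives StatefulSet pods."""
    return [
        f"{statefulset}-{i}.{service}.{namespace}.svc.cluster.local:{port}"
        for i in range(replicas)
    ]


def connection_uri(hosts, replica_set="rs0", database="test"):
    """Connection string that lists every member and names the replica set."""
    return f"mongodb://{','.join(hosts)}/{database}?replicaSet={replica_set}"


hosts = member_hosts()
uri = connection_uri(hosts)
print(uri)
```

With a URI of this shape the driver, not the Kubernetes Service, decides which member receives writes, so the application code does not need to handle replication itself.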