When to not use StatefulSets? - mongodb

CONTEXT: I have been learning Kubernetes and trying to get some hands-on experience. I have been using AKS to abstract away the complexity of dealing with the control plane (and because I have a free student Azure account). I am deploying a NodeJS app that connects to a MongoDB database. So far the deployment has been successful, but I am using MongoDB Atlas and connecting to it.
Based on the little I have learned about StatefulSets, the MongoDB Atlas service seems a lot easier and more convenient, but my question is: when would it be a better idea to deploy a StatefulSet with the MongoDB database running in the pods? Which is more cost-effective? Which is more easily scalable?
I realize the questions might be a little vague, but I am just getting started with Kubernetes.
Disclaimer: This is not a production application, just something simple I am using to learn K8s.

The official docs use a StatefulSet, and that makes sense. Generally, database applications are deployed as StatefulSets, because replicas can end up in states where they are out of sync with each other, and that would create data inconsistencies between nodes (MongoDB nodes, not Kubernetes nodes).
You can deploy MongoDB as a Deployment; I have seen it done. But most clients connect using a connection string that lists multiple node addresses, and since Kubernetes exposes StatefulSets through headless services, which give each pod a stable DNS name, you should be okay.
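To make the connection-string point concrete, here is a minimal sketch of how the per-pod DNS names behind a headless service could be assembled into a MongoDB URI. The StatefulSet name, service name, namespace, replica set name, and the `cluster.local` cluster domain are all assumptions for illustration; substitute whatever your cluster actually uses.

```python
def statefulset_mongo_uri(statefulset, service, namespace, replicas,
                          replica_set="rs0", port=27017, database="",
                          cluster_domain="cluster.local"):
    """Build a MongoDB URI from the stable pod DNS names of a StatefulSet.

    Pods in a StatefulSet behind a headless service are addressable as
    <statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster_domain>.
    All default values here are illustrative assumptions.
    """
    hosts = [
        f"{statefulset}-{i}.{service}.{namespace}.svc.{cluster_domain}:{port}"
        for i in range(replicas)
    ]
    return f"mongodb://{','.join(hosts)}/{database}?replicaSet={replica_set}"


if __name__ == "__main__":
    # Hypothetical names: StatefulSet "mongo", headless service "mongo-svc".
    print(statefulset_mongo_uri("mongo", "mongo-svc", "default", 3))
```

Because the pod names are stable across restarts, the same string keeps working when a pod is rescheduled, which is the property a plain Deployment does not give you.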

For learning purposes, I advise you to deploy your MongoDB in a StatefulSet. That way you can learn how it works and what problems you might encounter with this Kubernetes object.
For a production application, I advise never deploying a database in a StatefulSet if you don't need to. A StatefulSet comes with a lot of problems that you might not want to manage.
Sometimes, though, company rules forbid hosting data on storage owned by an external company.
To know whether you need to put your database in a StatefulSet, the questions I try to answer are:
Should my DB be hosted on premise (for privacy)?
Should my DB be scalable?
Should my DB be updated frequently?
You can find a list of pros and cons in the documentation.

Related

Mongodb replicaset with init scripts in docker-entrypoint-initdb.d

I'm working on trying to get a MongoDB replicaset deployed into Kubernetes with a default set of collections and data. The Kubernetes piece isn't too pertinent but I wanted to provide that for background.
Essentially in our environment we have a set of collections and data in the form of .js scripts that we currently build into our MongoDB image by copying them into /docker-entrypoint-initdb.d/. This works well in our current use case where we're only deploying MongoDB as a single container using Docker. Along with revamping our entire deployment process to deploy our application into Kubernetes, I need to get MongoDB deployed in a replicaset (with persistent storage) for obvious reasons such as failover.
The issue I've run into and found recognized elsewhere such as this issue https://github.com/docker-library/mongo/issues/339 is that scripts in /docker-entrypoint-initdb.d/ do not run in the same manner when configuring a replicaset. I've attempted a few other things such as running a seed container after the mongo replicaset is initialized, building our image with the collections and data on a different volume (such as /data/db2) so that it persists once the build is finished, and a variety of scripts such as those in the github link above. All of these either don't work or feel very "hacky" and I don't particularly feel comfortable deploying these to customer environments.
Unfortunately I'm a bit limited with toolsets and have not been approved to use a cloud offering like MongoDB Atlas or tooling such as the Enterprise Kubernetes Operator. Is there any real supported method for this use case, or is the supported method to use a cloud offering or one of the MongoDB operators?
Thanks in advance!
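For the seed-container idea mentioned in the question, a minimal sketch of the two decisions such a job has to make is shown below: what replica set configuration to submit, and whether seeding should run at all. This is an assumption-laden illustration, not an officially supported method; the helper names and host names are hypothetical. With PyMongo, the config could be passed to `client.admin.command("replSetInitiate", config)`, and the inputs to `should_seed` could come from `client.admin.command("hello")` and `db.list_collection_names()`.

```python
def replica_set_config(name, hosts):
    """Return a replSetInitiate-style config document for the given hosts."""
    return {
        "_id": name,
        "members": [{"_id": i, "host": h} for i, h in enumerate(hosts)],
    }


def should_seed(hello_reply, existing_collections):
    """Seed only on the writable primary, and only if the database is empty.

    hello_reply is the dict returned by the `hello` command;
    existing_collections is the list of collection names already present.
    """
    return bool(hello_reply.get("isWritablePrimary")) and not existing_collections


if __name__ == "__main__":
    # Hypothetical pod host names for a three-member replica set.
    hosts = [f"mongo-{i}.mongo:27017" for i in range(3)]
    print(replica_set_config("rs0", hosts))
    print(should_seed({"isWritablePrimary": True}, []))
```

Checking for existing collections keeps the job idempotent, so rerunning it after a pod restart does not load the default data twice.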

Deploying Strapi on Kubernetes (GKE)

I want to deploy Strapi on GKE (Kubernetes). I have a docker-compose file, and I think I can use kompose to create the deployment.
My question is: has anyone used MongoDB Atlas + GKE, or should I deploy Mongo on my own?
The question is more opinion based. It all depends on your needs.
If your needs match any of the following, you should stay with running MongoDB yourself:
Your app runs on-prem and contracts or privacy statements don't allow you to store data with a 3rd party.
You need large storage but not much query power.
There are other privacy/compliance issues.
Your app does not have internet access (firewalls, isolated environments).
You are running 3rd party applications that require a very old version of MongoDB.
Here are some MongoDB Atlas advantages:
Easily deploy, modify, and elastically scale your database clusters with a few clicks or an API call.
Gain complete visibility into the performance of the database and the underlying instances.
Focus more on development, with built-in operational and security best practices such as geographically distributed, auto-healing clusters, and always-on authentication and encryption.
The best approach is to see for yourself what working with MongoDB Atlas on GCP looks like. You can check this tutorial.

Can I do multi master replication for Mongo DB? Any reference architecture with Kubernetes is more expected in this question

I have a use case where we have a write- and read-intensive application using MongoDB in the backend. We are planning to implement a federated K8s deployment for MongoDB with a multi-master architecture (how do we do this?). I am looking for suggestions on reference architectures or solutions, if any, that REALLY worked with Federation and active DB replication.
This doesn't directly answer your question, but I know KubeDB provides extensive database deployments within K8s. https://kubedb.com/docs/0.9.0/guides/mongodb/

Best way to deploy MongoDB to Google Cloud Platform?

Been working on a web app with a simple database model that only needs CRUD operations, and figured MongoDB would be perfect for it. The most important constraint of the project is that it be able to scale from a small number of users to a large number. I've been looking at the Cloud Launcher and I've noticed that the most popular MongoDB solution advertises a cost of ~$350/mo. This is a surprisingly large amount that makes me consider using Cloud SQL for my database instead. Is there a better way to deploy MongoDB to GCP that's more fitted to my use case? I've been reading about automatic scaling with Kubernetes but I can't find anything about price. Any and all advice is greatly appreciated.
I haven't used MongoDB with Kubernetes, but we do use the Cloud Launcher solution at work. We use 2 nodes (n1-standard-1) and an arbiter (micro) + 100GB storage on each node, which comes to around $100 a month. You would need a replica set in a production environment, so this seems to be a reasonable base cost.
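As a small illustration of why the two-data-nodes-plus-arbiter layout above is a sensible minimum, the sketch below computes the voting majority a replica set needs to elect a primary and how many voting members it can lose. The function names are made up for this example; the arithmetic is the usual strict-majority rule.

```python
def majority(voting_members):
    """Smallest number of votes that is a strict majority."""
    return voting_members // 2 + 1


def tolerated_failures(voting_members):
    """How many voting members can be down while a primary can still be elected."""
    return voting_members - majority(voting_members)


if __name__ == "__main__":
    # 2 data nodes + 1 arbiter = 3 voting members.
    for n in (2, 3):
        print(n, majority(n), tolerated_failures(n))
```

With only two voting members, losing either one leaves no majority, so the cheap arbiter is what lets the set survive a single node failure.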
For MongoDB, unlike for a web server, Kubernetes does not provide a lot of advantages over a classic GCE deployment. Setting up a replica set on Kubernetes is a bit more work than the GCE setup. https://medium.com/google-cloud/mongodb-replica-sets-with-kubernetes-d96606bd9474 and http://blog.kubernetes.io/2017/01/running-mongodb-on-kubernetes-with-statefulsets.html should serve as decent references, but they wouldn't lower your costs. Scaling nodes would be slightly easier, but that does not strictly translate to scaling MongoDB.
I have lately been working on a similar solution.
GCP announced that they don't charge for Kubernetes cluster management but only for resources used by it (instances, network ...):
https://cloud.google.com/kubernetes-engine/pricing
In general, databases are high maintenance (data mounts, backups, migrations...), so I would not start running Mongo on Kubernetes right away. You could get there but it will be more complicated than deploying your web app on Kubernetes.
It is better to use a MongoDB-as-a-service offering that supports GCP (e.g. MongoDB Atlas). I have done so myself and have seen a few other companies do the same.
If you scale gradually you should be able to control your costs.
The web app itself should be easy to deploy and maintain on Kubernetes.

MongoDB on Azure worker role

I'm developing an application using SignalR to manage websockets and allow my clients to talk to each other.
I'm planning to host this back office on an Azure worker role. As my SignalR requests carry data that is usually saved in the database, I'm wondering whether NoSQL's MongoDB, instead of the classic SQL Server/Entity Framework pairing, would be a good approach.
Since most of my application's data will be strings, I think MongoDB will be a reliable and performant solution, and it will allow me to get rid of the Azure SQL database costs.
For information, the Azure worker role will be running on a machine with the following hardware: 1 core CPU, 3.5GB RAM and 50GB SSD storage.
Do you think I'm off to a good start with this architecture?
Thanks
Do you think I'm off to a good start with this architecture?
In a word, no.
A user asked a similar question regarding running Redis on Worker Roles - Setting up Redis on Azure cloud service worker role - all of the content on that Q/A is relevant in the MongoDb context.
I'd suggest that you read my answer as it goes into more detail, but as an overview of why this is a bad architectural approach:
You cannot guarantee when a Worker Role will be restarted by the Azure Fabric Controller.
In a real-world implementation of Mongo, you would run multiple nodes within a cluster. With a single Worker Role (as you have suggested in your question), this won't be possible.
You will need to manage your MongoDb installation within the Worker Role and they simply aren't designed for this.
If you are really set on using Mongo, I would suggest that you use a hosted solution such as MongoLabs (as suggested in earlier answers), or consider hosting it on Azure IaaS VMs.
If you are not fixed on using Mongo, I would sincerely suggest that you look at Azure DocumentDb (also suggested above), Microsoft's Azure NoSQL offering - I have used it in several production systems already and it is certainly a capable NoSQL solution; granted, it may not have all of the features available with MongoDb.
If you are looking at a NoSQL solution for caching of data (i.e. not long term storage), I would suggest you take a look at Azure Redis Cache, which is a very capable Redis offering.
Azure has its own native NoSQL document database called DocumentDB. Have you had a look at it? If I were you, I would use DocumentDB unless you have special requirements that you have not mentioned; from the little requirement information you have posted, DocumentDB would do just fine. I don't think that it is quite similar to MongoDB in terms of basic functionality; see this article for a comparison between Azure DocumentDB and MongoDB.