scalable meanjs on digitalocean - mongodb

I'm trying to learn a deployment process that can guarantee a headackeless scaling of a meanjs application (not in the level that people do it in big companies, but also not at a hobby level).
So as long as I understood, this could be a solution to work on:
Having mongodb on digitalocean on Ubuntu
Having the meanjs application (all other than mongodb) in a docker
Then one can scale! Because mongodb could be clustered separately and docker keeps the scaling of the application easy.
Well, I know it sounds trivial and that's why I'm asking here: I just want to go and learn docker and want to be sure before investing time on the above assumed solution!
Do you think this guarantee an easy scaling, say, for a simple online multiplayer game on meanjs please? Thank you.

UPDATE 31/07/2018
Digital Ocean introducing Kubernates which does all the orchestration they have also released load balancer which I think will work well with kubernates
==============
There is no off the shelf solution.
You can use docker with swamp but for small deployment it brings additional issues of monitoring and networking.
So here is what I did:
Create a script to generate HAProxy config when you start/stop instance
Have mongo in a cluster or replica or whatever. Database usually does not need to be scaled dynamically. You just have single mongo server then you scale it up and when you can't scale it vertically anymore you scale it horizontally by creating replica set and then scale it up until you can't then you do sharding.
So have HAProxy as load balancer that accepts connections on port 80 and forwards to your droples oven private network.
You can also write scripts to use DO API to create an image with your deployment and fire it up once you have more traffic either dynamically by detecting response time or cpu load or whatever other metric you have or statically.
I hope this helps.

Related

When to not use StatefulSets?

CONTEXT: I have been learning Kubernetes and trying to get some hands-on experience. I have been using AKS to abstract the complexity of having to deal with the control plane (and because I have a free student azure account). I am deploying a NodeJS app that connects to the MongoDB database. So far the deployment has been successful but I am using MongoDB Atlas and connecting to it.
Based on the little I have learned about Stateful sets, the MongoDB Atlas service seems a lot easier and more convenient but my question is, when would it be a better idea to consider deploying a stateful set with MongoDB database? (running on the pod) What's more cost-effective? More easily scalable?
I realize the questions might be a little bit vague but I am just getting started with Kubernetes..
disclaimer: This is not a production application, just something simple I am using to learn K8S
Official docs docs uses statefullset and that would make sense. Generally all DB kind of applications deployed as statefullset. Because there can be states that nodes are not sync with each other and that would create data inconsistencies between nodes(mongodb nodes not kubernetes).
You can deploy MongoDB as deployment. I have seen it deployed. But most clients use a connection string to connect(a string of multiple node addresses). And since kubernetes exposes statefullsets with headless services you should be okay.
For learning purpose, I advice you to deploy your MongoDB in a StatefulSet. Then you can learn how it works and what problem you could encounter with this Kubernetes object.
For production application, I advice to never deploy a database in a StatefulSet if you don't need it. In fact, StatefulSet will come with a lot of problematics that you might not need to manage.
Sometimes, companies rules restrict to host their data on external company storage.
To know if you need to put your database in a StatefulSet, the question I try to answer is:
Should my DB be hosted on premise (for privacy)?
Should my DB be scalable?
Should my DB be updated frequently?
You can find a list of pros/cons on the documentation.

Mongo database in GCP app engine

I'm currently looking into GCP app engine and I was figuring out how I would deploy a very large application with multiple services. I also wanted to use mongodb. GCP docs say that app engine allows dockerfiles and images. What would happen if I used the mongo docker image as a service on app engine? How would it scale it's instances? What will happen to consistency? I'm aware GCP have a third party solution for mongo, but since they allow docker images, what stops me from using it?
App Engine routinely tears down and creates new instances. If your instance is running MongoDB, then all the data stored in that instance will be lost.
This is why Google Cloud offers other, permanent places to store state, like Datastore and CloudSQL. You can also run MongoDb yourself on Google Compute Engine.
What would happen if I used the mongo docker image as a service on app engine?
Flexible App Engine allows you to use docker images to build your own application, as per is mentioned on this document [1]: "App Engine flexible environment instances are Compute Engine virtual machines, which means that you can take advantage of custom libraries, use SSH for debugging, and deploy your own Docker containers."
So there is no problem to use your own docker image in flexible app Engine.
How would it scale it's instances?
Each active version in App Engine must have at least one instance to handle requests, there are two ways to scale the instance in App Engine: automatic and manual.
As per is mentioned on the document[2]:
Automatic scaling creates instances based on request rate, response latencies, and other application metrics. You can specify thresholds for each of these metrics, as well as a minimum number instances to keep running at all times.
Manual scaling specifies the number of instances that continuously run regardless of the load level. This allows tasks such as complex initializations and applications that rely on the state of the memory over time.
The way you can configure these features is through the app.yaml file, I suggest you read this document[3]
What will happen to consistency?
Since App Engine scaling can be configured depending on its load, this allows for good performance in service execution and provides consistency in operations and optimization of resources.
[1] https://cloud.google.com/appengine/docs/flexible#features
[2] https://cloud.google.com/appengine/docs/flexible/go/how-instances-are-managed#instance_scaling
[3] https://cloud.google.com/appengine/docs/flexible/go/configuring-your-app-with-app-yaml

Best way to deploy MongoDB to Google Cloud Platform?

Been working on a web app with a simple database model that only needs CRUD operations, figured MongoDB would be perfect for it. The most important constraints of the project is that it be able to scale from a small amount of users to a large amount. I’ve been looking at the cloud launcher and I’ve noticed that the most popular MongoDB solution advertises a cost of ~$350/mo. This is a surprisingly large amount that makes me consider using cloud sql for my database instead. Is there a better way to deploy MongoDB to GCP that’s more fitted to my use case? I’ve been reading about automatic scaling with kubernetes but I can’t find anything about price. Any and all advice is greatly appreciated
I haven't used mongodb with kubernetes but we do use the cloud launcher solution at work. We use 2 nodes(n1-standard-1) and an arbiter(micro) + 100GB storage on each node which comes up around $100 a month. You would need a replicaset in a production environment so this seems to be a reasonable base cost.
Kubernetes does not provide a lot of advantages over the classic GCE deployment for mongodb compared to a webserver. Setting up a replicaset on kubernetes is a bit more work compared to GCE setup. https://medium.com/google-cloud/mongodb-replica-sets-with-kubernetes-d96606bd9474 and http://blog.kubernetes.io/2017/01/running-mongodb-on-kubernetes-with-statefulsets.html should serve as decent references but wouldn't lower your costs. Scaling nodes would be slightly easier though but does not strictly translate to scaling mongodb.
I have lately been working on a similar solution.
GCP announced that they don't charge for Kubernetes cluster management but only for resources used by it (instances, network ...):
https://cloud.google.com/kubernetes-engine/pricing
In general, databases are high maintenance (data mounts, backups, migrations...), so I would not start running Mongo on Kubernetes right away. You could get there but it will be more complicated than deploying your web app on Kubernetes.
Better to use MongoDB as a service that supports GCP (e.g. MongoDB Atlas), I have done so myself and see a few other companies do that.
If you scale gradually you should be able to control your costs.
The web app itself should be easy to deploy and maintain on Kubernetes.

Does it make sense Service Fabric in a single machine?

Service Fabric looks great but right now, I do not have enough demand to hire 5 machines (I think it is the minimum number of nodes of a cluster).
I was thinking to install Service Fabric SDK on a single Azure Virtual Machine.
I know that I will not have the main benefits of a service fabric application: reliability and scalability, but I will be developing in a framework that I can easily can hire more machines and to scale if it is necessary in the future without changing anything.
Right now, I have 15 microservices and I plan to add 10 more. At the present I am using IIS and deployment and maintenance is not too fast. It seems that Service Fabric could solve it, plus it would be easily scalabe
Does it make sense to use Service Fabric in a single machine? or better to keep under IIS.
Technically it is possible, though it doesn't make much sense. The one node cluster, runs with a special configuration and so, scale out of that cluster is not supported. You can use a single node cluster for testing and then create another one for production use.

Akka cluster and OpenShift

I'm new to Akka Clusters, however as I am understanding its documentation, I need to know at least one "seed node" to join an existing cluster.
So when using clusters with OpenShift I would need to know if the current gear is the first node - then I would create a new cluster - or if there are already some other gears around - I would need to know at least one of their IPs to join them.
Is this possible with OpenShift cloud? (I'm using the DIY catridge, so customizing the start up script wouldn't be a problem. However I can't find any environment variable which provides me relevant data.)
DIY gears on OpenShift Online do not scale. And if you are spinning up separate applications for each of the nodes in your cluster, you are going to (probably) run into inter-gear communication issues. You might need to create your own akka cartridge (http://docs.openshift.org/origin-m4/oo_cartridge_developers_guide.html), then you can set your own scaling options. You might check out this cartridge (https://github.com/smarterclayton/openshift-redis-cart) which supports scaling and might give you some ideas about how to implement yours.