Everything inside one docker container or specialized containers - deployment

I have been reading about Docker for a long time and have tried a few examples. While going through blogs, I didn't find a consensus on whether a product with multiple components (JEE deployables, database, web server, etc.) should be deployed in one single container or in separate containers. Can someone please provide a detailed answer in terms of:
Manageability
Complexity
Risk (like data loss, security etc.)
Any other points are welcome
Also, is it worth going the Kubernetes route, or is Docker alone still sufficient?

There are alternative views (which may work well depending on your use case), but the official Docker stance is one process per container. In my experience, you'll fit into the Docker ecosystem and reuse things more effectively if you go with the grain on that one. That said, alternative solutions such as passenger-docker exist and may suit some use cases.

Related

How can I compactly store a shared configuration with Kubernetes Kustomize?

First, I'm not sure this question is specific enough for Stack Overflow. Happy to remove or revise if someone has any suggestions.
We use Kubernetes to orchestrate our server side code, and have recently begun using Kustomize to modularize the code.
Most of our backend services fit nicely into that data model. For our main transactional system we have a base configuration that we overlay with tweaks for our development, staging, and different production flavors. This works really well and has helped us clean things up a ton.
We also use TensorFlow Serving to deploy machine learning models, each of which is trained and at this point deployed for each of our many clients. The only way that these configurations differ is in the name and metadata annotations (e.g., we might have one called classifier-acme and another one called classifier-bigcorp), and the bundle of weights that are pulled from our blob storage (e.g., one would pull from storage://models/acme/classifier and another would pull from storage://models/bigcorp/classifier). We also assign different namespaces to segregate between development, production, etc.
From what I understand of the Kustomize system, we would need a different base and set of overlays for every one of our customers if we wanted to encode the entire state of our current cluster in Kustomize files. That seems like a huge number of directories, as we have many customers. If we have 100 customers and five different deployment environments, that's 500 directories, each with a kustomization.yaml file.
Is there a tool or technique to encode this repeating with Kustomize? Or is there another tool that will work to help us generate Kubernetes configurations in a more systematic and compact way?
You can have more complex overlay structures than a straight matrix. For example, one app could have apps/foo-base, plus apps/foo-dev and apps/foo-prod that both list ../foo-base in their bases; those in turn are pulled in by overlays/us-prod, overlays/eu-prod, and so on.
But if every combo of customer and environment really does need its own setting then you might indeed end up with a lot of overlays.
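If you do end up with one overlay per customer and environment, the files are regular enough to generate rather than write by hand. Here is a minimal sketch in Python, assuming a layout of overlays/<environment>/<customer>/ with a shared base/ directory at the repository root; the layout, the customer and environment lists, and the `customer` annotation key are all illustrative assumptions, not something Kustomize prescribes.

```python
from itertools import product
from pathlib import Path

# Hypothetical inputs; in practice these might come from a customer registry.
CUSTOMERS = ["acme", "bigcorp"]
ENVIRONMENTS = ["development", "staging", "production"]


def render_kustomization(customer: str, environment: str) -> str:
    """Return the text of one overlay's kustomization.yaml."""
    return "\n".join([
        "apiVersion: kustomize.config.k8s.io/v1beta1",
        "kind: Kustomization",
        "resources:",
        "  - ../../../base",            # assumed location of the shared base
        f"namespace: {environment}",    # one namespace per environment
        f"nameSuffix: -{customer}",     # e.g. classifier -> classifier-acme
        "commonAnnotations:",
        f"  customer: {customer}",      # hypothetical annotation key
        "",
    ])


def render_all(customers, environments) -> dict:
    """Map each overlay file path to its rendered contents."""
    return {
        f"overlays/{env}/{cust}/kustomization.yaml": render_kustomization(cust, env)
        for cust, env in product(customers, environments)
    }


def write_all(root: Path, files: dict) -> None:
    """Write the rendered overlays under the given root directory."""
    for relative_path, text in files.items():
        target = root / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text)
```

Calling `write_all(Path("."), render_all(CUSTOMERS, ENVIRONMENTS))` would create one directory per combination. This only covers the name, namespace, and annotation differences; pointing each deployment at its own weights (such as storage://models/acme/classifier) would still need a patch whose shape depends on how the base Deployment is written.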

Where is the best practice place to put kubernetes monitoring?

Is it best practice to place monitoring tools like Prometheus and Grafana inside a Kubernetes cluster or outside a Kubernetes cluster?
I can see the case for either. It seems very easy to place it inside the cluster. But that seems less robust.
People typically do run it inside the cluster, most likely because everything else in their environment or app already runs under K8s. In the bigger picture, if you have use cases beyond one specific app, it likely makes sense to run monitoring on separate infrastructure. The reason is that Prometheus doesn't support clustering. You can write to two instances, but there is no real HA plan for this product. To me, that's a major problem for a monitoring technology.
Most organizations that use this tool heavily find that it doesn't cover the use cases that APM (transaction tracing and diagnostics) handles. Additionally, you'll need to deploy an ELK/Splunk-type stack, so it gets very complex. They also find it difficult to manage and will often look towards Datadog, SignalFx, Sysdig, or another system that can do more and is fully managed. Naturally, most of what I have said here has a cost, so if you do not have a budget then you'll have to spend your time (time = money) doing all the work.

Should actors/services be split into multiple projects?

I'm testing out Azure Service Fabric and started adding a lot of actors and services to the same project. Is this okay to do, or will I lose any Service Fabric features such as failover, scalability, etc.?
My preference here is clearly 1 actor/1 service = 1 project. The big win with a platform like this is that it allows you to write proper microservice-oriented applications at close to no cost, at least compared to the implementation overhead you have when doing similar implementations on other, somewhat similar platforms.
I think it defies the point of an architecture like this to build services or actors that span multiple concerns. It makes sense (to me at least) to use these imaginary constraints to force you to keep the area of responsibility of these services as small as possible - and rather depend on/call other services in order to provide functionality outside of the responsibility of the project you are currently implementing.
In regards to scaling, it seems you'll still be able to scale your services/actors independently even though they are a part of the same project - at least that's implied by looking at the application manifest format. What you will not be able to do, though, are independent updates of services/actors within your project. As an example; if your project has two different actors, and you make a change to one of them, you will still need to deploy an update to both of them since they are part of the same code package and will share a version number.

How to setup a scalable environment for the MEAN stack on AWS?

I'm developing a web app on the MEAN stack (and probably other stuff, like some way of storing images).
At the startup accelerator I'm working in, it has been suggested that I drop the IaaS approach in favour of a PaaS one (namely AWS). I must admit that, being used to working on small-scale apps on single virtual machines, I'm very confused about how to approach the task and the whole PaaS thing.
Reading through the AWS documentation, it looks like Elastic Beanstalk is the way to go for building a scalable web app. From my understanding, it abstracts away the infrastructure management workload, taking care of load balancing and resource scaling.
I decided to give it a try.
Now I'm a bit confused about how to set up the infrastructure. In particular, I'm wondering how to fit MongoDB into the puzzle.
I guess I shouldn't install it on the node.js machine but on a different one, so that the two can scale out depending on needs.
My questions are:
Where and how should I install Mongo?
Should I drop MongoDB in favour of something like DynamoDB? In that case, how can I set up a local development environment?
Any suggestions would be appreciated.

How hard is it to migrate a web app from localhost to a hosting platform?

Since I'm not a huge fan of any of the current solutions for managing the resources and knowledge that I have, I was thinking about making my own solution, which will involve custom code as well as possible integration of FOSS solutions. I would start development on my local machine, but if I like it, how difficult would it be to migrate over to a public server and let others also use this tool? What kinds of challenges might I be facing?
In theory, nothing, beyond just the process of moving stuff to the new machine. You can set up your own servers, on your own ports (port 80 for example).
You can even create your own fake domain at home, with just a tweak to the /etc/hosts files (or the equivalent on Windows).
Now, if you're developing on Windows and hosting on unix, you'll have platform issues, so I'd suggest against that, at least for a first project.
But other than that, it should be straightforward.
You didn't hard code any paths to "localhost" did you? If so, that should be the first thing to strip out. Either use relative paths, or have a configurable {AppPath} variable of some kind that you only need ever change once.
By the way, what language/framework are you using? It would help us provide sample code.
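In the meantime, here is a minimal sketch of the "configure it once" idea in Python. It assumes the base address is read from an environment variable; the name `APP_BASE_URL`, the default port, and the helper names are hypothetical and stand in for whatever your framework provides.

```python
import os
from urllib.parse import urljoin

# Assumed default for local development; not taken from any real project.
DEFAULT_BASE_URL = "http://localhost:8000"


def get_base_url(environ=None) -> str:
    """Read the base URL from the environment, falling back to localhost."""
    env = os.environ if environ is None else environ
    base = env.get("APP_BASE_URL", DEFAULT_BASE_URL)  # hypothetical variable name
    return base.rstrip("/") + "/"


def build_url(path: str, environ=None) -> str:
    """Build an absolute URL for a path relative to the application root."""
    return urljoin(get_base_url(environ), path.lstrip("/"))
```

On your own machine `build_url("/static/logo.png")` resolves against localhost; on the public server you would set `APP_BASE_URL` once (for example to the site's real address) and leave the rest of the code untouched.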
I would add that documentation is a highly important factor in any project if it is to be quickly embraced by the public. The tendency when developing in-house projects, especially if they are just for your own personal use, is to neglect, or even completely ignore, documentation of all kinds, both of usage and in the code. If users aren't told how to use the product, they won't use it, and if other potential developers don't know how or why things are done the way they are, or what the purpose of things is, they either won't bother trying or will unintentionally cause other problems.