Designing system architecture with Docker containers - mongodb

I am new to Docker and would like some expert opinion on container design. I have set up a database in the MongoDB cloud (Atlas). I have a Windows app in a Docker container, which includes Windows OS and application-based components. I also want to use RavenDB, and this database is very new to me. A component of my Windows container will communicate with both MongoDB and RavenDB.
My question is: should I create a separate Docker container for RavenDB, or should I install RavenDB in my existing Windows container?
This is a design decision problem. I am new to RavenDB and Docker, so the pros and cons are not clear to me yet. Kindly help me.

I had a similar application, with a PostgreSQL database and a Node.js web app.
The web application and the database ran in separate Docker containers, so the two were independent of each other. The PostgreSQL container had a volume mounted to persist its data.
This replicates the actual production scenario, where your service and your database run separately.
It is recommended to run a single process per container.
You get better modularity of the services and separation of concerns.
Scaling containers horizontally is much easier if each container is isolated to a single function.
A more detailed explanation can be found here.
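To make that concrete, here is a minimal sketch of the same layout with plain docker commands. The network name, credentials, the app image and the way it reads its connection settings are placeholders, not the actual setup:

# user-defined network so the app container can reach the database by name
docker network create app-net

# PostgreSQL in its own container, with a named volume persisting the data directory
docker run -d --name app-db --network app-net \
  -e POSTGRES_PASSWORD=secret \
  -v pgdata:/var/lib/postgresql/data \
  postgres

# the Node.js web app in a separate container, connecting to the database over the network
docker run -d --name app-web --network app-net \
  -e DATABASE_URL=postgres://postgres:secret@app-db:5432/postgres \
  -p 3000:3000 \
  my-node-app:latest

Applied to the question above, the same pattern would mean giving RavenDB its own container next to the Windows application container, rather than installing it inside that container.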

Related

Mongodb replicaset with init scripts in docker-entrypoint-initdb.d

I'm working on trying to get a MongoDB replicaset deployed into Kubernetes with a default set of collections and data. The Kubernetes piece isn't too pertinent but I wanted to provide that for background.
Essentially in our environment we have a set of collections and data in the form of .js scripts that we currently build into our MongoDB image by copying them into /docker-entrypoint-initdb.d/. This works well in our current use case where we're only deploying MongoDB as a single container using Docker. Along with revamping our entire deployment process to deploy our application into Kubernetes, I need to get MongoDB deployed in a replicaset (with persistent storage) for obvious reasons such as failover.
The issue I've run into, and found recognized elsewhere (for example in this issue: https://github.com/docker-library/mongo/issues/339), is that scripts in /docker-entrypoint-initdb.d/ do not run in the same manner when configuring a replicaset. I've attempted a few other things, such as running a seed container after the mongo replicaset is initialized, building our image with the collections and data on a different volume (such as /data/db2) so that it persists once the build is finished, and a variety of scripts such as those in the GitHub link above. All of these either don't work or feel very "hacky", and I don't feel comfortable deploying them to customer environments.
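For reference, the seed-container variant boils down to a one-off container run against the already-initialized replica set, roughly like the sketch below. Hostnames, the database name and the script path are placeholders, and in Kubernetes this would typically be wrapped in a Job:

# one-off container that loads the init scripts against an already-running replica set
docker run --rm \
  -v "$(pwd)/init-scripts:/seed:ro" \
  mongo:6 \
  mongosh "mongodb://mongo-0.mongo:27017,mongo-1.mongo:27017/mydb?replicaSet=rs0" /seed/seed-data.js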
Unfortunately I'm a bit limited with toolsets and have not been approved to use a cloud offering like MongoDB Atlas or tooling such as the Enterprise Kubernetes Operator. Is there any properly supported method for this use case, or is the supported path to use a cloud offering or one of the MongoDB operators?
Thanks in advance!

Postgres docker container vs. hosted solution

I'm used to building web apps the 'traditional' way but am trying to wrap my head around using Docker. If I run Postgres in a container along with my Python web app, is this the same as spinning up a DigitalOcean server and installing Postgres from scratch? How do I handle backups, fault tolerance, etc. with a Postgres database that is in Docker?
As an alternative, I normally use hosted Postgres on Heroku or AWS. Doesn't that solve a lot of the issues I would run into when hosting Postgres myself in Docker? Do developers really run Postgres in Docker, or do they typically prefer to use an external hosted service?
It's wise for the moment to only keep stateless services or one-off jobs in Docker, and not put any stateful service, like a database.
This recent article from Mesosphere has more details about why this isn't yet the case.
One issue would be that orchestration technologies aren't yet up to snuff for the high requirements of stateful services. To quote:
The first challenge is resource isolation. Many container orchestration solutions in the market provide a best effort approach to resource allocation, including memory, CPU and Storage. While this may be ok for stateless apps, it may be catastrophic for stateful services, where loss of performance may result in loss of customer transactions or data.
Another is that stateful databases have been built with different assumptions than those employed by containers, and are heavily optimized for them. Again, quoting:
Most of today’s stateful database technologies were originally designed for a non-containerized world. The operational instructions are very specific to the technology and can sometimes be version specific. Trying to map generic primitives of a container orchestration platform to stateful services is usually a time consuming and error prone operation.
You could totally run your Postgres instance inside Docker, but it will require some work to handle backups, fault tolerance and the like.
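If you do go that route, a rough sketch of the minimum looks like this: a volume for the data directory, plus a backup schedule you own yourself (the database name and backup path are placeholders):

# data directory on a named volume, so it survives container removal
docker run -d --name pg --restart unless-stopped \
  -e POSTGRES_PASSWORD=secret \
  -v pgdata:/var/lib/postgresql/data \
  postgres

# the "some work" part: backups are on you, e.g. a nightly pg_dump driven by cron on the host
docker exec pg pg_dump -U postgres mydb > /backups/mydb-$(date +%F).sql

Fault tolerance (replication, failover) is more work again, which is exactly what a hosted offering takes off your hands.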
At my company we've made the choice to not put databases inside Docker, for now at least.

Separate Dev and Production instances and database

I have a web application hosted on a server; it uses virtualenv to separate dev and prod instances. Both instances share the same Postgres database (all on the same server).
I am fairly new to Docker and would like to replace the dev and prod instances with Docker containers, and link each to its own dev or prod Postgres container (or achieve a similar effect, so that a code change in development will not affect the production database).
What is the best design for this scenario? Should I have the dev and prod containers mapped to different ports? Can I have one Dockerfile for both dev and prod containers? How do I deal with the two Postgres containers?
Your requirement does not seem very complicated, so I think you can run two pairs of containers (each pair with one application container and one Postgres container) to achieve this. The basic structure is described below:
devContainer---> pgsDBContainer:<port_number1> ---> dataVolume1
prodContainer---> pgsDBContainer:<port_number2> ---> dataVolume2
Each container pair has its own dedicated port number and its own dedicated volume. The port number is what the dev or prod application uses to connect to its corresponding Postgres database, which should be easy to understand. The volumes are another story.
Please read the Manage data in containers doc for container volumes. Since, as you mentioned, "a code change in development will not affect production database", you should have two separate volumes for the Postgres containers, so that the data of the two databases does not get mixed up.
Can I have one Dockerfile for both dev and prod containers?
Yes, you can. As mentioned above, you just give each Postgres container a different port and volume configuration when you start it with the docker run command; docker run has the -p (--publish) option and the -v (--volume) option for configuring the port mapping and the volume location.
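Roughly, the two Postgres containers could be started like this (port numbers, volume names and passwords are only examples):

# dev database: its own host port and its own named volume
docker run -d --name pgsDBContainer-dev \
  -p 5432:5432 \
  -v pgdata-dev:/var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=devpass \
  postgres

# prod database: a different host port and a separate volume,
# so nothing done in dev can touch the prod data
docker run -d --name pgsDBContainer-prod \
  -p 5433:5432 \
  -v pgdata-prod:/var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=prodpass \
  postgres

The dev and prod application containers can then be built from the same Dockerfile and pointed at port 5432 or 5433 respectively (or at the container name, if they share a user-defined network).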
Just a reminder: when you run a database in a container, you need to consider data persistence in the container environment to avoid data loss caused by the container being removed. Some discussions of container data persistence can be found here and there.

In a containerized cluster, should mongodb servers be running on a worker or a core service?

I'm trying to implement an architecture that's similar to CoreOS's production architecture (central services plus worker nodes).
Should I run the database as a central service or on one or more of the workers?
I figured the database needs some kind of replication, which makes me think that putting it in the worker cluster makes more sense, but I'm just not sure.
This should be run as a worker. The central services are the basic things that come with CoreOS (mainly etcd). The workers host your applications, the database being one of them. You do have a persistence issue, because your database has state to remember between restarts, so the bigger question is how you provide that persistence. One way to do it is to use a host volume: give the database an affinity to a particular host and mount a directory from that host. Another thing you might consider is running more than one database instance (if your db technology supports that) and replicating it, so you have two (or more) copies in different workers (no affinity). If your database creates transaction logs that can be applied to a backup, you can also manage those transaction logs in a worker.
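The host-volume half of that approach is just a bind mount; the affinity half depends on your scheduler (fleet metadata on CoreOS, a nodeSelector in Kubernetes, and so on). A sketch of the volume part, assuming MongoDB and a data directory on the chosen worker (the path is a placeholder):

# on the worker the database is pinned to: keep the data on the host filesystem,
# so a restarted container on this host picks the data up again
docker run -d --name mongo \
  -v /var/lib/mongo-data:/data/db \
  mongo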
Another thing to consider is not using a container for your database. The database is a weird animal, its care and feeding is not like the rest of the applications. So it is reasonable (in my opinion) to have your database managed and maintained outside the scope of your cluster (but still reachable by the cluster).

Docker: Running multiple applications VS running multiple containers

I am trying to run WildFly, Jenkins and PostgreSQL in Docker container(s).
As far as I could understand from articles I've read, the Docker way is to have each application run in a different container.
Is my assumption correct or is it better to have only one container containing these three applications?
AFAIK the basic philosophy behind Docker is to run one service per container. You can run the whole application inside a single container, but I don't think that goes well with the way Docker works. Running different services in different containers gives you more flexibility and better modularity for your app.
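As a rough sketch of that one-service-per-container layout (image names and tags are indicative only and may differ):

# one user-defined network so the three services can reach each other by name
docker network create svc-net

docker run -d --name db --network svc-net \
  -e POSTGRES_PASSWORD=secret \
  -v pgdata:/var/lib/postgresql/data \
  postgres

docker run -d --name wildfly --network svc-net -p 8080:8080 jboss/wildfly

docker run -d --name jenkins --network svc-net -p 8081:8080 \
  -v jenkins_home:/var/jenkins_home \
  jenkins/jenkins:lts

Each one can then be rebuilt, upgraded or scaled independently of the other two.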