Running two podman/docker containers of PostgreSQL on a single host - postgresql

I have two applications, each of which use several databases. Before the days of Docker, I would have just put all the databases on one host (due to resource consumption associated with running multiple physical hosts/VMs).
Logically, it seems to me that separating these into groups (1 group of DBs per application) is the right thing to do and with containers the overhead is low and this seems possible. However, I have not seen this use case. I've seen multiple instances of containerized Postgres running so as to maintain multiple versions (hence different images).
Is there a good technical reason why people do not do this (two or more containers of PostgreSQL instances using the same image for purposes of isolating groups of DBs)?
When I tried to do this, I ran into errors having to do with the second instance trying to configure the postgres user. I had to pass in an option to ignore migration errors. I'm wondering if there is a good reason not to do this.

Well, I am not used to work with prosgresql but with mysql, sqlite and ms sql - and docker, of course.
When I entered docker I used to read a lot about microservices, developing of these and, of course, the devops ideas behind docker and microsoervices.
In this world I would absolutly prefer to have 2 containers of the same base image with a multi stage build and / or different env-files to run you infrastructure. Docker is not only allowing this, it is prefering this.

Related

Postgres docker container vs. hosted solution

I'm used to building web apps the 'traditional' way but am trying to wrap my head around using docker. If I run postgres in a container along with my python web app, is this the same as spinning up a digital ocean server and installing postgres from scratch? How do I handle backups, fault tolerance, etc with a postgres database that is in docker?
As an alternative, I normally use hosted postgres on Heroku or AWS. Doesn't that solve a lot of the issues I would run into when hosting postgres myself in docker? Do developers really run postgres in docker or do they typically prefer to use an external hosted service?
It's wise for the moment to only keep stateless services or one-off jobs in Docker, and not put any stateful service, like a database.
This recent article from mesosphere has more details about why this isn't yet the case.
One issue would be that orchestration technologies aren't yet up to snuff for the high requirements of stateful services. To quote:
The first challenge is resource isolation. Many container orchestration solutions in the market provide a best effort approach to resource allocation, including memory, CPU and Storage. While this may be ok for stateless apps, it may be catastrophic for stateful services, where loss of performance may result in loss of customer transactions or data.
Another is that stateful databases have been built with different assumptions than those employed by containers, and are heavily optimized for them. Again, quoting:
Most of today’s stateful database technologies were originally designed for a non-containerized world. The operational instructions are very specific to the technology and can sometimes be version specific. Trying to map generic primitives of a container orchestration platform to stateful services is usually a time consuming and error prone operation.
You could totally run your postgres instance inside Docker. But it will require some work to handle backups, fault tolerance and such.
At my company we've made the choice to not put databases inside Docker, for now at least.

Which way to run PostgreSQL in Docker?

Which of these methods is correct?
One db container for each app
One db container for all apps
Install db without docker
I tried to find information, but nothing. Or I badly searched?
It is immature, but that doesn't seem to be stopping a lot of people using Docker for persistence.
The official Postgres image has 4.5 million pulls - Ok, this doesn't mean that all those images/containers are being used but it does suggest that it is a popular solution.
If you have already decided that you would like to use Docker, because of what containers can offer your architecture, then I don't think you will have trouble using it for persistence - assuming you are happy learning Docker.
I'm using Postgres and MySql in several projects quite successfully on Docker.
In choosing option 1 or 2, I would say that unless your apps are related to the same problem domain/company/project I would go with option 1. Of course, running costs will possibly factor in as well.
I generally go with option 1.
All 3 options could be valid but it depends on the use you have to perform.
In my server I've 1 container for every main postgresql releases currently I use.
I run all of them on different ports (use not random number for ports but some easy to remember because a problem with docker is remember all port numbers and other for every container).
pg84(port 8432), pg93(port 9332), pg94 (port 9432)
I link the pgXX to the container I need and that's perfect for me.
For my experience I prefer option 2 so.

In a containerized cluster, should mongodb servers be running on a worker or a core service?

I'm trying to implement an architecture that's similar to the coreos's production architecture (shown below)
Should I run the database as a central service or one or more of the workers?
I figured the database needs some kind of replication, which makes me think that putting it in the worker cluster makes more sense, but I'm just not sure.
This should be run as a worker. The central services are the basic things that come with CoreOS (mainly etcd). The workers host your applications, the database being one of them. You do have a persistence issue because your database will have state to remember between restarts. So, there is a bigger issue of how do you make that persistence? One was to do it is use a host file and give the database an affinity to that host and mount the host file. Another thing you might consider is running more than one database (if your db technology supports that) and replicate that database so you have two (or more) copies in different workers. (non-affinity). If your database creates transaction logs that can be applied to a backup, you can manage those transaction logs in a worker.
Another thing to consider is not using a container for your database. The database is a weird animal, its care and feeding is not like the rest of the applications. So it is reasonable (in my opinion) to have your database managed and maintained outside the scope of your cluster (but still reachable by the cluster).

Learning NOSQL databases using a single machine?

In relational databases I would just pop in W3Schools tutorial, install mysql in my machine and start practicing! How can I learn non relational databases in a similar way? In most tutorials I read that these databases work with multiple nodes and data centers.
Does this mean that I will be unable to learn and practice, say Cassandra, using my own single pc?
You do it just like you do it with mySQL: You set up a database on your local machine and start experimenting.
Most database systems which focus on sharding and clustering also work as a stand-alone instance. But when you want to test these features specifically, you can often run multiple instances on the same machine. When you also want to try how they behave when they run on different machines, you can use a virtualization software like VMWare or VirtualBox to set up a bunch of virtual machines and build your virtual datacenter on your desktop.
(I would recommend VMWare for business use and VirtualBox for home use)
I'm a big fan of MongoDB. It's the NoSQL equivalent of MySQL.
Go to the Try It Out link on their home page and you can actually use it in a live session on their website - no download, no configuration, no hassle! Just use it and learn the basics.
Here's the quick start for Cassandra. http://wiki.apache.org/cassandra/GettingStarted
I don't see any reason you couldnt run that from local host. I think the point is that you Can scale these nosql solutions. Might want to check out mongodb or couchdb as well. Easy set up and both are great nosql solutions in my experience.
I would strongly suggest using something like Amazon EC2 for testing NoSQL solutions. You can definitely install a technology like MongoDB locally and create a replica set, but you should definitely put these on different physical machines if you can.
I have installed things like AppFabric, Couchbase and Mongo locally and created clusters and they always work really well locally. It's very easy because the networking part of it always goes smoothly.
Once you introduce two physical machines and a stronger network partition things get difficult.
You can create instances on EC2 for free last I checked if you use their Micro instances. You'll learn a lot.

Couchbase as a memcached + repcached replacement?

I've got a group of servers that currently use both memcached and repcached side by side (listening on different ports). The memcached service is used to store local data that doesn't need to be shared. The repcached instance is used to allow pairs of servers to collaborate.
When I found Couchbase I was really excited because it looks like it would allow me to:
Make some data persistent
Share with more than two nodes
Leave most of my code as-is since it uses the memcached API
So I installed Couchbase but I've run into a problem--it doesn't look like there's a way to setup two clusters on the same server. I'd like one cluster that doesn't share with any other server and a second cluster that does share with other servers.
Yes, I could setup several dedicated servers for Couchbase to create different clusters but I've got plenty of CPU + ram to spare on the servers that are currently running memcached + repcached so I'd prefer to just replace those services with Couchbase.
Is it possible to run two instances of Couchbase on the same host? I realize I'd have to change some ports around. I just haven't seen anyone talking about doing anything like this so I'm thinking the answer is "no"... but I had to ask because it looks like Couchbase would be perfect for my needs.
If this won't work then I'd be interested in any alternative suggestions. For example, one idea I had was using Memcached + MemcacheDB to emulate a persistent non-shared Couchbase cluster. However, I don't like the fact that MemcacheDB doesn't support expiring records and I'd rather not have to write a routine to delete millions of records each month (and then wonder if performance will degrade over time).
Any thoughts would be appreciated. :-)
The best solution here is probably to run a single instance of Couchbase and create one memcached bucket and one Couchbase bucket. The memcached bucket won't have persistence and will function exactly like memcached. The other bucket will have persistence and supports the memcached api. You can create as many buckets as you want in a single Couchbase server.
Your other option is to virtualize and run a Couchbase server on each vm.