Use Barman within a Docker swarm

I'm fairly new to PostgreSQL, but today I have a Docker swarm with two nodes dedicated to the databases. Consider that the first node runs the "master" PostgreSQL (> 9.3) and the second one is just a replica of the master.
This backup method no longer suits me, since my cluster could grow soon and I would need point-in-time recovery. I did some research, and that's why I would like to use Barman (http://www.pgbarman.org/).
I was wondering if Barman can work well in a Docker swarm, and if some of you could share your experience with it.
I'm trying at the moment to build this infrastructure. I'll post updates if needed.
Additionally, more documentation on the topic would help me :)
Thanks, Paul

After some days of practicing with Barman and reading the documentation, I think Barman is a suitable and really good option for backups in a Docker swarm, since it allows streaming backup.
You can choose to back up a small amount of data every minute, which is lightweight for Docker.
If there is a disaster and your PostgreSQL service goes down, you can still recover from a recent backup.
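For anyone trying the same setup, here is a minimal sketch of a streaming-only Barman configuration; the server name, hosts and users below are placeholders, so adjust them to your swarm:
[pg-primary]
description = "Swarm primary, streaming backup"
conninfo = host=pg-primary user=barman dbname=postgres
streaming_conninfo = host=pg-primary user=streaming_barman
backup_method = postgres
streaming_archiver = on
slot_name = barman
With that in barman.conf, barman check pg-primary should report everything OK, and barman backup pg-primary takes a base backup over the streaming connection.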


Is it possible to upgrade Postgres 9.6 to 10 with zero downtime? What approaches can be followed?

I'm trying to upgrade my postgres database from 9.6 to 10 without any downtime. Can this be done?
As "a_horse_with_no_name" mentioned logical replication is a very good choice in your situation.
Only problem is 9.6 does not have internal implementation yet so you would have to use extension "pglogical" on both DB - here I found some description - https://rosenfeld.herokuapp.com/en/articles/infrastructure/2017-11-10-upgrading-postgresql-from-9-6-to-10-with-minimal-downtime-using-pglogical - skip parts about Docker and see how pglogical works.
The only slight problem is pglogical must be added into "shared_preload_libraries" parameter and postgresql service must be restarted which can be sometimes difficult on production...
I did quite a lot of tests with pglogical (here are some notes - http://postgresql.freeideas.cz/pglogical-postgresql-9-6-small-hints-debian/) although at the end I never used pglogical on production. So I do not have experiences from long usage.
But I presume some problems can be similar to the internal implementation of logical replication in PG 10 and 11. So here are my notes from my current usage of internal logical replication on PG 11 - http://postgresql.freeideas.cz/setting-logical-replication-is-not-entirely-straight-forward/ - maybe something from it would help you.
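To make the flow concrete, here is a rough sketch of the pglogical commands (hosts and database names are placeholders; both servers first need shared_preload_libraries = 'pglogical' and wal_level = logical plus a restart, as noted above):
# on the 9.6 provider:
psql -d app -c "CREATE EXTENSION pglogical;"
psql -d app -c "SELECT pglogical.create_node(node_name := 'provider', dsn := 'host=old96 dbname=app');"
psql -d app -c "SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);"
# on the PG 10 subscriber:
psql -d app -c "CREATE EXTENSION pglogical;"
psql -d app -c "SELECT pglogical.create_node(node_name := 'subscriber', dsn := 'host=new10 dbname=app');"
psql -d app -c "SELECT pglogical.create_subscription(subscription_name := 'upgrade_96_to_10', provider_dsn := 'host=old96 dbname=app');"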
My recommendation for you would be:
- Make a hot backup copy of your PG 9.6 database on some other machine or cloud VM with exactly the same OS and, if possible, the same disk types and configuration, using pg_basebackup (see the sketch after this list) - you can find some inspiration here:
http://postgresql.freeideas.cz/pg_basebackup-bash-script-backup-archiving-google-storage/
http://postgresql.freeideas.cz/pg_basebackup-experiences/
Or, if you already use pg_basebackup for tar backups of your DB, restore the latest backup on another machine or VM (http://postgresql.freeideas.cz/pg_basebackup-pgbarman-restore-tar-backup/).
- Start it as a normal server (not as a hot standby) and test pglogical on this copy of your DB against a test installation of PG 10 - test as close to the production environment as possible, including at least simulated DML operations of similar intensity - this will show you the differences in load on the machine / VM.
- I highly recommend setting up monitoring, for example using Telegraf + InfluxDB + Grafana (the easiest implementation in my opinion), so you can analyze CPU and memory usage later - this can be a crucial part of production usage!
- After hopefully short and successful tests, implement it and celebrate your success :-) And please write about your experiences, because I believe a lot of people would welcome it.
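The hot copy itself can be as simple as this sketch (host, user and data directory are placeholders; the source must allow a replication connection):
pg_basebackup -h prod-96-host -U replication_user -D /var/lib/postgresql/9.6/main -X stream -c fast -P
# afterwards start the copy as a normal server - do not add any standby/recovery settings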

Postgres 9.6 replication from production to custom slave

I have a problem. Currently I have a 1 TB Postgres 9.6 database which is backed up with Barman using streaming.
What I need is replication from the production/master to a slave server:
- on which I can write (I don't care if the data written on the replica is not sent back to the master server),
- which can be configured almost in real time or with little delay,
- on which I can run dumps without locking the master database.
As said above, I am using Barman for backups. However, I am not able to find out how I can build a replica from Barman that is kept in sync by the master. It was set up by someone else and I'm not sure it's the right solution for what I need.
My questions:
- Is Barman the right tool for what I want?
- If not, which tools would you suggest?
- If yes, do you know how to build a replica from Barman that is kept in sync by the master? Could you please explain how to do it?
Thanks
In master-slave mode, you can't write on the slave.
If you want to write on the replica, you should probably use something like this.
You can also make sure that all of your writes on the master are written on the replica as well, via the synchronous WAL streaming feature.
With this feature, before a write is committed on the master, the master first makes sure the write was successfully written on the replica.
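For reference, this is controlled from the master's postgresql.conf; a minimal sketch (the standby name is a placeholder):
synchronous_standby_names = 'replica1'
synchronous_commit = on
# the standby must connect with the matching name, e.g. in its primary_conninfo:
# primary_conninfo = 'host=master-host user=replication_user application_name=replica1'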
Except for the writing-on-the-slave part, Barman looks like a fitting tool for you.
Writing on a slave is an uncommon thing in PostgreSQL.
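As for building the replica itself from a Barman backup, a rough sketch (server name, hosts and paths are placeholders; on 9.6 the replica then needs a recovery.conf pointing at the master before it starts):
barman recover --remote-ssh-command "ssh postgres@replica-host" pg-primary latest /var/lib/postgresql/9.6/main
# then create recovery.conf on the replica:
# standby_mode = 'on'
# primary_conninfo = 'host=master-host user=replication_user'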

What's a good way to back up an (AWS) Postgres DB

What's a good way to back up a Postgres DB (running on Amazon RDS)?
The built-in snapshotting from RDS is daily by default, and you cannot export the snapshots. Besides that, it can take quite a long time to restore a snapshot.
Is there a good service that takes dumps on a regular basis and stores them on e.g. S3? We don't want to spin up and maintain an EC2 instance to do that.
Thank you!
I want the backups to be automated, so I would prefer to have a dedicated service for that.
Your choices:
- Run pg_dump from an EC2 instance on a schedule. This is a great use case for Spot Instances.
- Restore a snapshot to a new RDS instance, then run pg_dump as above. This reduces database load.
- Want to run an RDS snapshot more often than daily? Kick one off manually.
These are all automatable. For "free" (low effort on your part) you get daily snapshots. I agree, I wish they could be sent to S3.
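The first option can be a one-liner; here is a sketch (endpoint, user, database and bucket are placeholders) that could run from cron on a small instance:
pg_dump -Fc -h mydb.xxxxxxxx.us-east-1.rds.amazonaws.com -U backup_user -d appdb | aws s3 cp - s3://my-backup-bucket/appdb-$(date +%F).dump
# -Fc writes pg_dump's compressed custom format; aws s3 cp - streams stdin straight to S3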
SOLUTION: You can now run pg_dumpall and dump all Postgres databases on a single AWS RDS instance.
It has caveats, so it's better to read the post before going ahead and compiling your own version of pg_dumpall for this. Details here.

Migrate database from Heroku to AWS

I want to migrate our Postgres DB from Heroku to our own Postgres on AWS.
I have tried using pg_dump and pg_restore to do the migration, and it works, but it takes a really long time. Our database size is around 20 GB.
What's the best way to do the migration with minimal downtime?
If you mean AWS RDS PostgreSQL:
pg_dump and pg_restore
I know you don't like it, but you don't really have other options. With a lot of hoop jumping you might be able to do it with Londiste or Slony-I on a nearby EC2 instance, but it'd be ... interesting. That's not the most friendly way to do an upgrade, to say the least.
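To speed that path up, a sketch (hosts, user and database are placeholders) using the custom format so the restore can run in parallel:
pg_dump -Fc -h heroku-host -U app_user -d appdb -f appdb.dump
pg_restore -h rds-host -U app_user -d appdb -j 4 appdb.dump
# -Fc produces pg_restore's custom format; -j 4 restores with four parallel jobs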
What you should be able to do is ship WAL into RDS PostgreSQL and/or stream replication logs. However, Amazon doesn't support this.
Hopefully Amazon will adopt some part of 9.4's logical replication and logical changeset extraction features, or better yet the BDR project - but I wouldn't hold my breath.
If you mean AWS EC2
If you're running your own EC2 instance with Pg, use replication, then promote the standby to be the new master.
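The promotion step itself is small; a sketch (the data directory is a placeholder, 9.x-era commands):
pg_ctl promote -D /var/lib/postgresql/9.6/main
# or create the trigger_file configured in the standby's recovery.conf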

MongoDB EC2 EBS Backups

I'm confused about what I need to do here. I am new to Mongo. I have set up a small Mongo server on Amazon EC2, with EBS volumes - one for data, one for logs. I need to do a backup. It's okay to take the DB down in the middle of the night, at least currently.
Using the boto library, EBS snapshots, and Python, I built a simple backup script that does the following:
sudo service mongodb stop
run backup of data
run backup of logs
sudo service mongodb start
The script ran through and restarted Mongo, but I noted in the AWS console that the snapshots were still being created, even though boto had returned and Mongo had restarted. Certainly not ideal.
I checked the Mongo docs, and found this explanation on what to do for backups:
http://docs.mongodb.org/ecosystem/tutorial/backup-and-restore-mongodb-on-amazon-ec2/#ec2-backup-database-files
This is good info, but a bit unclear. If you are using journaling, which we are, it says:
If the dbpath is mapped to a single EBS volume then proceed to Backup the Database Files.
We have a single volume for data, so I'm assuming that means to bypass the steps on flushing and locking. But at the end of Backup the Database Files, it discusses removing the locks.
So, I'm a bit confused. As I read it, I don't actually need to do anything - I can just run the backup and not worry about flushing/locking at all. I probably don't need to take the DB down. But the paranoid part of me says no, that sounds suspicious.
Any thoughts from anyone on this, or experience, or good old fashioned knowledge?
Since you are using journaling, you can just run the snapshot without taking the DB down. This will be fine as long as the journal files are on the same EBS volume, which they would be unless you symlink them elsewhere.
We run a lot of MongoDB servers on Amazon, and this is how we do it too.
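For example, a sketch of a live snapshot with the AWS CLI (the volume ID is a placeholder); with journaling on the same volume, mongod can keep running:
SNAP_ID=$(aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "mongo data $(date +%F)" --query SnapshotId --output text)
aws ec2 wait snapshot-completed --snapshot-ids "$SNAP_ID"  # optionally block until the snapshot finishes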