Postgresql logical replication Docker container - postgresql

I'am actually working on the logical replication between my master on Windows and slave on Linux.
I want to transfer all my datas to my slave with logical replication Postgresql.
My postgres on Linuw will be working as docker container.
During the replication, when my container restarted, I have the impression that data were streaming are duplicated, for one table on master with 2,5 Gb of size, I find 5,3 Gb on my slave.
Does it possible that data are duplicated ? There is a way to continue the streaming normally even if docker container restart ?
Thanks a lot

Related

Can we do streaming and logical replication in postgresql Slave at same time?

We have PostgresSQL cluster with 1 master and 2 slave configuration , we want to enable logical replication from Slave , as the master a pretty work heavy database , we don’t want to put more load on DB , Is it possible if we can start both type of replication from any slave currently replicating through streaming replication
we have setup with master it worked , not sure with slave , we did few test run in older vesrion which failed
No, the logical replication primary cannot be a streaming replication standby server.

Are there any quick ways to move PostgreSQL database between clusters on the same server?

We have two big databases (200GB and 330GB) in our "9.6 main" PostgreSQL cluster.
What if we create another cluster (instance) on the same server, is there any way to quickly move database files to new cluster's folder?
Without using pg_dump and pg_restore, with minimum downtime.
We want to be able to replicate the 200GB database to another server without pumping all 530GB of data.
Databases aren't portable, so the only way to move them to another cluster is to use pg_dump (which I'm aware you want to avoid), or use logical replication to copy it to another cluster. You would just need to set wal_level to 'logical' in postgresql.conf, and create a publication that included all tables.
CREATE PUBLICATION my_pub FOR ALL TABLES;
Then, on your new cluster, you'd create a subscription:
CREATE SUBSCRIPTION my_sub
CONNECTION 'host=172.100.100.1 port=5432 dbname=postgres'
PUBLICATION my_pub;
More information on this is available in the PostgreSQL documentation: https://www.postgresql.org/docs/current/logical-replication.html
TL;DR: no.
PosgreSQL itself does not allow to move all data files from a single database from one source PG cluster to another target PG cluster, whether the cluster runs on the same machine or on another machine. To this respect it is less flexible than Oracle transportable tablespaces or SQL Server attach/detach database commands for example.
The usual way to clone a PG cluster is to use streaming physical replication to build a physical standby cluster of all databases but this requires to backup and restore all databases with pg_basebackup (physical backup): it can be slow depending on the databases size but once the standby cluster is synchronized it should be really fast to failover to standby cluster by promoting it; miminal downtime is possible. After promotion you can drop the database not needed.
However it may be possible to use storage snaphots to copy quickly all data files from one source cluster to another cluster (and then drop the database not needed in the target cluster). But I have not practiced it and it does not seem to be really used (except maybe in some managed services in the cloud).
(PG cluster means PG instance).
If You would like to avoid pg_dump/pg_restore, than use:
logical replication (enables to replicate only desired databases)
streaming replication via replication slot (moving the whole cluster
to another and then drop undesired databases)
While 1. option is described above, I will briefly describe the 2.:
a) create role with replication privileges on master (cluster I want to copy from)
master# psql> CREATE USER replikator WITH REPLICATION ENCRYPTED PASSWORD 'replikator123';
b) log to slave cluster and switch to postgres user. Stop postgresql instance and delete DB data files. Then You will initiate replication from slave (watch versions and dirs!):
pg_basebackup -h MASTER_IP -U replikator -D /var/lib/pgsql/11/data -r 50M -R –waldir /var/lib/pgwal/11/pg_wal -X stream -c fast -C -S master1_to_slave1 -v -P
What this command do? It connects to master with replikator credentials and start pg_basebackup via slot that will be created. There is bandwith throttling as well (50M) as other options... Right after the basebackup slave will start streaming replication and You've got failsafe replication.
c) Then when You want, promote slave to be standalone and delete undesired databases:
rm -f /varlib/pgsql/11/data/recovery.conf
systemctl restart postgresql11.service

Doing a postgresql master-slave on a single machine

I know the concept of the master slave but my question is will it speed up my write and reads if I just set those master and slaves in a single server? Like let us say I set it up in a Docker Daemon, will the read and write be faster when I set it up with two containers? Or will it just be the same with a single postgres container since it will still just use the same linux kernel?

Postgres master / slave based on table

Currently I have 1 postgres instance which is starting to receive too much load and want create a cluster of 2 postgres nodes.
From reading the documentation for postgres and pgpool, it seems like I can only write to a master and read from a slave or run parallel queries.
What I'm looking for is a simple replication of a database but with master/slave based on which table is being updated. Is this possible? Am i missing it somewhere in the documentation?
e.g.
update users will be executed on server1 and replicated to server2
update big_table will be executed on server2 and replicated back to server1
What you are looking for is called MASTER/MASTER replication. This is supported natively (without PgPool) since 9.5. Note, that it's an "eventually consistent" architecture, so your application should be aware of possible temporary differences between the two servers.
See PG documentation for more details and setup instructions.

Primary and standby server at different timelines in postgres

I am very new to postgres and being new I got stuck at a point and need some help, please pardon if you find it silly.
I am doing a pgpool HA and at postgres level i have streaming replication between 3 nodes of postgresql-9.5 - 1 master and 2 slaves
I was trying to configure auto failover but when i switched back to my original master, and restarted the postgres service, I am getting the following error:
slave 1-highest timeline 1 of the primary is behind recovery timeline 11
slave 2-highest timeline 1 of the primary is behind recovery timeline 10
slave 3-highest timeline 1 of the primary is behind recovery timeline 3
I tried deleting pg_xlog files in slaves and copying all the files from master pg_xlog into the slaves and then did a rsync.
i also did a pg_rewind but it says:
target server needs to use either data checksums or wal_log_hints = on
(I have wal_log_hints = on set in postgresql.conf already)
I've tried doing a pg_basebackup but since the data base server in slaves are still starting up its not able to connect to the server
Is there any way to bring the master and the slave at a same timeline?
In my case, it happened because ( experimentally ), I updated the standby database tables and again when I simulate the master-standby streaming replication I got the same errors.
So once again I cleaned the whole standby database directory and migrate the master database using cmd like
"pg_basebackup -P -R -X stream -c fast -h 10.10.40.105 -U postgres -D standby/"
I think something is wrong in your pgpool configuration. What tool you have been using for manement of replication and master-slave control? Is it post master or repmgr?
I was trying to configure pgpool with 3 data nodes using a tutorial from http://jensd.be/591/linux/setup-a-redundant-postgresql-database-with-repmgr-and-pgpool and have done it correctly.
Also you can lean auto failover here.
(These question is obviously duplicate of this one, so I'll repeat the answer also.)
I'm not sure what you exactly mean by "when i switched back to my original master", but it looks that you are doing the wrongest possible thing in PostgreSQL streaming replication - introducing the second master.
The most important thing you should know about PostgreSQL replication is that once the failover is performed, you cannot simply "switch back to original master" - there's now a new master in cluster, and existence of two masters will make damage.
After a slave is promoted to master, the only way for you to re-join the old master is to:
Destroy it (delete the data directory);
Join it as a slave.
If you want it to be master again you'll continue with the following:
Let it run for awhile as a slave so that it can sync the data;
Kill temporary master and failover to old master;
Rejoin temporary master again as a slave.
You cannot simply switch master servers! Master can be created ONLY by failover (promoting a slave)
You should also know that whenever you are performing failover (whenever the master is changed), all slaves (except for the one that is promoted) need to be reconfigured to target the new master.
I suggest you reading this tutorial - it'll help.