I am trying to set up logical replication between two cloud instances, both running Debian 9 and PostgreSQL 11.1. The CREATE PUBLICATION command on the master was successful, but when I run CREATE SUBSCRIPTION on the intended logical replica, the command hangs indefinitely.
On the master I can see that the replication slot was created and is active, and I can see a new walsender process created and "waiting". In the log on the master I see these lines:
2019-01-14 14:20:39.924 UTC [8349] repl_user#db LOG: logical decoding found initial starting point at 7B0/6C777D10
2019-01-14 14:20:39.924 UTC [8349] repl_user#db DETAIL: Waiting for transactions (approximately 2) older than 827339177 to end.
But that is all. The CREATE SUBSCRIPTION command never ends.
The master is a database with heavy insert traffic, hundreds of inserts per minute, but they are all committed promptly, so there should not be any long-running uncommitted transactions.
I tried to google this problem but did not find anything. What am I missing?
Since the databases are “in the cloud”, you don't know where they really are.
Odds are that they are actually in the same database cluster, which would explain the deadlock you see: CREATE SUBSCRIPTION waits until all concurrent transactions on the cluster that contains the replication source database have finished before it can create its replication slot. Since both databases are in the same cluster, it ends up waiting on itself, which obviously never happens.
The solution is to explicitly create a logical replication slot in the source database and to use that existing slot when you create the subscription, along the lines of the sketch below.
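A minimal sketch of that approach, with placeholder names (my_pub, my_sub, my_slot) and placeholder connection details, using the built-in pgoutput plugin:

```sql
-- On the source (master) database: create the slot manually.
SELECT pg_create_logical_replication_slot('my_slot', 'pgoutput');

-- On the subscriber: reuse that slot instead of letting
-- CREATE SUBSCRIPTION create one (and wait for it) itself.
CREATE SUBSCRIPTION my_sub
    CONNECTION 'host=master-host dbname=db user=repl_user'
    PUBLICATION my_pub
    WITH (create_slot = false, slot_name = 'my_slot');
```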
I have set up a Postgres 11 streaming replication cluster. The standby is a "hot standby". Is it possible to attach a second standby as a warm standby?
I assume that you are talking about WAL file shipping when you are speaking of a “warm standby”.
Sure, there is nothing that keeps you from adding a second standby that ships WAL files rather than directly attaching to the primary, but I don't see the reason for that.
According to this decent documentation of the Postgres 11 streaming replication architecture, you can set the sync_state of a second standby instance to potential. This means that if/when the first synchronous standby fails, the detected failure (through ACK communication) will result in the second standby moving from potential to sync and becoming the active synchronous replication server. See Section 11.3 - Managing Multiple Stand-by Servers in that link for more details.
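As a hedged sketch of that setup on the primary, assuming the two standbys connect with the placeholder application_name values standby1 and standby2:

```sql
-- Only the first listed, currently connected standby is "sync";
-- the other shows up in pg_stat_replication as "potential".
ALTER SYSTEM SET synchronous_standby_names = 'FIRST 1 (standby1, standby2)';
SELECT pg_reload_conf();

-- Check which standby is sync and which is the potential fallback.
SELECT application_name, state, sync_state
FROM pg_stat_replication;
```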
How do I stop collections from being cloned again when all the data has already been restored on the recovered instance using oplog replay?
Replication Scenario
I have a 3-node replica set.
Load
There is continuous load; data keeps getting added every day,
and we take oplog backups every 2 hours. In spite of the 2-hour oplog backup schedule,
some transactions roll off the oplog. That means we might miss some records when we replay those oplogs.
Scenario.
In this replication scenario, one of the secondaries stopped responding, and by the time we rejoined it to the replica set,
the minimum oplog timestamp on the primary had moved past the last entry in the failed instance's oplog. The failed instance tries to catch up but goes into RECOVERING mode, as shown in the
log messages on the recovering instance:
2019-02-13T15:49:42.346-0500 I REPL [replication-0] We are too stale to use primaryserver3:27012 as a sync source. Blacklisting this sync source because our last fetched timestamp: Timestamp(1550090168, 1) is before their earliest timestamp: Timestamp(1550090897, 28907) for 1min until: 2019-02-13T15:50:42.346-0500
2019-02-13T15:49:42.347-0500 I REPL [replication-0] sync source candidate: primaryserver3:27012
2019-02-13T15:49:42.347-0500 I ASIO [RS] Connecting to primaryserver3:27012
2019-02-13T15:49:42.348-0500 I REPL [replication-0] We are too stale to use primaryserver3:27012 as a sync source. Blacklisting this sync source because our last fetched timestamp: Timestamp(1550090168, 1) is before their earliest timestamp: Timestamp(1550090897, 22809) for 1min until: 2019-02-13T15:50:42.348-0500
2019-02-13T15:49:42.348-0500 I REPL [replication-0] could not find member to sync from
To bring this instance up to par with the primary, we make this "RECOVERING" instance a "new PRIMARY" and apply all oplog backups taken up to the present. After the oplogs are applied, the record counts on both servers match. Now when I rejoin the recovering instance (i.e. the "new PRIMARY") to the replica set,
I see the logs showing an "initial sync", which is expected, and then the log lines below:
2019-03-01T12:11:58.327-0500 I REPL [repl writer worker 4] CollectionCloner ns:datagen_it_test.test finished cloning with status: OK
2019-03-01T12:12:40.518-0500 I REPL [repl writer worker 8] CollectionCloner ns:datagen_it_test.link finished cloning with status: OK
So the collections are cloned again.
My question is: why does it clone the data again? The data is already restored on the "recovering" instance and the record counts all match.
How can I stop the cloning from happening?
As per the MongoDB documentation:
A replica set member becomes “stale” when its replication process
falls so far behind that the primary overwrites oplog entries the
member has not yet replicated. The member cannot catch up and becomes
“stale.” When this occurs, you must completely resynchronize the
member by removing its data and performing an initial sync.
This tutorial addresses both resyncing a stale member and creating a
new member using seed data from another member, both of which can be
used to restore a replica set member. When syncing a member, choose a
time when the system has the bandwidth to move a large amount of data.
Schedule the synchronization during a time of low usage or during a
maintenance window.
MongoDB provides two options for performing an initial sync:
Restart the mongod with an empty data directory and let MongoDB’s
normal initial syncing feature restore the data. This is the more
simple option but may take longer to replace the data.
See Automatically Sync a Member.
Restart the machine with a copy of a recent data directory from
another member in the replica set. This procedure can replace the data
more quickly but requires more manual steps.
See Sync by Copying Data Files from Another Member.
The step-by-step procedure is available in Resync a Member of a Replica Set.
I am testing logical replication between two PostgreSQL 11 databases for use in our production environment (I was able to set it up thanks to this answer - PostgreSQL logical replication - create subscription hangs) and it worked well.
Now I am testing the scripts and procedure that would set it up automatically on the production databases, but I am facing a strange problem with logical replication slots.
I had to restart the logical replica due to some configuration changes requiring a restart - which of course could also happen to replicas in the future. But the logical replication slot on the master did not disconnect and is still marked active for a certain PID.
I dropped the subscription on the master (I am still only testing) and tried to repeat the whole process with a new logical replication slot, but I am facing a strange situation.
I cannot create a new logical replication slot under the new name. The process attached to the old logical replication slot is still active, showing wait_event_type=Lock and wait_event=transaction.
When I try to use pg_create_logical_replication_slot to create a new logical replication slot I get a similar situation. The new slot is created - I can see it in pg_catalog - but it is marked active for the PID of the session that issued the command, and the command hangs indefinitely. When I check the processes I can see this command active with the same waiting values, Lock/transaction.
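The checks I run on the master look roughly like this (just a sketch; the exact slot names and output differ):

```sql
-- Which slots exist, and which backend PIDs hold them active?
SELECT slot_name, plugin, slot_type, active, active_pid
FROM pg_replication_slots;

-- What are those backends doing and what are they waiting on?
SELECT pid, state, wait_event_type, wait_event, backend_type, query
FROM pg_stat_activity
WHERE pid IN (SELECT active_pid FROM pg_replication_slots WHERE active);
```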
I tried setting the lock_timeout parameter in postgresql.conf and reloading the configuration, but it did not help.
Killing that old hanging process would most likely bring down the whole Postgres instance because it is a walsender process. It is still visible in the process list with the IP of the replica and the status "idle waiting".
I tried to find some parameter(s) that could force Postgres to stop this walsender, but the settings wal_keep_segments and wal_sender_timeout did not change anything. I even tried stopping the replica for a longer time - no effect.
Is there some way to deal with this situation without restarting the whole Postgres instance? Like forcing a timeout for the walsender or for the transaction lock, etc.?
Because if something like this happens in production I would not be able to use a restart or any other "brute force". Thanks...
UPDATE:
"Walsender" process "died out" after some time but log does not show anything about it so I do not know when exactly it happened. I can only guess it depends on tcp_keepalives_* parameters. Default on Debian 9 is 2 hours to keep idle process. So I tried to set these parameters in postgresql.conf and will see in following tests.
Strangely enough today everything works without any problems and no matter how I try to simulate yesterday's problems I cannot. Maybe there were some network communication problems in the cloud datacenter involved - we experienced some occasional timeouts in connections into other databases too.
So I really do not know the answer except for "wait until walsender process on master dies" - which can most likely be influenced by tcp_keepalives_* settings. Therefore I recommend to set them to some reasonable values in postgresql.conf because defaults on OS are usually too big.
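A sketch of what I mean (the values are only examples, not tuned recommendations; ALTER SYSTEM is equivalent to editing postgresql.conf):

```sql
-- Detect a dead peer after roughly 60s + 6 * 10s instead of the
-- ~2 hour Debian default. A reload is enough; the new defaults
-- apply to newly established connections.
ALTER SYSTEM SET tcp_keepalives_idle = 60;
ALTER SYSTEM SET tcp_keepalives_interval = 10;
ALTER SYSTEM SET tcp_keepalives_count = 6;
SELECT pg_reload_conf();
```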
Actually we use this on our big analytical databases (set both in PostgreSQL and in the OS) because of similar problems. Golang and Node.js programs calculating statistics occasionally failed to recognize that a database session had ended or died, and would hang until the OS closed the connection after 2 hours (the default on Debian). All of it always seemed to be connected with network communication problems. With proper tcp_keepalives_* settings the reaction to problems is much quicker.
After the old walsender process dies on the master you can repeat all the steps and it should work. So it looks like I just had bad luck yesterday...
Postgres follows MVCC rules, so any query that is run on a table doesn't conflict with the writes that happen on that table. The query returns its result based on the snapshot taken at the point of running the query.
Now I have a master and a slave. The slave is used by analysts to run queries and perform analysis. When the slave is replicating and analysts are running their queries at the same time, I can see the replication lag growing for a long time. If the queries are long-running, the replication lags for a long duration, and if the number of writes on the master happens to be pretty high, then I end up losing the WAL files and replication can no longer proceed. I just have to spin up another slave. Why does this happen? How do I allow queries and replication to happen simultaneously on Postgres? Is there any parameter setting I can apply to make this happen?
The replica can't apply more WAL from the master because the master might've already removed row versions that are still needed by queries running on the replica, which are older than anything still running on the master. The replica needs older row versions than the master does. It's exactly because of MVCC that this pause is necessary.
You probably set a high max_standby_streaming_delay to avoid "canceling statement due to conflict with recovery" errors.
If you turn hot_standby_feedback on, the replica can instead tell the master to keep those rows. But then the master can't clean up dead rows as efficiently, and its tables can bloat if the standby holds queries open for too long.
See PostgreSQL manual: Handling Query Conflicts.
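A sketch of the relevant settings on the replica (the values are illustrative only, not recommendations):

```sql
-- Ask the master to hold back vacuum for rows our standby queries still need.
ALTER SYSTEM SET hot_standby_feedback = on;

-- How long WAL replay may wait for conflicting standby queries before
-- cancelling them; '5min' is just an example value.
ALTER SYSTEM SET max_standby_streaming_delay = '5min';

SELECT pg_reload_conf();
```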
As for the WAL retention part: enable WAL archiving and a restore_command for your standbys. You should really be using it anyway, for point-in-time recovery. PgBarman now makes this easy with the barman get-wal command. If you don't want WAL archiving you can instead set your replica servers up to use a replication slot to connect to the master, so the master knows to retain the WAL they need indefinitely. Of course, that can cause the master to run out of space in pg_xlog and stop running so you need to monitor more closely if you do that.
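If you go the replication-slot route, a minimal sketch (the slot name is a placeholder, and the monitoring functions use the PostgreSQL 10+ names; older versions use pg_current_xlog_location and pg_xlog_location_diff):

```sql
-- On the master: create a named physical replication slot for the standby.
SELECT pg_create_physical_replication_slot('analytics_standby');

-- On the standby, reference it in its recovery settings, e.g.
--   primary_slot_name = 'analytics_standby'

-- On the master: monitor how much WAL the slot is holding back,
-- so the WAL directory does not fill up.
SELECT slot_name, active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
FROM pg_replication_slots;
```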
I am migrating an Oracle database to Postgres and I need to implement replication of the master database to multiple slave databases. The replication should be executed once a day at a specified time (to take the load off the databases) and should only replicate the data that has changed.
I am trying to achieve that using Slony - it seems to do what I need, except that it syncs the data at short intervals. I haven't been able to find any information on how to configure Slony for a scheduled sync. Is it even possible?
Or do I have to launch the slon daemons at the desired time and then kill them using some script/scheduler?
Yes, you can specify a lag interval so that a node lags behind its provider by the specified interval; here is the documentation. You can pass this option to the slon daemon with the -l option.