PostgreSQL logical replication, duplicate key errors on subscriber

I am still quite new to postgres logical replication and am having trouble when replicating a large data set.
In our development setup there is one publisher with 5 subscribers, and all tables in one schema are being replicated. All servers are running PostgreSQL 13.1, and the subscribers are basically all clones of the same system.
Once a month or so we have to clear down most of the tables being replicated and re-populate them from a legacy system, a process that starts with deleting a chunk of data from the table (as defined by part of the key) and then copying that chunk of data across for each table. The size of the data is around 90GB all told.
Every time we do this, one or more of the subscribers gets stuck (not always the same ones). The publisher's log shows "could not send data to client: Connection reset by peer" for the stuck subscriber(s).
The subscriber logs show duplicate key errors ("ERROR: duplicate key value violates unique constraint"), but, judging by the key reported, it is a different row on each server (though often the same table).
Deleting the offending row on the subscriber simply makes it fall over at the next row, so it's obviously more than just one row.
This makes no sense to me; nothing else is writing to the tables on these subscribers, and I can't really picture a situation where replication would try to write the same data twice.
So far the only solution I have is to drop the bad subscriber(s) and restart replication on them.
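For reference, a rough sketch of that workaround (the subscription, publication, schema, and connection details below are placeholders, not our real names):
-- On the stuck subscriber: drop the subscription (by default this also drops its replication slot on the publisher).
DROP SUBSCRIPTION my_sub;
-- Clear the replicated tables so the initial copy can succeed, then re-create the subscription.
TRUNCATE TABLE my_schema.some_replicated_table;   -- repeated for each replicated table
CREATE SUBSCRIPTION my_sub
    CONNECTION 'host=publisher dbname=mydb user=repl_user'
    PUBLICATION my_pub
    WITH (copy_data = true);   -- the default; re-copies the existing data from the publisher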
Does anyone have any advice or ideas as to why this happens or how to fix it?

Related

Insert data into remote DB tables from multiple databases through trigger or replication or foreign data wrapper

I need some advice about the following scenario.
I have multiple embedded systems supporting PostgreSQL database running at different places and we have a server running on CentOS at our premises.
Each system runs at a remote location and has multiple tables inside its database. These tables have the same names as the corresponding tables on the server, but each system's table names are different from the other systems', e.g.:
system 1 has tables:
sys1_table1
sys1_table2
system 2 has tables:
sys2_table1
sys2_table2
I want to update the tables sys1_table1, sys1_table2, sys2_table1 and sys2_table2 on the server on every insert done on system 1 and system 2.
One solution is to write a trigger on each table, which will run on every insert on both systems' tables and insert the same data into the server's tables. This trigger will also delete the records on the systems after inserting the data into the server. The problem with this solution is that if the connection to the server cannot be established due to a network issue, then the trigger will not execute or the insert will be wasted. I have checked the following solution for this:
Trigger to insert rows in remote database after deletion
The second solution is to replicate the tables from system 1 and system 2 to the server's tables. The problem with replication is that if we delete data on the systems, it will also delete the records on the server. I could add a trigger on the server's tables that copies each row into a duplicate table, so the replicated table can be emptied without affecting the data, but that will make for a long list of tables if we have more than 200 systems.
The third solution is to create a foreign table using postgres_fdw or dblink and update the data inside the server's tables, but won't this affect the data on the server when we delete data inside the system's table? And what will happen if there is no connectivity to the server?
The fourth solution is to write an application in Python on each system which makes a connection to the server's database and writes the data in real time; if there is no connectivity to the server, it stores the data inside sys1.table1 or sys2.table2 or whichever table the data belongs to, and after reconnecting, the code sends the table data to the server's tables.
Which option is best for this scenario? I like the trigger solution best, but is there any way to avoid data loss in case of disconnection from the server?
I'd go with the fourth solution, or perhaps with the third, as long as it is triggered from outside the database. That way you can easily survive connection loss.
The first solution with triggers has the problems you already detected. It is also a bad idea to start potentially long operations, like data replication across a network of uncertain quality, inside a database transaction. Long transactions mean long locks and inefficient autovacuum.
The second solution may actually also be an option if you have a recent PostgreSQL version that supports logical replication. You can use a publication WITH (publish = 'insert,update'), so that DELETE and TRUNCATE are not replicated. Replication can deal well with lost connectivity (for a while), but it is not an option if you want the data at the source to be deleted after it has been replicated.
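A minimal sketch of that setup, reusing the table names from the question (publication, subscription, and connection details are placeholders):
-- On system 1 (acting as publisher):
CREATE PUBLICATION sys1_pub FOR TABLE sys1_table1, sys1_table2
    WITH (publish = 'insert,update');   -- DELETE and TRUNCATE are not replicated
-- On the central server (acting as subscriber):
CREATE SUBSCRIPTION sys1_sub
    CONNECTION 'host=system1 dbname=sysdb user=repl_user'
    PUBLICATION sys1_pub;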

Bidirectional Replication Design: best way to script and execute unmatched row on Source DB to multiple subscriber DBs, sequentially or concurrently?

Thank you for any help or suggestions offered.
I am trying to build my own multi-master replication on PostgreSQL 10 on Windows, for a situation which cannot use any of the current 3rd-party tools for PG multi-master replication and which can also involve another DB platform in a subscriber group (Sybase ADS). I have the following logic to create bidirectional replication, partially inspired by Bucardo's logic, between 1 publisher and 2 subscribers:
When an INSERT, UPDATE, or DELETE is made on a source table, a trigger on the source table adds a row to a meta table created on the source DB; that row acts as a replication transaction to be performed on the 2 subscriber DBs which subscribe to it.
A NOTIFY signal will be sent to a service, or a script written in Python or some other scripting language will monitor the meta table (or the trigger execution) for changes and either do a table compare or script the statement to run on each subscriber database (see the sketch after these steps).
I believe that triggers on the subscribers will need to be paused to keep them from pushing their received statements on to their own subscribers, i.e. if node A and node B both subscribe to each other's table A, then an update to node A's table A should replicate to node B's table A without then replicating back to node A's table A in a bidirectional "ping-pong storm".
There will be a final compare between tables, and the transaction will be closed. Triggers on the subscribers are re-enabled if they were paused/disabled while pushing the transactions from the step 2 addendum.
This will hopefully be done bidirectionally, in timestamp order, FIFO, unless I can figure out a way to create child processes to run the synchronizations concurrently.
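A rough sketch of what steps 1 and 2 could look like on the source DB (all object names here are made up, and the changed row is captured as JSON purely for illustration):
CREATE TABLE replication_log (
    id         bigserial PRIMARY KEY,
    table_name text        NOT NULL,
    op         text        NOT NULL,    -- 'INSERT', 'UPDATE' or 'DELETE'
    row_data   jsonb,                   -- captured row for the statement builder
    created_at timestamptz NOT NULL DEFAULT now()
);
CREATE OR REPLACE FUNCTION log_change() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO replication_log (table_name, op, row_data)
        VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(OLD));
    ELSE
        INSERT INTO replication_log (table_name, op, row_data)
        VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(NEW));
    END IF;
    PERFORM pg_notify('replication_channel', TG_TABLE_NAME);   -- wakes the listening service/script
    RETURN NULL;   -- return value of an AFTER trigger is ignored
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER source_table_log
AFTER INSERT OR UPDATE OR DELETE ON source_table
FOR EACH ROW EXECUTE PROCEDURE log_change();
For the trigger-pausing part, one option (if I understand it correctly) is for the applying session on each subscriber to run SET session_replication_role = 'replica', which keeps ordinary triggers from firing while the replicated statements are applied and should avoid the ping-pong.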
For this, I am trying to figure out the best way to set up the service logic---essentially step 2 above, which has apparently been done using a daemon on Linux, but I have to work on Windows, making it run as, or resemble, a service/agent---or come up with a reasonably easy and efficient design to send the source DB's statements to the subscriber DBs.
Does anyone see that this plan is faulty or may not work?
Disclaimer: I don't know anything about PostgreSQL but have done plenty of custom replication.
The main problem with bidirectional replication is merge issues.
If the same key is used in both systems with different attributes, which one gets to push their change? If you nominate a master it's easier. Then the slave just gets overwritten every time.
How much latency can you handle? It's much easier to take the 'notify' part out and just have a five-minute Windows Task Scheduler job that inspects log tables and pushes data around.
In other words, this kind of pattern:
Change occurs in a table. A database trigger on that table notes the change and writes the PK of the changed row to a change log table. A ReplicationBatch column in the log table is NULL by default.
A Windows scheduled task inspects all the change log tables to find all changes that happened since the last run and 'reserves' these records by setting their ReplicationBatch column to a replication batch number,
i.e. you run UPDATE LogTable SET ReplicationBatch = BatchNumber WHERE ReplicationBatch IS NULL
All records that have been marked are replicated:
you run SELECT * FROM LogTable WHERE ReplicationBatch = BatchNumber to get the records to be processed.
When complete, the reserved records are marked as complete so that the next time around only subsequent changes are replicated. This completion flag might be in the log table or it might be in a ReplicationBatch number table.
The main point is that you need to reserve records for replication, so that as you are replicating them out, additional log records can be added from the source without messing up the batch.
Then you periodically clear out the log tables (a minimal sketch of the whole pattern follows).
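A sketch of that pattern in PostgreSQL terms (the table layout and the batch number 42 are just for illustration; the column names follow the description above):
CREATE TABLE LogTable (
    LogId            bigserial PRIMARY KEY,
    SourcePK         bigint      NOT NULL,   -- PK of the changed row
    ChangedAt        timestamptz NOT NULL DEFAULT now(),
    ReplicationBatch integer                 -- NULL = not yet reserved
);
-- 1. Reserve everything not yet replicated under a new batch number:
UPDATE LogTable SET ReplicationBatch = 42 WHERE ReplicationBatch IS NULL;
-- 2. Fetch the reserved records and push them to the subscribers:
SELECT * FROM LogTable WHERE ReplicationBatch = 42;
-- 3. Once the push has succeeded, mark the batch complete (or simply clear it out):
DELETE FROM LogTable WHERE ReplicationBatch = 42;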

PostgreSQL INSERT - auto-commit mode vs non auto-commit mode

I'm new to PostgreSQL and still learning a lot as I go. My company is using PostgreSQL and we are populating the database with tons of data. The data we collect is quite bulky in nature and is derived from certain types of video footage. For example, data related to about 15 minutes worth of video took me about 2 days to ingest into the database.
My problem is that I have data sets which relate to hours' worth of video and which would take weeks to ingest into the database. I was informed that part of the reason this is taking so long is that PostgreSQL has auto-commit set to true by default, and committing transactions takes a lot of time/resources. I was told that I could turn auto-commit off, which would speed the process up tremendously. However, my concern is that multiple users are going to be populating this database. If I change the program to commit only every 10 seconds, say, and two people attempt to populate the same table, then the first person gets an id, and when he is on, say, record 7, the second person attempting to insert into the same table is given the same id key; once the first person commits his changes, the second person's id key will already be used, thus throwing an error.
So what is the best way to insert data into a PostgreSQL database when multiple people are ingesting data at the same time? Is there a way to work around issuing out the same id key to multiple people when inserting data in auto-commit mode?
If the IDs are coming from the serial type or a PostgreSQL sequence (which is used by the serial type), then you never have to worry about two users getting the same ID from the sequence. It simply isn't possible. The nextval() function only ever hands out a given ID a single time.
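For example (a minimal illustration; the table is hypothetical):
-- id is backed by a sequence via the serial type.
CREATE TABLE video_frames (
    id   serial PRIMARY KEY,
    data jsonb
);
-- Two sessions can run this concurrently, even inside long uncommitted
-- transactions: nextval() never hands out the same value twice, so the
-- generated ids cannot collide. A rolled-back transaction simply leaves
-- a gap in the sequence.
INSERT INTO video_frames (data) VALUES ('{"frame": 1}') RETURNING id;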

Slony "duplicate key value violates unique constraint" error

I have a problem which has been going on for a long time. I use Slony to replicate a database from the master to a slave and from that slave to three other backup servers. Once every 2-3 weeks there is a key duplication problem that happens only on one specific table (big, but not the biggest in the database).
It started to occur about a year ago on Postgres 8.4 and Slony 1, and we switched to 2.0.1. Later we upgraded it to 2.0.4, and we successfully upgraded Slony to 2.1.3, which is our current version. We started fresh replication on the same computers and it was all going well until today. We got the same duplicate key error on the same table (with different keys every time, of course).
The way to clean it up is just to delete the invalid key on the slaves (it spreads across all nodes) and everything works again. The data is not corrupted, but the source of the problem remains unsolved.
Googling turned up nothing related to this problem (we did not use TRUNCATE on any table, and we did not change the structure of the table).
Any ideas what can be done about it?
When this problem occurred in our setup, it turned out that the schema of the master database was older than the slaves' and didn't have the UNIQUE constraint on this particular column. So my advice would be (sketched in SQL below):
make sure the master table does in fact have the constraint
if not:
clean the table
add the constraint
else:
revoke write privileges from all clients except slony for the replicated tables.
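A sketch of those checks (table, column, constraint, and role names are placeholders):
-- Does the master table actually have the unique (or primary key) constraint?
SELECT conname, contype
FROM pg_constraint
WHERE conrelid = 'public.problem_table'::regclass
  AND contype IN ('p', 'u');
-- If not: clean out the duplicates, then add it.
ALTER TABLE public.problem_table
    ADD CONSTRAINT problem_table_col_key UNIQUE (some_column);
-- Otherwise: make sure only slony can write to the replicated tables.
REVOKE INSERT, UPDATE, DELETE, TRUNCATE ON public.problem_table FROM app_user;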
As Craig has said, this is usually caused by a write transaction on a replica. So the first thing to do is verify permissions. If this keeps happening, what you can do is start logging connections from the readers of the replicas and keep the logs around, so that when the issue happens you can track down where the bad tuple came from. This can generate a LOT of logs, however, so you probably want to see to what extent you can narrow things down first. You presumably know which replica this starts on, so you can go from there.
A particular area of concern would be a user-defined function which writes: a casual observer might not spot that in the query, and neither might a connection pooler.
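If you do go the logging route, a starting point might look like this (assuming a version new enough for ALTER SYSTEM; on older versions set the same parameters in postgresql.conf):
-- Log who connects to the replica and every data-modifying statement,
-- so the origin of a stray write can be traced later.
ALTER SYSTEM SET log_connections = on;
ALTER SYSTEM SET log_disconnections = on;
ALTER SYSTEM SET log_statement = 'mod';                   -- logs INSERT/UPDATE/DELETE etc.
ALTER SYSTEM SET log_line_prefix = '%m [%p] %u@%d %r ';   -- time, pid, user, db, client
SELECT pg_reload_conf();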

Strange data remains

Has anyone heard of or experienced the following phenomenon?
Using PostgreSQL 9.0.5 on Windows.
= table structure =
[parent] - [child] - [grandchild]
I found that a record strangely remained in the [child] table.
This record exists in violation of the foreign key constraint.
these tables store transaction data for my application
all the above tables have a numeric PRIMARY KEY
all these tables have FOREIGN KEY constraints (between parent and child, and between child and grandchild)
my application updates each record's status as the transaction progresses
my app copies these records to archive tables (same structure, same constraints)
once all of the statuses have changed to "normal_end",
then deletes these records when it has finished copying them to the archive tables.
the status of the remaining record in the [child] table was NOT "normal_end" but "processing".
but the status of the copied data (same ID) in the archive table was "normal_end".
no errors were reported in pg_log
I find it very strange...
I suspect that the deleted data might have come back to life!?
Can deleted data unexpectedly become active again?
There should never be data that violates a foreign key constraint (except during a transaction with deferred constraints).
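For illustration, a deferred constraint looks like this (a minimal, made-up example):
CREATE TABLE parent (id numeric PRIMARY KEY);
CREATE TABLE child (
    id        numeric PRIMARY KEY,
    parent_id numeric REFERENCES parent (id) DEFERRABLE INITIALLY DEFERRED
);
BEGIN;
INSERT INTO child VALUES (1, 100);   -- no parent 100 yet: allowed for now
INSERT INTO parent VALUES (100);     -- resolves the violation before the check
COMMIT;                              -- the foreign key is only checked here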
A deleted row should stay deleted once the transaction is committed; that's one of the requirements of ACID. However, the correct working of PostgreSQL relies on the correct functioning of your OS and hardware. When PostgreSQL fsyncs a file, it should really be written to disk or to a non-volatile cache. Unfortunately, it sometimes happens that disks or controllers tell the system the write has finished while the data is actually still sitting in a volatile cache. If you have a RAID controller with RAM but no battery, make sure the controller's cache is set to write-through.
Personally, I have seen PostgreSQL hold incorrect data once: a duplicate row (same primary key) after a crash on a Windows XP machine (most likely on 9.0.x). Windows XP machines are not very reliable for running PostgreSQL; they often give strange network errors.