Different select results when using multimaster via pglogical in PostgreSQL - postgresql

There are two PostgreSQL 9.6 nodes subscribed to each other via pglogical. If node A inserts a row into the replicated table then node B sees it and vice versa.
However, when I update a row on one node, then subsequent SELECT queries on both nodes will keep returning different results - the current one and some of the previous ones.
Moreover, there are log entries about replication conflicts in the logs of both nodes.
Why does that happen and how do I fix that?
upd: setting pglogical.conflict_resolution to last_update_wins helps. Might consider other options of conflict resolution too

Multi-master replication is difficult.
There are conflicts that are bound to occur unless your application is aware of and specifically tailored to multi-master replication:
Rows inserted on different nodes with the same (automatically generated primary key must conflict.
If you modify the primary key of a row on one node while updating or deleting it on another, the databases will “drift apart”, leading to future conflicts.
You will have to fix your application so that it avoids problems like the above, and you will have to manually find and resolve all the conflicts that occurred so far.
Here is an example of the second case:
-- node one:
UPDATE person
SET id = 1234
WHERE id = 6543;
-- at the same time on node two
DELETE FROM person
WHERE id = 6543;
Both statements will be replicated to the other node, but do nothing there, because both nodes no longer have a person with id 6543 any more. There will be no replication conflict right away, but node one now has a person that node two doesn't have. It is easy to see how this can lead to replication conflicts later (imagine you insert a row on node one that has a foreign key relationship to person 1234).
This is why it is in most cases a good idea to consider an architecture that does not include multi-master replication.

Related

How does pglogical-2 handle logical replication on same table while allowing it to be writeable on both databases?

Based on the above image, there are certain tables I want to be in the Internal Database (right hand side). The other tables I want to be replicated in the external database.
In reality there's only one set of values that SHOULD NOT be replicated across. The rest of the database can be replicated. Basically the actual price columns in the prices table cannot be replicated across. It should stay within the internal database.
Because the vendors are external to the network, they have no access to the internal app.
My plan is to create a replicated version of the same app and allow vendors to submit quotations and picking items.
Let's say the replicated tables are at least quotations and quotation_line_items. These tables should be writeable (in terms of data for INSERTs, UPDATEs, and DELETEs) at both the external database and the internal database. Hence at both databases, the data in the quotations and quotation_line_items table are writeable and should be replicated across in both directions.
The data in the other tables are going to be replicated in a single direction (from internal to external) except for the actual raw prices columns in the prices table.
The quotation_line_items table will have a price_id column. However, the raw price values in the prices table should not appear in the external database.
Ultimately, I want the data to be consistent for the replicated tables on both databases. I am okay with synchronous replication, so a bit of delay (say, a couple of second for the write operations) is fine.
I came across pglogical https://github.com/2ndQuadrant/pglogical/tree/REL2_x_STABLE
and they have the concept of PUBLISHER and SUBSCRIBER.
I cannot tell based on the readme which one would be acting as publisher and subscriber and how to configure it for my situation.
That won't work. With the setup you are dreaming of, you will necessarily end up with replication conflicts.
How do you want to prevent that data are modified in a conflicting fashion in the two databases? If you say that that won't happen, think again.
I believe that you would be much better off using a single database with two users: one that can access the “secret” table and one that cannot.
If you want to restrict access only to certain columns, use a view. Simple views are updateable in PostgreSQL.
It is possible with BDR replication which uses pglogical. On a basic level by allocating ranges of key ids to each node so writes are possible in both locations without conflict. However BDR is now a commercial paid for product.

Bidirectional Replication Design: best way to script and execute unmatched row on Source DB to multiple subscriber DBs, sequentially or concurrently?

Thank you for help or suggestion offered.
I am trying to build my own multi-master replication on Postgresql 10 in Windows, for a situation which cannot use any of the current 3rd party tools for PG multimaster replication, which can also involve another DB platform in a subscriber group (Sybase ADS). I have the following logic to create bidirectional replication, partially inspired by Bucardo's logic, between 1 publisher and 2 subscribers:
When INSERT, UPDATE, or DELETE is made on Source table, Source table Trigger adds row to created meta table on Source DB that will act as a replication transaction to be performed on the 2 subscriber DBs which subcribe to it.
A NOTIFY signal will be sent to a service, or script written in Python or some scripting language will monitor for changes in the metatable or trigger execution and be able to do a table compare or script the statement to run on each subscriber database.
***I believe that triggers on the subscribers will need to be paused to keep them from pushing their received statements to their subscribers, i.e. if node A and node B both subscribe to each other's table A, then an update to node A's table A should replicate to node B's table A without then replicating back to table A in a bidirectional "ping-pong storm".
There will be a final compare between tables and the transaction will be closed. Re-enable triggers on subscribers if they were paused/disabled when pushing transactions from step 2 addendum.
This will hopefully be able to be done bidirectionally, in order of timestamp, in FIFO order, unless I can figure out a to create child processes to run the synchronizations concurrently.
For this, I am trying to figure out the best way to setup the service logic---essentially Step 2 above, which has apparently been done using a daemon in Linux, but I have to work in Windows, making it run as, or resembling, a service/agent---or come up with a reasonably easy and efficient design to send the source DBs statements to the subscribers DBs.
Does anyone see that this plan is faulty or may not work?
Disclaimer: I don't know anything about Postgresql but have done plenty of custom replication.
The main problem with bidirectional replication is merge issues.
If the same key is used in both systems with different attributes, which one gets to push their change? If you nominate a master it's easier. Then the slave just gets overwritten every time.
How much latency can you handle? It's much easier to take the 'notify' part out and just have a five minute windows task scheduler job that inspects log tables and pushes data around.
In other words, this kind of pattern:
Change occurs in a table. A database trigger on that table notes the change and writes the PK of the table to a change log table. A ReplicationBatch column in the log table is set to NULL by default
A windows scheduled task inspects all change log tables to find all changes that happened since the last run and 'reserves' these records by setting their replication state to a replication batch number
i.e. you run a UPDATE LogTable Set ReplicationBatch=BatchNumber WHERE ReplicationState IS NULL
All records that have been marked are replicated
you run a SELECT * FROM LogTable WHERE ReplicationState=RepID to get the records to be processed
When complete, the reserved records are marked as complete so the next time around only subsequent changes are replicated. This completion flag might be in the log table or it might be in a ReplicaionBatch number table
The main point is that you need to reserve records for replication, so that as you are replicating them out, additional log records can be added in from the source without messing up the batch
Then periodically you clear out the log tables.

How do I cluster my PRIMARY KEY in postgres

I noticed in postgres when we create a table, it seems to automatically creates a btree index on the PRIMARY KEY CONSTRAINT. Looking at the properties of the CONSTRAINT, it would appear it is not clustered. How do I cluster it and should I cluster it?
You have to use the CLUSTER command:
CLUSTER stone.unitloaddetail USING pk10;
Remember that this rewrites the table and blocks it for others during that time.
Also, clustering is not maintained when table data are modified, so you have to schedule regular CLUSTER runs if you want to keep the table clustered.
Addressing the "should you" part, it depends on the likelihood of queries needing to access multiple rows having adjacent values of the clustering key.
For a table with a synthetic primary key, it probably makes more sense to cluster on a foreign key column.
Imagine that you have a table or products. Are you more likely to request multiple products having:
consecutive product_id?
the same location_id?
the same type_id?
the same manufacturer_id?
If it would solve a problem for you to improve the performance of the system for one of these particular cases, then that is the column by which you should consider clustering.
If doing so would not solve a problem, then do not do it.

Stop BDR from replicating DROP TABLE or CREATE TABLE

I have two databases with tables that I want to sync. I don't want to sync any other table. I'm using Postgres-BDR to do that.
Those tables are part of replication set common. There are some circumstances where other tables share a name across nodes (but are NOT in common), and a node will call DROP TABLE and then CREATE TABLE. Even though those tables aren't part of the common replication set, these commands are still replicated to the other nodes, causing the other node to lose all of its data in its table and then create an empty table.
How can I stop this? I only want commands that affect common to be replicated to the other nodes.
Nevermind, I found it. It's available with bdr.skip_ddl_replication.
I just put bdr.skip_ddl_replication = on in postgresql.conf, restarted the server, and BOOM! Works like a charm.
EDIT
It would be prudent of me to point out that the documentation warns that this option could break database replication if used improperly. But since I'll be VERY tightly controlling the table schema, it shouldn't cause any problems.

Slony "duplicate key value violates unique constraint" error

I have a problem which goes on for longer time. I use slony to replicate database from master to slave and from that slave to three other backup servers. Once a 2-3 weeks there is a key duplication problem that happens only on one specific table (big but not biggest in database).
It started to occur like year ago on Postgres 8.4 and slony 1 and we switched to 2.0.1. Later we upgraded it to 2.0.4, and we succesfuly upgraded slony to 2.1.3 and it's our current version. We started fresh replication on same computers and it was all going well until today. We got the same duplication key error on same table (with different keys every time of course).
Solution to clean it up is just to delete invalid key on slaves (it spreads across all nodes) and it's all working again. Data is not corrupted. But source of problem remains unsolved.
In googles I found nothing related to this problem (we did not used truncate on any table, we did not change the structure of table).
Any ideas what can be done about it?
When this problem occured in our setup, it turned out that the schema of the master database was older than the slaves' and didn't have the UNIQUE constraint for this particular column. So, my advice would be:
make sure the master table has in fact the constraint
if not:
clean the table
add the constraint
else:
revoke write privileges from all clients except slony for the replicated tables.
As Craig has said usually this is a write transaction to a replica. So the first things to do is to verify permissions. If this keeps happening, what you can do is start logging connections of the readers of the replicas and keep them around so when the issue happens, you can track down where the bad tuple came from. This can generate a LOT of logs however so you probably want to see to what extent you can narrow this down first. You presumably know which replica this is starting on, so you can go from there.
A particular area of concern I would spot would be what happens if you have a user defined function which writes. A casual observer might not spot that in the query, nor might a connection pooler.