How to avoid infinite loop with triggers between databases using postgres_fdw? - postgresql

schema of databases
I want to synchronize 3 Postgresql databases across a central pivot db.
For example, if I insert a row into db1, a trigger sends the row to the pivot db via the postgres_fdw extension, and the pivot then sends an INSERT to db2 and db3. I have created an AFTER INSERT trigger in each of the three databases.
The problem: if I insert into db1, the pivot sends the query to db2 and db3, whose triggers fire and send the insert back again. Infinite loop :). How can I solve this?

Normally, you can check the nesting level with pg_trigger_depth().
Like:
CREATE TRIGGER my_sync_trigger
BEFORE INSERT ON my_table
FOR EACH ROW
WHEN (pg_trigger_depth() < 1) -- cancel nested trigger invocation!
EXECUTE PROCEDURE my_sync_function();
How to prevent a PostgreSQL trigger from being fired by another trigger?
But I have not tested this with postgres_fdw across databases. I doubt it works transparently across databases. You'll have to test ...
A poor man's solution would be to add a boolean flag replicated to each table, set it to true when the row is replicated, and only fire the replication trigger when it's not true.
...
WHEN (NEW.replicated = false) -- cancel for replicated rows
...
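Spelled out a bit more, a minimal sketch of that flag approach might look like this (table, column and function names are placeholders, and the replicated column would have to exist on every participating table):
ALTER TABLE my_table ADD COLUMN replicated boolean NOT NULL DEFAULT false;

CREATE TRIGGER my_sync_trigger
BEFORE INSERT ON my_table
FOR EACH ROW
WHEN (NEW.replicated = false) -- skip rows that arrived via replication
EXECUTE PROCEDURE my_sync_function();

-- my_sync_function() would then insert into the foreign table with the flag set,
-- so the trigger on the remote side does not start the replication again:
-- INSERT INTO pivot_my_table (id, payload, replicated) VALUES (NEW.id, NEW.payload, true);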
But I can see all kinds of concurrency issues with this in a multi-user environment.
Have you considered one of the proven replication solutions? Find a list in the manual here.

Related

How to lock a SELECT in PostgreSQL?

I am used to doing this in MySQL:
INSERT INTO ... SELECT ...
which would lock the table I SELECT from.
Now, I am trying to do something similar in PostgreSQL: I select a set of rows from a table and then insert some data into other tables based on those rows' values. I want to prevent working with outdated data, so I am wondering how I can lock a SELECT in PostgreSQL.
There is no need to explicitly lock anything. A SELECT statement will always see a consistent snapshot of the table, no matter how long it runs.
The result will be no different if you lock the table against concurrent modifications before starting the SELECT, but you will harm concurrency unnecessarily.
If you need several queries to see a consistent state of the database, start a transaction with the REPEATABLE READ isolation level. Then all statements in the transaction will see the same state of the database.
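A minimal sketch of that pattern, with made-up table and column names:
BEGIN ISOLATION LEVEL REPEATABLE READ;

-- Both statements see the exact same snapshot of source_rows.
SELECT id, amount FROM source_rows WHERE amount > 100;

INSERT INTO report (source_id, amount)
SELECT id, amount FROM source_rows WHERE amount > 100;

COMMIT;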

How to obtain an indefinite lock on database in Rails (postgres database) for QA purposes?

I'm trying to obtain an indefinite lock on my Postgresql database (specifically on a table called orders) for QA purposes. In short, I want to know if certain locks on a table prevent or indefinitely block database migrations for adding columns (I think ALTER TABLE grabs an ACCESS EXCLUSIVE LOCK).
My plan is to:
1. grab a table lock or a row lock on the orders table
2. run the migration to add a column (an ALTER TABLE statement that grabs an ACCESS EXCLUSIVE LOCK)
3. issue a read statement to see if (2) is blocked (the ACCESS EXCLUSIVE LOCK blocks reads, and so this would be a problem that I'm trying to QA).
How would one do this? How do I grab a row lock on a table called orders via the Rails Console? How else could I do this?
Does my plan make sense?
UPDATE
It turns out that transactions holding row-level locks do in fact block ALTER TABLE statements that need an ACCESS EXCLUSIVE lock, such as migrations that add columns. For example, when I run this code in one process:
Order.first.with_lock do
  binding.pry
end
It blocks my migration in another process that adds a column to the orders table. That migration's pending ACCESS EXCLUSIVE lock in turn blocks all reads of the orders table, causing problems for end users.
Why is this?
Let's say you're in a transaction, selecting rows from a table with various where clauses. Halfway through, some other transaction adds a column to that table. Now you are getting back more fields than you did previously. How is your application supposed to handle this?
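To reproduce this outside Rails, a rough sketch with three separate psql sessions (the added column name is made up):
-- Session 1: hold a row lock on orders and leave the transaction open.
BEGIN;
SELECT * FROM orders LIMIT 1 FOR UPDATE;

-- Session 2: waits behind the row lock above, because ALTER TABLE
-- needs an ACCESS EXCLUSIVE lock on the whole table.
ALTER TABLE orders ADD COLUMN qa_dummy integer;

-- Session 3: even a plain SELECT now queues behind the waiting ALTER TABLE.
SELECT count(*) FROM orders;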

Postgres Sequence out of sync

I'm running a multi-master setup with bucardo and postgres.
I'm finding that some of my table sequences are getting out of sync with each other. Particularly the auto-incremented id.
example:
db1 - table1
INSERT INTO distributors (did, dname) VALUES (DEFAULT, 'XYZ Widgets')
The id of the new row is 1
db2 - table1
INSERT INTO distributors (did, dname) VALUES (DEFAULT, 'XYZ Widgets')
The id of the new row is 1
The id of the new row on db2 should be 2, because bucardo has replicated the data from db1, but db2's auto increment is based on:
nextval('oauth_sessions_id_seq'::regclass)
And if we check the "oauth_sessions_id_seq" we see the last value as 0.
phew... Make sense?
Anyway, can I do any of the following?
Replicate the session tables with bucardo, so each DB's session is shared?
Manipulate the default auto-increment function above to take into account the max existing items in the table?
If you have any better ideas, please feel free to throw them in. If you have questions, just ask. Thanks for any help.
You are going to have to change your id generation method, because there is no Bucardo solution according to this comment in the FAQ.
Can Bucardo replicate DDL?
No, Bucardo relies on triggers, and Postgres does not yet provide DDL
triggers or triggers on its system tables.
Since Bucardo uses triggers, it cannot "see" the sequence changes, only the data in tables, which it replicates. Sequences are interesting objects that do not support triggers, but you can manually update them. I suppose you could add something like the code below before the INSERT, but there still might be issues.
SELECT setval('oauth_sessions_id_seq', (SELECT MAX(did) FROM distributors));
See this question for more information.
I am not fully up on all the issues involved, but you could perform the maximum calculation manually and do the insert operation in a re-try loop. I doubt it will work if you are actually doing inserts on both DBs and allowing Bucardo to replicate, but if you can guarantee that only one DB updates at a time, then you could try something like an UPSERT retry loop. See this post for more info. The "guts" of the loop might look like this:
INSERT INTO distributors (did, dname)
VALUES ((SELECT max(did)+1 FROM distributors), 'XYZ Widgets');
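Wrapped in a full retry loop, a rough sketch might look like this (it assumes did has a unique constraint, so a losing writer gets a unique_violation and simply tries again):
DO $$
BEGIN
    LOOP
        BEGIN
            INSERT INTO distributors (did, dname)
            VALUES ((SELECT coalesce(max(did), 0) + 1 FROM distributors), 'XYZ Widgets');
            EXIT;  -- insert succeeded, leave the loop
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- another session took the same id; loop and recompute
        END;
    END LOOP;
END;
$$;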
Irrespective of the DB (PostgreSQL, Oracle, etc.), a sequence is created for each table whose primary key is generated automatically.
Sequences go out of sync whenever a huge data import happens or someone manually modifies the table's sequence.
Solution: The only way to set the sequence right is to take the max value of the table's primary key and set the sequence's next value to it.
The query below lists all the sequences created in your DB schema:
SELECT c.relname FROM pg_class c WHERE c.relkind = 'S';
Find the current maximum of the primary key column, then set the sequence just past it:
SELECT MAX(the_primary_key) FROM the_table;
SELECT setval('the_primary_key_sequence', (SELECT MAX(the_primary_key) FROM the_table)+1);
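If you prefer not to hard-code the sequence name, a variant using pg_get_serial_sequence (same example names as above):
-- Reset the serial column's sequence so nextval returns max+1.
SELECT setval(pg_get_serial_sequence('the_table', 'the_primary_key'),
              coalesce((SELECT max(the_primary_key) FROM the_table), 0) + 1,
              false);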

INSERT statement that does not fire an INSERT trigger

I am using PostgreSQL 9.2 and I need to write an INSERT statement which copies data from table A to table B without firing the INSERT trigger defined on table B (maybe some sort of bulk insertion operation??).
On this specific table (table B) many INSERT, UPDATE and DELETE operations are executed. During each and every one of these executions, a trigger must fire.
I cannot temporarily disable the triggers, because standard, day-to-day DML operations still need them to fire.
Can anyone help me with the syntax for this non-trigger-firing INSERT statement?
Run your "privileged" inserts as a different user. That way your trigger can check the current user and exit if it shouldn't do anything.

wrapping postgresql commands in a transaction: truncate vs delete or upsert/merge

I am using the commands below in PostgreSQL 9.1.3 to move data from a temp staging table to a table used by a webapp (GeoServer), all in the same db, and then drop the temp table.
TRUNCATE table_foo;
INSERT INTO table_foo
SELECT * FROM table_temp;
DROP TABLE table_temp;
I want to wrap this in a transaction to allow for concurrency. The data set is small (less than 2000 rows) and truncating is faster than deleting.
What is the best way to run these commands in a transaction?
Is creating a function advisable, or writing an UPSERT/MERGE etc. in a CTE?
Would it be better to DELETE all rows and then bulk INSERT from the temp table, instead of TRUNCATE?
In Postgres, which can be rolled back: TRUNCATE or DELETE?
The temp table is delivered daily via an ETL scripted in arcpy; how could I automate the truncate/delete/bulk-insert parts within Postgres?
I am open to using PL/pgSQL or PL/Python (or whichever Python flavor is recommended for Postgres).
Currently I am manually executing the sql commands after the temp staging table is imported into my DB.
Both truncate and delete can be rolled back (which is clearly documented in the manual).
truncate - due to its nature - has some oddities regarding visibility.
See the manual for details: http://www.postgresql.org/docs/current/static/sql-truncate.html (the warning at the bottom)
If your application can live with the fact that table_foo is "empty" during that process, truncate is probably better (again, see the big red box in the manual for an explanation). If you don't want the application to notice, you need to use delete.
To run these statements in a transaction simply put them into one:
begin transaction;
delete from table_foo;
insert into table_foo select * from table_temp;
drop table table_temp;
commit;
Whether you do that in a function or not is up to you.
truncate/insert will be faster (than delete/insert) as that minimizes the amount of WAL generated.
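If you do put it in a function, a minimal sketch (reusing the table names from the question; the function name is made up) could be:
CREATE OR REPLACE FUNCTION refresh_table_foo()
RETURNS void AS
$$
BEGIN
    TRUNCATE table_foo;
    INSERT INTO table_foo SELECT * FROM table_temp;
    DROP TABLE table_temp;
END;
$$ LANGUAGE plpgsql;

-- the whole body runs inside the calling transaction:
SELECT refresh_table_foo();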