Unable to get dblink to work on read replica - postgresql

We have two databases on AWS RDS, OMS and SFMS, each with its own read replica. We use dblink in SFMS to fetch data from Table A in OMS. It works perfectly on the SFMS instance with the Master role, but we get an ERROR: could not establish connection on our read replica.
Here is how I have set up the dblink call:
SELECT * FROM dblink(
'dbname=<DB End Point> user=<username> password=<password>',
'SELECT id, <Other fields> from A') AS oms_A
(id int, <Remaining Schema>)
I can always create a materialized view on SFMS to get it to work. Is there some mistake I am making while setting up dblink for use on a read replica instance?

This works on Aiven's PostgreSQL service. Please check out aiven.io.
To set it up, you first need to create the extension on the master server with 'CREATE EXTENSION dblink;'
The foreign server definitions and user name mappings also have to be created on the master, which then replicates them to the read-only replicas.
Once those are set up, you can do things like SELECT dblink_connect('myconn', 'db2remote'); and SELECT * FROM dblink('myconn', 'SELECT * FROM foo') AS t(id int); on the read replica.
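Putting those steps together, a minimal sketch (the server name db2remote is taken from the example above; the host, dbname and credentials are placeholders for your own endpoint):
-- on the master; these objects replicate to the read replica
CREATE EXTENSION IF NOT EXISTS dblink;
CREATE SERVER db2remote FOREIGN DATA WRAPPER dblink_fdw
    OPTIONS (host '<oms-endpoint>', dbname '<oms-db>', port '5432');
CREATE USER MAPPING FOR PUBLIC SERVER db2remote
    OPTIONS (user '<username>', password '<password>');
-- on the read replica
SELECT dblink_connect('myconn', 'db2remote');
SELECT * FROM dblink('myconn', 'SELECT id FROM A') AS oms_a(id int);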
Hope this helps.

Related

PostgreSQL 11.16 cannot execute CREATE TABLE in a read-only transaction

I have a PostgreSQL database running on an Azure machine. When I try to create a table in a database, I get the error "cannot execute CREATE TABLE in a read-only transaction". The SQL query is being executed by a Python script using a SQLAlchemy engine. But I tried a similar query in pgAdmin installed on my machine and I get the same error. And I noticed that I do not have this issue if I connect to the database from a colleague's machine.
After further research, I found that if I execute SELECT pg_is_in_recovery(); in my pgAdmin, it returns true. And false on my colleague's machine.
Let me know if there is any way to correct this.
SELECT pg_is_in_recovery() returning true means the database is in recovery and therefore read-only.
Can you check your permissions?
You can check the postgresql.conf file and the default_transaction_read_only attribute,
or try this:
begin;
set transaction read write;
alter database exercises set default_transaction_read_only = off;
commit;
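A quick way to check both of these from the failing session itself, rather than digging into postgresql.conf (a sketch; run it in psql against the connection that shows the error):
SELECT pg_is_in_recovery();
SHOW default_transaction_read_only;
SHOW transaction_read_only;
If pg_is_in_recovery() returns true you are connected to a standby and no setting will make it writable; if only default_transaction_read_only is on, the ALTER DATABASE statement above is the fix.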
The issue was that our PostgreSQL machine is an HA machine, and that I was connecting to an IP address rather than the domain.

dbt Labs: access a remote Postgres DB server using dblink

I am new to dbt. I understand that dbt is used for transforming data once the raw data is available in your DWH. However, I was trying to see if anyone has used dbt to do an initial select/import of data into staging tables from a remote database server, using dblink in a select statement, where both the staging database server and the remote database server are Postgres databases.
Also, in my case the data volume isn't large.
I was able to do some testing and it looks like this is doable.
Below was the set-up:
The profiles.yml file was pointing to target DB server.
Below was the query used in the model targetTable.sql file.
{{ config(materialized='table') }}

SELECT i.col1,
       i.col2
FROM dblink('dbname=sourceDBName port=portNo hostaddr=sourceHostName user=userID password=****',
            'SELECT
                 a.col1,
                 a.col2
             FROM
                 (SELECT
                      col1, col2
                  FROM
                      public.tableName) a
            ')
     AS i (col1 integer, col2 varchar(20))
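One prerequisite the write-up above does not show: the dblink extension must already be installed in the target database that profiles.yml points at, otherwise the model fails with a "function dblink(unknown, unknown) does not exist" error. A one-line sketch, run once as a sufficiently privileged user on the target DB:
CREATE EXTENSION IF NOT EXISTS dblink;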

using pg_cron extension on Cloud SQL

I am trying to use pg_cron to schedule calls to stored procedures on several DBs in a Postgres Cloud SQL instance.
Unfortunately, it looks like pg_cron can only be created in the postgres DB.
When I try to use pg_cron on a DB other than postgres I get this message:
CREATE EXTENSION pg_cron;
ERROR: can only create extension in database postgres
Detail: Jobs must be scheduled from the database configured in
cron.database_name, since the pg_cron background worker reads job
descriptions from this database. Hint: Add cron.database_name =
'mydb' in postgresql.conf to use the current database.
Where: PL/pgSQL function inline_code_block line 4 at RAISE
Query = CREATE EXTENSION pg_cron;
... I don't think I have access to postgresql.conf in Cloud SQL ... is there another way?
Maybe I could use postgres_fdw to achieve my goal?
Thank you,
There's no need to edit any files. All you have to do is set the cloudsql.enable_pg_cron flag (see guide) and then create the extension in the postgres database.
You need to log onto the postgres database rather than the one you're using for your app. For me that's just replacing the name of my app database with 'postgres' e.g.
psql -U<username> -h<host ip> -p<port> postgres
Then simply run the create extension command and the cron.job table appears. Here's one I did a few minutes ago in our cloudsql database. I'm using the cloudsql proxy to access the remote db:
127.0.0.1:2345 admin#postgres=> create extension pg_cron;
CREATE EXTENSION
Time: 268.376 ms
127.0.0.1:2345 admin#postgres=> select * from cron.job;
jobid | schedule | command | nodename | nodeport | database | username | active | jobname
-------+----------+---------+----------+----------+----------+----------+--------+---------
(0 rows)
Time: 157.447 ms
Be careful to specify the correct target database when setting the schedule, otherwise pg_cron will assume you want the job to run in the postgres database.
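If the procedures live in a database other than postgres, here is a hedged sketch of scheduling against a named target database, assuming a pg_cron version that provides cron.schedule_in_database (1.4 or later); the job name, schedule, command and database below are placeholders:
SELECT cron.schedule_in_database(
    'nightly-refresh',           -- job name
    '0 3 * * *',                 -- cron schedule
    'CALL my_schema.my_proc()',  -- command to run
    'my_app_db'                  -- database the job runs in
);
On older pg_cron versions the usual workaround was to call cron.schedule() and then update the database column of the resulting cron.job row.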
.. I don't think I have access to postgresql.conf in Cloud SQL ...
Actually there is: you can use the gcloud patch command.
According to the pg_cron docs, you need to change two things in the conf file:
shared_preload_libraries = 'pg_cron'
cron.database_name = 'another_database' # optionally, to change the database where the pg_cron background worker expects its metadata tables to be created
Now, according to the gcloud documentation, you need to set two flags on your instance:
gcloud sql instances patch [my_instance] --database-flags=cloudsql.enable_pg_cron=on,cron.database_name=[my_name]
Careful: don't run the patch command twice, as the second run would erase your first setting. Put all your flag changes in one command.
You also might want to set cron.database_name in postgresql.conf (or as a flag in Cloud SQL):
cron.database_name = mydatabase
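Once the instance has restarted with the new flags, the effective settings can be verified from any session (a quick check):
SHOW shared_preload_libraries;
SHOW cron.database_name;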

Testing replication from Citus to RDS Aurora Postgres: no data arriving on the subscriber

I am testing replication from Citus (cloud hosted) to my RDS Aurora Postgres, following https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/AuroraPostgreSQL.Replication.Logical.html#AuroraPostgreSQL.Replication.Logical.Configure
Everything ran successfully, but no data is arriving on the subscriber. What may be wrong, and how do I troubleshoot it?
SELECT count(*) FROM LogicalReplicationTest;
Note: on the Citus DB I can see the publication.
But in my RDS instance I can't see the subscription in the list, even though the CREATE SUBSCRIPTION testsub CONNECTION ... command completed successfully.
I observed that when I run SELECT count(*) FROM LogicalReplicationTest; there is no data, only the column name with a lock symbol (as shown in the screenshot). Any idea what this lock symbol means, apart from indicating a read-only column? Why is it not listing the data from that table, which is in the publisher DB? Is there any permission I have to grant on the table when creating the publication?
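As a starting point for troubleshooting, the standard logical-replication views on both ends usually show where things stop; a sketch, with only the names taken from the question (testsub, LogicalReplicationTest) and everything else generic:
-- on the publisher (Citus)
SELECT * FROM pg_publication_tables;   -- is LogicalReplicationTest listed under your publication?
SELECT * FROM pg_stat_replication;     -- is the subscriber connected and streaming?
-- on the subscriber (Aurora)
SELECT * FROM pg_stat_subscription;    -- is the apply worker for testsub running and advancing?
SELECT srrelid::regclass, srsubstate FROM pg_subscription_rel;  -- 'r' = ready, 'i'/'d' = initial copy still pending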

how do I create a dblink connection in postgres that I can keep using

So, I can create a dblink connection, e.g.
select * from dblink( 'dbname=whatever host=the_host user=the_user password=my_password', 'select x, y, z from blah')
works fine.
I can even make what appears to be a persistent connection
select * from dblink_connect( 'dev', 'dbname=whatever host=the_host user=the_user password=my_password');
select * from dblink( 'dev', 'select x, y, z from blah' );
works fine.
For a while.
And then after a while - if I try to use dev again - it starts telling me "no open connection". But if I try to run the connect command again, it tells me a connection with that name already exists.
So how do I establish a named connection that I, and others, can just use directly forever afterwards without having to do any sort of connect/disconnect?
You can give dblink() the name of a foreign server, rather than the name of a connection.
create server dev foreign data wrapper dblink_fdw options (host 'thehost', dbname 'whatever');
create user mapping for public server dev options (user 'the_user', password 'my_password');
Then run your dblink query just as you currently are, using 'dev' as the name.
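For example (a sketch; the column types here are assumptions made to match the earlier query):
SELECT * FROM dblink('dev', 'select x, y, z from blah') AS t(x int, y text, z text);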
Note that this will increase the number of connections made; it is just that the system manages them so that you don't need to. So it is good for convenience, but not for performance.
The documentation says:
The connection will persist until closed or until the database session is ended.
So I suspect that you are using a connection pool, and:
- you may get a different database session for each transaction (but the dblink connection is open in only one of them)
- the connection pool may close the backend connections after a while, thereby also closing the dblink connection
If you want to use a feature like dblink, where sessions outlive the duration of a transaction, you need session level pooling.