Execute function returning data in remote PostgreSQL database from local PostgreSQL database - postgresql

Postgres version: 9.3.4
I have the need to execute a function which resides in a remote database. The function returns a table of statistic data based on the parameters given.
I am in effect only mirroring the function in my local database to lock down access to this function using my database roles and grants.
I have found the following which seem to only provide table-based access.
http://www.postgresql.org/docs/9.3/static/postgres-fdw.html
http://multicorn.org/foreign-data-wrappers/#idsqlalchemy-foreign-data-wrapper
First question: is that correct or are there ways to use these libraries for non-table based operations?
I have found the following which seems to provide me with any SQL operation on the foreign database. The negative seems to be increased complexity and reduced performance due to manual connection and error handling.
http://www.postgresql.org/docs/9.3/static/dblink.html
Second question: are these assumptions correct, and are there any ways to bypass these concerns or libraries/samples one can begin from?

The FDW (foreign data wrapper) interface lets you write a library that allows a PostgreSQL database to query any external data source as though it were a table. From that point of view, it could do what you want.
The built-in postgres_fdw driver, however, does not allow you to specify a function as a remote table.
You could write your own FDW driver, possibly using the Multicorn library (Python) or another language. That is likely to be a fair amount of work, though, and it has some specific disadvantages; in particular, I don't know how you would pass parameters to the function.
dblink is probably going to be the easiest solution. It allows you to execute arbitrary SQL on the remote server, returning a set of records.
SELECT *
FROM dblink('dbname=mydb', 'SELECT * FROM thefunction(1,2,3)')
AS t1(col1 INTEGER, col2 INTEGER);
There are other potential solutions but they would all be more effort to set up.
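Since the goal in the question is to mirror the remote function locally and control access through roles and grants, the dblink call can be wrapped in a local function. The following is only a sketch: the wrapper name remote_stats, the host remote.example.com, the remote user stats_reader, the local role stats_role and the two integer result columns are all hypothetical, and how the remote connection authenticates depends on your setup.

```sql
-- One-time setup in the local database (needs sufficient privileges).
CREATE EXTENSION IF NOT EXISTS dblink;

-- Hypothetical local wrapper around the remote function.
-- SECURITY DEFINER makes it run with the privileges of its owner,
-- so callers only need EXECUTE on the wrapper itself.
CREATE OR REPLACE FUNCTION remote_stats(p1 integer, p2 integer, p3 integer)
RETURNS TABLE (col1 integer, col2 integer)
LANGUAGE plpgsql
SECURITY DEFINER
AS $$
BEGIN
    RETURN QUERY
    SELECT t.c1, t.c2
    FROM dblink(
             -- Placeholder connection string; adjust host, dbname and user.
             'host=remote.example.com dbname=mydb user=stats_reader',
             -- %L quotes each parameter as a literal before it is sent.
             format('SELECT * FROM thefunction(%L, %L, %L)', p1, p2, p3)
         ) AS t(c1 integer, c2 integer);
END;
$$;

-- Lock the wrapper down to one role (role name is an assumption).
REVOKE ALL ON FUNCTION remote_stats(integer, integer, integer) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION remote_stats(integer, integer, integer) TO stats_role;
```

Members of stats_role could then run SELECT * FROM remote_stats(1, 2, 3); without ever seeing the connection string.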

Related

Is it possible to evaluate a Postgres expression without connecting to a database?

PostgreSQL has excellent support for evaluating JSONPath expressions against JSON data.
For example, this query returns true because the value of the nested field is indeed "foo".
select '{"header": {"nested": "foo"}}'::jsonb @? '$.header ? (@.nested == "foo")'
Notably this query does not reference any schemas or tables. Ideally, I would like to use this functionality of PostgreSQL without creating or connecting to a full database instance. Is it possible to run PostgreSQL in such a way that it doesn't have schemas or tables, but is still able to evaluate "standalone" queries?
Some other context on the project, we need to evaluate JSONPath expressions against JSON data in both a Postgres database and Python application. Unfortunately, Python does not have any JSONPath libraries that support enough of the spec to be useful to us.
"Ideally, I would like to use this functionality of PostgreSQL without creating or connecting to a full database instance."
Well, it is open source. You can always pull out the source code for this functionality you want and adapt it to compile by itself. But that seems like a large and annoying undertaking, and I probably wouldn't do it. And short of that, no.
Why do you need this? Are you worried about scalability or ease of installation or performance or what? If you are already using PostgreSQL anyway, firing up a dummy connection to just fire some queries at the JSONB engine doesn't seem too hard.

Using variables for schema and table names in a Redshift query

I want to be able to use the variable names in Redshift which refers to my DB Objects (like schema and table names). Something like...
SET my_schema="schema";
SET my_table="table";
SELECT * from #my_schema.#my_table;
But looks like Redshift doesn't have such feature. Is there any workaround possible to achieve this?
There are a few ways you can try to attack this. But first: using a database engine for functions beyond querying the database is a waste of horsepower and the road to DB lock-in. So I'm going to focus on ways to do this before the database.
The most complete way is to use a front-end system that clients connect to, which in turn connects to the DB. The one I've used in the past is pgbouncer-rr, which pools connections to the DB but also allows the SQL to be modified before it is sent on. This will do what you want, but you will need a machine to run it on.
If you use Redshift data-api you could put a Lambda function in series which performs the SQL modifications you desire (but make sure you get your API permissions right). However, I expect it is unlikely that you are looking to move to an API access model.
Many SQL workbenches support variable substitution, so simple replacements in the SQL can be done by the workbench. However, this depends heavily on which workbench you use and on having every user's workbench configured correctly.
Bottom line: if you want something to modify your SQL, do it before it goes to Redshift.
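As one instance of client-side substitution, the psql command-line client (which can connect to Redshift) supports variables that are expanded before the statement is sent. This is a minimal sketch assuming psql is the client in use; the schema name public and the table name events are placeholders.

```sql
-- psql meta-commands: the substitution happens in the client, not in Redshift.
\set my_schema 'public'
\set my_table 'events'

-- :"name" interpolates the variable quoted as an SQL identifier.
SELECT * FROM :"my_schema".:"my_table" LIMIT 10;
```

Other clients have their own variable syntax, so a script written this way only works in psql.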

In DBeaver, how can I run an SQL union query from two different connections..?

We recently migrated a large DB2 database to a new server. It got trimmed a lot in the migration, for instance 10 years of data chopped down to 3, to name a few. But now I find that I need certain data from the old server until after tax season.
How can I run a UNION query in DBeaver that pulls data from two different connections..? What's the proper syntax of the table identifiers in the FROM and JOIN keywords..?
I use DBeaver for my regular SQL work, and I cannot determine how to span a UNION query across two different connections. However, I also use Microsoft Access, and I easily did it there with two Pass-Through queries that are fed to a native Microsoft Access union query.
But how to do it in DBeaver..? I can't understand how to use two connections at the same time.
For instance, here are my connections:
And I need something like this...
SELECT *
FROM ASP7.F_CERTOB.LDHIST
UNION
SELECT *
FROM OLD.VIPDTAB.LDHIST
...but I get the following error, to which I say "No kidding! That's what I want!", lol... =-)
SQL Error [56023]: [SQL0512] Statement references objects in multiple databases.
How can this be done..?
This is not a feature of DBeaver. DBeaver can only access the data that the DB gives it, and this is restricted to a single connection at a time (save for import/export operations). This feature is being considered for development, so keep an eye out for this answer to be outdated sometime in 2019.
You can export data from your OLD database and import it into ASP7 using DBeaver (although vendor tools are typically more efficient for this). Then you can do your union as suggested.
Many RDBMS offer a way to logically access foreign databases as if they were local, in which case DBeaver would then be able to access the data from the OLD database (as far as DBeaver is concerned in this situation, all the data is coming from a single connection). In Postgres, for example, one can use a foreign data wrapper to access foreign data.
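To make the Postgres case concrete, here is a rough sketch using postgres_fdw. It assumes both databases are PostgreSQL (which is not the DB2 setup in the question), and the server name, host, credentials, table names and columns are all invented for illustration.

```sql
-- Run in the "new" database; requires privileges to create extensions.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Hypothetical definition of the old server.
CREATE SERVER old_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'old.example.com', port '5432', dbname 'olddb');

-- Placeholder credentials used when connecting to the old server.
CREATE USER MAPPING FOR CURRENT_USER
    SERVER old_server
    OPTIONS (user 'report_reader', password 'secret');

-- Local stand-in for the remote table; the columns are assumptions.
CREATE FOREIGN TABLE ldhist_old (
    id        integer,
    posted_on date,
    amount    numeric(12,2)
)
    SERVER old_server
    OPTIONS (schema_name 'public', table_name 'ldhist');

-- From the client's point of view this is now a single-connection query.
SELECT id, posted_on, amount FROM ldhist
UNION
SELECT id, posted_on, amount FROM ldhist_old;
```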
I'm not familiar with DB2, but a quick Google search suggests that you can set up foreign connections within DB2 using nicknames or three-part-names.
If you check this github issue:
https://github.com/dbeaver/dbeaver/issues/3605
The way to solve this is to create a task and execute it in different connections:
https://github.com/dbeaver/dbeaver/issues/3605#issuecomment-590405154

Upsert in Amazon RedShift without Function or Stored Procedures

As there is no support for user-defined functions or stored procedures in Redshift, how can I achieve an UPSERT mechanism in Redshift, which uses ParAccel, a PostgreSQL 8.0.2 fork?
Currently, I'm trying to achieve an UPSERT mechanism using an IF...THEN...ELSE statement,
e.g:-
IF NOT EXISTS(SELECT...WHERE(SELECT..))
THEN INSERT INTO tblABC() SELECT... FROM tblXYZ
ELSE UPDATE tblABC SET.,.,.,. FROM tblXYZ WHERE...
which gives me an error, as I'm writing this code on its own without wrapping it in a function or stored procedure.
So, is there any way to achieve an UPSERT?
Thanks
You should probably read this article on upsert by depesz. You can't rely on SERIALIZABLE for this since, AFAIK, ParAccel doesn't support full serializability like Pg 9.1+ does. As outlined in that post, you can't really do what you want purely in the DB anyway.
The short version is that even on current PostgreSQL versions that support writable CTEs it's still hard. On an 8.0 based ParAccel, you're pretty much out of luck.
I'd do a staged merge. COPY the new data to a temporary table on the server, LOCK the destination table, then do an UPDATE ... FROM followed by an INSERT INTO ... SELECT. Doing the data uploads in big chunks and locking the table for the upserts is reasonably in keeping with how Redshift is used anyway.
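The staged merge described above might look roughly like this. It is a sketch, not a tested recipe: tblABC is the destination table from the question, while the stage table, the id and val columns, the S3 path and the IAM role are placeholders.

```sql
BEGIN;

-- Stage the incoming rows in a temporary table shaped like the target.
CREATE TEMP TABLE stage (LIKE tblABC);

-- Placeholder source and credentials; adjust to your own bucket and role.
COPY stage
FROM 's3://example-bucket/incoming/'
IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleCopyRole'
FORMAT AS CSV;

-- Block concurrent writers for the duration of the merge.
LOCK tblABC;

-- Step 1: update the rows that already exist in the target.
UPDATE tblABC
SET val = s.val
FROM stage s
WHERE tblABC.id = s.id;

-- Step 2: insert the rows that are still missing from the target.
INSERT INTO tblABC
SELECT s.*
FROM stage s
LEFT JOIN tblABC t ON t.id = s.id
WHERE t.id IS NULL;

COMMIT;
```

Rows touched by the UPDATE already have a matching id, so the anti-join in the INSERT skips them and only new keys are added.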
Another approach is to externally co-ordinate the upserts via something local to your application cluster. Have all your tools communicate via an external tool where they take an "insert-intent lock" before doing an insert. You want a distributed locking tool appropriate to your system. If everything's running inside one application server, it might be as simple as a synchronized singleton object.

Writing scripts for PostgreSQL to update database?

I need to write an update script that will check to see if certain tables, indexes, etc. exist in the database, and if not, create them. I've been unable to figure out how to do these checks, as I keep getting Syntax Error at IF messages when I type them into a query window in PgAdmin.
Do I have to do something like write a stored procedure in the public schema that does these updates using Pl/pgSQL and execute it to make the updates? Hopefully, I can just write a script that I can run without creating extra database objects to get the job done.
If you are on PostgreSQL 9.1 or later, you can use CREATE TABLE IF NOT EXISTS ...
On 9.0 you can wrap your IF condition code into a DO block: http://www.postgresql.org/docs/current/static/sql-do.html
For anything before that, you will have to write a function to achieve what you want.
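Putting both options together, an update script could look something like the sketch below. The table and index names are made up, and the index check assumes the objects live in the public schema; CREATE INDEX IF NOT EXISTS only arrived in PostgreSQL 9.5, which is why the index goes through a DO block here.

```sql
-- Tables: IF NOT EXISTS is available from 9.1.
CREATE TABLE IF NOT EXISTS app_settings (
    name  text PRIMARY KEY,
    value text
);

-- Indexes: check the catalog inside an anonymous DO block (9.0+).
DO $$
BEGIN
    IF NOT EXISTS (
        SELECT 1
        FROM pg_indexes
        WHERE schemaname = 'public'
          AND indexname  = 'app_settings_value_idx'
    ) THEN
        CREATE INDEX app_settings_value_idx ON app_settings (value);
    END IF;
END;
$$;
```

The script can be run repeatedly from a query window without creating any helper functions.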
Have you looked into pg_tables?
select * from pg_tables;
This will return (among other things) the schemas and tables that exist in the database. Without knowing more of what you're looking for, this seems like a reasonable place to start.