Writing scripts for PostgreSQL to update a database?

I need to write an update script that will check whether certain tables, indexes, etc. exist in the database and, if not, create them. I've been unable to figure out how to do these checks, as I keep getting "syntax error at IF" messages when I type them into a query window in pgAdmin.
Do I have to do something like write a stored procedure in the public schema that does these updates using PL/pgSQL and execute it to make the updates? Hopefully, I can just write a script that I can run without creating extra database objects to get the job done.

If you are on PostgreSQL 9.1 or later, you can use CREATE TABLE IF NOT EXISTS ...
On 9.0 you can wrap your IF condition code in a DO block: http://www.postgresql.org/docs/current/static/sql-do.html
For anything before that, you will have to write a function to achieve what you want.
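For example, a DO block can run the existence check and the DDL in one statement; this is just a sketch with made-up index, table, and column names:
DO $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM pg_indexes WHERE indexname = 'my_index') THEN
        CREATE INDEX my_index ON my_table (my_column);
    END IF;
END
$$;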

Have you looked into pg_tables?
select * from pg_tables;
This will return (among other things) the schemas and tables that exist in the database. Without knowing more of what you're looking for, this seems like a reasonable place to start.
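For a single object you can turn that into a boolean check, e.g. (schema and table names here are placeholders):
SELECT EXISTS (
    SELECT 1
    FROM pg_tables
    WHERE schemaname = 'public' AND tablename = 'my_table'
);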

Related

Using variables for schema and table names in a Redshift query

I want to be able to use variable names in Redshift that refer to my DB objects (like schema and table names). Something like...
SET my_schema="schema";
SET my_table="table";
SELECT * from #my_schema.#my_table;
But it looks like Redshift doesn't have such a feature. Is there any possible workaround to achieve this?
There are a few ways you can try to attack this. But first: trying to use a database engine for functions beyond querying the database is a waste of horsepower and the road to DB lock-in, so I'm going to focus on ways to do this before the SQL reaches the database.
The most complete way is to use a front-end system that clients connect to, and which in turn connects to the DB. The one I've used in the past is pgbouncer-rr, which pools connections to the DB but also allows the SQL to be modified before it is sent on. This will do what you want, but you will need a machine to run it on.
If you use the Redshift Data API, you could put a Lambda function in series which performs the SQL modifications you desire (but make sure you get your API permissions right). However, I expect it is unlikely that you are looking to move to an API access model.
Many benches (SQL client tools) support variable substitution, so simple replacements in the SQL can be done by the bench. However, this is very dependent on which bench you use and on having all users' benches configured correctly.
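For example, psql (which can also connect to Redshift) substitutes :variables on the client before the SQL is sent, so something like this works without any server-side support (names below are placeholders):
\set my_schema 'public'
\set my_table 'users'
SELECT * FROM :my_schema.:my_table;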
Bottom line - if you want something to modify your SQL, do it before it goes to Redshift.

PostgreSQL - implicit transactions analogue

I am using PostgreSQL 10 from RDS (AWS).
So note that I don't have full permissions to do whatever I want.
In PostgreSQL I have some functions written in PL/pgSQL.
From my experience, in these functions I cannot start/commit/rollback transactions. In a DO block I cannot do that either.
Is that correct? And what is the logic behind this? It seems PostgreSQL expects each function to be called in the context of an existing transaction. Right?
But what if I want every statement in my function to be executed in a separate (short) transaction i.e. to have a behavior something like AUTOCOMMIT = ON?
I found something which might be able to do that, but I am not sure it is relevant:
https://www.postgresql.org/docs/10/ecpg-sql-set-autocommit.html
Isn't there a standard way of doing this in Postgres without the need to download and install additional packages/extensions?
Again: I want every statement in my function to be executed in a separate (short) transaction i.e. to have a behavior something like AUTOCOMMIT = ON.
So I want something like this:
https://learn.microsoft.com/en-us/sql/t-sql/statements/set-implicit-transactions-transact-sql?view=sql-server-2017
All statements in a function run in the same transaction, and no plugin can change that.
You can use procedures from v11 on, but you still have to explicitly manage transactions then.
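For illustration, a procedure in v11 or later can commit between statements, but only because you ask it to explicitly (the names below are made up):
CREATE PROCEDURE nightly_cleanup()
LANGUAGE plpgsql
AS $$
BEGIN
    DELETE FROM sessions WHERE expires_at < now();
    COMMIT;   -- allowed in a procedure called with CALL, not in a function
    UPDATE accounts SET active = false WHERE last_login < now() - interval '1 year';
    COMMIT;
END;
$$;

CALL nightly_cleanup();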
I suspect that the best thing would be to run your functions on the database client, where you have autocommit automatically, rather than as a function in the database.
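For instance, a plain psql script run from the client commits after every statement by default, which is essentially the AUTOCOMMIT = ON behaviour you describe (table names are made up):
\set AUTOCOMMIT on
-- with autocommit (psql's default), each statement below runs in its own transaction
DELETE FROM sessions WHERE expires_at < now();
UPDATE accounts SET active = false WHERE last_login < now() - interval '1 year';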

Execute function returning data in remote PostgreSQL database from local PostgreSQL database

Postgres version: 9.3.4
I have the need to execute a function which resides in a remote database. The function returns a table of statistic data based on the parameters given.
I am in effect only mirroring the function in my local database to lock down access to this function using my database roles and grants.
I have found the following which seem to only provide table-based access.
http://www.postgresql.org/docs/9.3/static/postgres-fdw.html
http://multicorn.org/foreign-data-wrappers/#idsqlalchemy-foreign-data-wrapper
First question: is that correct or are there ways to use these libraries for non-table based operations?
I have found the following which seems to provide me with any SQL operation on the foreign database. The negative seems to be increased complexity and reduced performance due to manual connection and error handling.
http://www.postgresql.org/docs/9.3/static/dblink.html
Second question: are these assumptions correct, and are there any ways to bypass these concerns or libraries/samples one can begin from?
The FDW interface provides a way to write a library that lets a PostgreSQL database query any external data source as though it were a table. From that point of view, it could do what you want.
The built-in postgres_fdw driver, however, does not allow you to specify a function as a remote table.
You could write your own fdw driver, possibly using the multicorn library, or some other language. That is likely to be a bit of work though, and would have some specific disadvantages, in particular I don't know how you would pass parameters to the function.
dblink is probably going to be the easiest solution. It allows you to execute arbitrary SQL on the remote server, returning a set of records.
SELECT *
FROM dblink('dbname=mydb', 'SELECT * FROM thefunction(1,2,3)')
AS t1(col1 INTEGER, col2 INTEGER);
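If dblink is not yet installed in the local database, it ships as a contrib extension, so (assuming you have the privileges to install extensions) you may first need:
CREATE EXTENSION IF NOT EXISTS dblink;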
There are other potential solutions but they would all be more effort to set up.

In Postgres, is it possible to dynamically build an INSERT script from a table?

Similar to SQL script to create insert script, I need to generate a list of INSERTs from a table to load into another database (SQLite), like the dump command. I do this for sync purposes.
I have limitations because this will run on a cloud server without access to the filesystem, so I need to do this in the DB (I can do this in the app server; I'm asking if it is possible to do this in the DB directly).
In the app server, I load a datatable, walk its field names and data types and build an INSERT... I wonder if there is a way to do the same in the DB...
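Something like this rough sketch is what I imagine, although here the table and columns (a made-up items table with id and name) are hard-coded rather than discovered dynamically:
SELECT string_agg(
           format('INSERT INTO items (id, name) VALUES (%s, %L);', id, name),
           E'\n')
FROM items;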
I am not entirely sure whether this helps, but you can use a simple ETL tool like Pentaho Kettle. I used it once for a similar task and it did not take me more than 10 minutes. You can also schedule the jobs. I am not sure whether it is supported at the database level.

export/import all the information of a table

For a mandatory assignment in a DB2 class I'm asked to write a procedure to "export information about all xxx, delete all xxx and import the information again," where xxx is my table.
This procedure has to be as efficient as possible.
I'm quite stuck here; naively I see two options:
1) write a select * from xxx; drop ...; insert; using Python or something
2) use some export/import utility of DB2
But I could be totally wrong. Suggestions?
What I've noticed is that there are no integrity constraints.
You can do that via export/load/set integrity. I think that is the best way if you execute it on the server.
If you use Python, you will have to use an ODBC driver or similar to get the data, process it, etc.
If you use Python just to execute the commands, that is fine; in the end, it is just a call to the database.
If you execute the process on another machine, network usage goes up and performance goes down.
Using import is just like doing an "insert" per row in the file, which uses a lot of transaction log. The load command, by contrast, puts the data directly into the tablespace and then checks referential integrity, which is a faster process.
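A sketch of the export/load/set integrity sequence mentioned above, with file and table names as placeholders:
EXPORT TO xxx.del OF DEL SELECT * FROM xxx;
LOAD FROM xxx.del OF DEL REPLACE INTO xxx;
SET INTEGRITY FOR xxx IMMEDIATE CHECKED;   -- only needed if the table has constraints to re-check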
Finally, if you want to extract the information very fast, you can buy the IBM InfoSphere® Optim™ High Performance Unload for DB2 for Linux, UNIX and Windows
I have had a similar task before.
The solution is simple and sweet:
Do a simple export to CSV; once the data has been exported, the main thing is to empty the table with logging disabled and then load the data back into it.
-- 1. export the table to a delimited file
EXPORT TO <FileName>.CSV OF DEL SELECT * FROM <TableName>;
-- 2. empty the table and switch off logging for the current unit of work
ALTER TABLE <TableName> ACTIVATE NOT LOGGED INITIALLY WITH EMPTY TABLE;
-- 3. load the data straight back into the table
LOAD FROM "./<FileName>.CSV" OF DEL INSERT INTO <TableName>;