PostgreSQL function to dynamically reshape and create tables in a loop

I am pretty fresh to PostgreSQL, so please be kind.
I am pretty sure that my problem is that I am mixing plain and dynamic SQL. I have read the relevant documentation but I am not experienced enough to see where I have gone wrong (hoping that my issue is not something more fundamental).
Currently the script is failing with a Query execution error:
SQL Error [42601]: ERROR: syntax error at or near "CREATE"
I intend to use this function to unpivot >9,000 tables (for analysis purposes); fortunately all tables have the same structure.
CREATE OR REPLACE FUNCTION all_schemaTables_unpivot(_schemaName text, _tableName text)
RETURNS void AS
$BODY$
DECLARE
_tbl record;
BEGIN
FOR _tbl IN
SELECT
quote_ident(schemaname) || '.' || quote_ident(tablename) AS fName,
quote_ident(tablename) AS tName
FROM pg_tables
WHERE schemaname = _schemaName
AND tablename LIKE _tableName
LOOP
EXECUTE 'CREATE TABLE ' || _tbl.tName || '_up AS
SELECT region_id, key AS sequential_id, value
FROM (SELECT row_to_json(t.*) AS line, region_id
FROM ' || _tbl.tName || ' AS t) AS r
JOIN LATERAL json_each_text(r.line) ON (key <> "region_id")';
END LOOP;
END;
$BODY$ LANGUAGE plpgsql;
Thanks in advance.

To get the script to work I just needed to fix two things:
I originally did not put spaces around the concatenated pieces of the query (thanks @KaushikNayak).
I quoted my dynamic variables incorrectly (see the added 'quote_ident(_tbl.newTableName)'). People more expert in PostgreSQL than I am will surely have a much cleaner approach, but at least it works!
Moral of the story: sometimes the fix is staring you in the face, but you have been staring at the script for too long. Leave it for a while and the answer becomes clear.
CREATE OR REPLACE FUNCTION
all_schemaTables_unpivot(_schemaName text, _tableName text)
RETURNS void AS
$BODY$
DECLARE
_tbl record;
BEGIN
FOR _tbl IN
SELECT
quote_ident(schemaname) || '.' || quote_ident(tablename) AS fullNamePath,
tablename::text || '_up' AS newTableName -- left raw here; quote_ident() is applied once, in the EXECUTE below
FROM pg_tables
WHERE schemaname = _schemaName
AND tablename LIKE _tableName
LOOP
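-- 'region_id' must be a string literal here (doubled single quotes inside the dynamic SQL);
-- in double quotes it would be read as the column region_id.
EXECUTE 'CREATE TABLE '|| quote_ident(_tbl.newTableName) ||' AS
SELECT region_id, key AS sequential_id, value
FROM (SELECT row_to_json(t.*) AS line, region_id
FROM '|| _tbl.fullNamePath ||' AS t) AS r
JOIN LATERAL json_each_text(r.line) ON (key <> ''region_id'');';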
END LOOP;
END;
$BODY$ LANGUAGE plpgsql;
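For comparison, here is a minimal sketch of the same loop written with format(), which handles identifier and literal quoting in one place. The function and parameter names (all_schematables_unpivot_fmt, _schema_name, _table_pattern) are made up for illustration, and unlike the version above it creates each "_up" table in the source table's schema, which is an assumption about what is wanted.

```sql
CREATE OR REPLACE FUNCTION all_schematables_unpivot_fmt(_schema_name text, _table_pattern text)
RETURNS void AS
$BODY$
DECLARE
    _tbl record;
BEGIN
    FOR _tbl IN
        SELECT schemaname, tablename
        FROM pg_tables
        WHERE schemaname = _schema_name
          AND tablename LIKE _table_pattern
    LOOP
        -- %I quotes identifiers, %L quotes string literals
        EXECUTE format(
            'CREATE TABLE %I.%I AS
             SELECT r.region_id, j.key AS sequential_id, j.value
             FROM (SELECT row_to_json(t.*) AS line, t.region_id FROM %I.%I AS t) AS r
             JOIN LATERAL json_each_text(r.line) AS j ON j.key <> %L',
            _tbl.schemaname, _tbl.tablename || '_up',
            _tbl.schemaname, _tbl.tablename,
            'region_id');
    END LOOP;
END;
$BODY$ LANGUAGE plpgsql;
```

Because the raw names are passed as arguments, nothing is quoted twice and there are no doubled single quotes to keep track of.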

Related

Syntax error when using EXECUTE command in PostgreSQL function

I'm trying to write a function which executes a dynamically prepared SQL query and returns the result as a table.
I was referring to an SO answer that said to use language plpgsql, but even after using it I'm getting the same syntax error.
Below is the function code provided.
CREATE OR REPLACE FUNCTION public.execute_test(
ddsmappingids text,
totalnumberofrecords bigint,
skiprecords bigint,
pagesize integer,
cid bigint)
RETURNS TABLE(sampletime timestamp without time zone, jsonstring text, rowscount bigint)
LANGUAGE 'plpgsql'
COST 100
VOLATILE PARALLEL UNSAFE
ROWS 1000
AS $BODY$
DECLARE
table_names text:='dynamictable';
sql_query text;
dsids_array int[];
sub_query text;
BEGIN
if cid>=2 then
select 'dynamictable_'||cid into table_names;
end if;
select string_to_array(ddsmappingids, ',')::int[] into dsids_array;
sub_query:='SELECT * , COUNT(*) over() AS row_count FROM (SELECT start_time, cast(jsonb_object_agg(vw.dataseries_name, ROUND(CAST(o.dbl_value as numeric), 4)) as text) AS DataSeriesValue
FROM ' || quote_ident(table_names) ||' o
join vwdds vw on o.ddmid=vw.id    
where ddmid in (' || array_to_string(dsids_array,',') || ')
and o.dbl_value is not null
GROUP BY start_time
ORDER BY start_time desc
limit ' || totalnumberofrecords || ') t offset '|| skiprecords || ' rows fetch next ' || pagesize || ' rows only;';
RAISE NOTICE 'Temporary table created';
RETURN QUERY Execute sub_query;
END
$BODY$;
Running the function fails with a syntax error at the EXECUTE command. When I try a simple query it works fine, but the query in the code snippet above gives the syntax error.
Please help me see what I'm doing wrong.

How to convert a PL/PgSQL procedure into a dynamic one?

I am trying to write a plpgsql procedure to perform spatial tiling of a postGIS table. I can perform the operation successfully using the following procedure in which the table names are hardcoded. The procedure loops through the tiles in tile_table and for each tile clips the area_table and inserts it into split_table.
CREATE OR REPLACE PROCEDURE splitbytile()
AS $$
DECLARE
tile RECORD;
BEGIN
FOR tile IN
SELECT tid, geom FROM test_tiles ORDER BY tid
LOOP
INSERT INTO split_table (id, areaname, ttid, geom)
SELECT id, areaname, tile.tid,
CASE WHEN st_within(base.geom, tile.geom) THEN st_multi(base.geom)
ELSE st_multi(st_intersection(base.geom, tile.geom)) END as geom
FROM area_table as base
WHERE st_intersects(base.geom, tile.geom);
COMMIT;
END LOOP;
END;
$$ LANGUAGE 'plpgsql';
Having tested this successfully, now I need to convert it to a dynamic procedure where I can provide the table names as parameters. I tried the following partial conversion, using format() for inside of loop:
CREATE OR REPLACE PROCEDURE splitbytile(in_table text, grid_table text, split_table text)
AS $$
DECLARE
tile RECORD;
BEGIN
FOR tile IN
EXECUTE format('SELECT tid, geom FROM %I ORDER BY tid', grid_table)
LOOP
EXECUTE
FORMAT(
'INSERT INTO %1$I (id, areaname, ttid, geom)
SELECT id, areaname, tile.tid,
CASE WHEN st_within(base.geom, tile.geom) THEN st_multi(base.geom)
ELSE st_multi(st_intersection(base.geom, tile.geom)) END as geom
FROM %2$I as base
WHERE st_intersects(base.geom, tile.geom)', split_table, in_table
);
COMMIT;
END LOOP;
END;
$$ LANGUAGE 'plpgsql';
But it throws an error
missing FROM-clause entry for table "tile"
So, how can I convert the procedure to a dynamic one? More specifically, how can I use the record data type (tile) returned by the for loop inside the loop? Note that it works when format is not used.
You can use EXECUTE ... USING to supply parameters to a dynamic query:
EXECUTE
format(
'SELECT r FROM %I WHERE c = $1.val',
table_name
)
INTO result_var
USING record_var;
The first argument to USING will be used for $1, the second for $2 and so on.
See the documentation for details.
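Applied to the procedure from the question, that could look roughly like the sketch below. It assumes the same table layout as the question and passes the two record fields separately as $1 and $2 instead of the whole record; the identifiers still go through format() with %I.

```sql
CREATE OR REPLACE PROCEDURE splitbytile(in_table text, grid_table text, split_table text)
AS $$
DECLARE
    tile record;
BEGIN
    FOR tile IN
        EXECUTE format('SELECT tid, geom FROM %I ORDER BY tid', grid_table)
    LOOP
        -- $1 and $2 are filled from the USING list; %1$I and %2$I are the table names
        EXECUTE format(
            'INSERT INTO %1$I (id, areaname, ttid, geom)
             SELECT id, areaname, $1,
                    CASE WHEN st_within(base.geom, $2) THEN st_multi(base.geom)
                         ELSE st_multi(st_intersection(base.geom, $2)) END AS geom
             FROM %2$I AS base
             WHERE st_intersects(base.geom, $2)',
            split_table, in_table)
        USING tile.tid, tile.geom;
        COMMIT;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

The dynamic statement cannot see PL/pgSQL variables such as tile, which is why the original attempt failed with the missing FROM-clause error; values have to come in through USING.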
Personally I use a somewhat different way to build dynamic functions: concatenating the query text and then executing it. You can also do it like this.
CREATE OR REPLACE FUNCTION splitbytile()
RETURNS void AS $function$
DECLARE
result1 text;
query1 text;
query2 text;
table_name text := 'test_tiles';
msi text := '+7 9912 231';
_body text := 'Hello world';
code text := 'code_name';
kod_type integer := 1; -- placeholder value; not defined in the original answer
_operator_id integer := 2;
time_now timestamp := now(); -- assumed to be the current time
BEGIN
-- quote_ident() and quote_literal() escape the concatenated pieces
query1 := 'SELECT msisdn FROM ' || quote_ident(table_name) || ' WHERE msisdn = ' || quote_literal(msi);
query2 := 'INSERT INTO ' || quote_ident(table_name) || ' (msisdn, usage, body, pr_code, status, sent_date, code_type, operator_id)
VALUES (' || quote_literal(msi) || ', true, ' || quote_literal(_body) || ', ' || quote_literal(code) || ', false, ' || quote_literal(time_now) || ', ' || kod_type || ', ' || _operator_id || ')';
EXECUTE query1 INTO result1;
EXECUTE query2;
END;
$function$ LANGUAGE plpgsql;
You just build your query as text, and then you can execute it anywhere you want, for example after checking the value of result1 inside an IF statement.

SAS to PostgreSQL(PADB) code - summing field if they exists

I'm having a challenge with a piece of code from SAS that I need to convert to SQL.
Usually I'm very good at this, but right now I'm facing a new challenge; so far all my ideas to resolve it are failing and I'm not finding the right way to do it.
I need to be able to pick up fields dynamically for this request: if a field has a certain pattern in its name, I need to sum those fields.
My version of PostgreSQL is 8.0.2, PADB 5.3.3.1 78560.
So the table may or may not have a field like bas_txn_03cibc_vcl.
I wrote a function that should output ' ' as bas_txn_03cibc_vcl when the field is not found in the information_schema table and use bas_txn_03cibc_vcl if found.
But when I execute the command I get the error that UDF cannot be used on PADB tables.
"ERROR: XX000: User-defined SQL language function "check_if_field_exists(character varying,character varying,character varying)" cannot be used in a query that references PADB tables."
Right now I'm building a new approach using stored procedure but it will limit the use case. Any other idea on how I can select field dynamically?
Function:
CREATE OR REPLACE FUNCTION check_if_field_exists(_schm text, _tbl text, _field text)
RETURNS text AS
$BODY$
DECLARE
_output_ text:= '' as _field;
BEGIN
EXECUTE 'SELECT column_name into : _output_ FROM rdwaeprd.information_schema.columns
where table_schema='''|| _schm||'''
and table_name='''|| _tbl||'''
and column_name='''|| _field||'''
order by table_name,column_name;';
RETURN _output_;
END
$BODY$
LANGUAGE PLPGSQL;
and then I would use it like this
select indiv_id,ae_psamson.check_if_field_exists('ae_psamson','activ_cc', 'tot_txn_03AMX_AMXE') ,tot_txn_03AMX_AMXD
from activ_cc
group by indiv_id,tot_txn_03AMX_AMXD;
Where the function would return either '' as tot_txn_03AMX_AMXE or simply tot_txn_03AMX_AMXE. The idea is to make the query not return an error if the field does not exist.
Like I said I need a new function or approach as this one is not working...
I managed to write a function that makes it work!
Basically, one of the issues was that information_schema uses functions that are not supported in a UDF.
This solution works fine:
CREATE OR REPLACE FUNCTION check_if_field_exists(_schm text, _tbl text, _field text)
RETURNS varchar(55) AS
$BODY$
DECLARE
_output_ varchar(55) :=' 0 as '|| _field;
-- name := (SELECT t.name from test_table t where t.id = x);
BEGIN
EXECUTE 'drop table if exists col_name';
EXECUTE 'create table col_name as SELECT att.attname::character varying(128) AS colname
FROM pg_class cl, pg_namespace ns, pg_attribute att
WHERE cl.relnamespace = ns.oid AND cl.oid = att.attrelid AND ns.nspname='''|| _schm ||'''
and cl.relname='''|| _tbl ||'''
and colname like '''|| _field||''''; -- INTO _output_;
select colname from col_name into _output_ ;
if _output_ is null then
_output_ :=' 0 as '|| _field;
end if;
RETURN _output_ ;
END
$BODY$
LANGUAGE PLPGSQL;

SELECTing commands into a temp table to EXECUTE later in PostgreSQL

For some fancy database maintenance for my developer database I'd like to be able to use queries to generate commands to alter the database. The thing is: I'm a complete greenhorn to PostgreSQL. I've made my attempt but have failed colorfully.
So in the end, I would like to have a table with a single column and each row would be a command (or group of commands, depending on the case) that I would think would look something like this...
DO $$
DECLARE
command_entry RECORD;
BEGIN
FOR command_entry IN SELECT * FROM list_of_commands
LOOP
EXECUTE command_entry;
END LOOP;
END;
$$;
Where the table list_of_commands could be populated with something like the following (which in this example would remove all tables from the public schema)...
CREATE TEMP TABLE list_of_commands AS
SELECT 'drop table if exists "' || tablename || '" cascade;'
FROM pg_tables
WHERE schemaname = 'public';
However, with this I get the following error...
ERROR: syntax error at or near ""drop table if exists ""dummy_table"" cascade;""
LINE 1: ("drop table if exists ""dummy_table"" cascade;")
I assume this is a matter of escaping characters, but I'm not entirely sure how to fit that into either A) the population of the table or B) the execution of each row. Does anyone know what I could do to achieve the desired result?
The command_entry variable is of type record while the EXECUTE command expects a string. What is apparently happening is that PostgreSQL turns the record into a double-quoted string, but that messes up your command. Also, your temp table does not use a column name, making things a bit awkward to work with (the column name becomes ?column?), so change both as follows:
CREATE TEMP TABLE list_of_commands AS
SELECT 'drop table if exists public.' || quote_ident(tablename) || ' cascade' AS cmd
FROM pg_tables
WHERE schemaname = 'public';
DO $$
DECLARE
command_entry varchar;
BEGIN
FOR command_entry IN SELECT cmd FROM list_of_commands
LOOP
EXECUTE command_entry;
END LOOP;
END;
$$;
But seeing that you do all of this at session level (temp table, anonymous code block), why not write a stored procedure that performs all of this housekeeping when you are ready to do spring cleaning?
CREATE FUNCTION cleanup() RETURNS void AS $$
DECLARE
tbl text; -- loop variable; PL/pgSQL requires it to be declared
BEGIN
FOR tbl IN SELECT tablename FROM pg_tables WHERE schemaname = 'public'
LOOP
EXECUTE 'DROP TABLE IF EXISTS ' || quote_ident(tbl) || ' CASCADE';
END LOOP;
-- More housekeeping jobs
END;
$$ LANGUAGE plpgsql;
This saves a lot of typing: SELECT cleanup();. Any other housekeeping jobs you have you simply add to the stored procedure.
I had trouble with Patrick's answer, so here is an updated version for PostgreSQL 10.
CREATE FUNCTION droptables(sn varchar) RETURNS void AS $$
DECLARE
tbl varchar;
BEGIN
FOR tbl IN SELECT tablename FROM pg_tables WHERE schemaname = sn
LOOP
-- schema-qualify the table so the drop does not depend on search_path
EXECUTE format('DROP TABLE IF EXISTS %I.%I CASCADE', sn, tbl);
END LOOP;
END;
$$ LANGUAGE plpgsql;
And then "SELECT droptables('public');".

How to add column if not exists on PostgreSQL?

The question is simple: how do I add column y to table x, but only when column y doesn't exist? The only solution I found here shows how to check whether a column exists.
SELECT column_name
FROM information_schema.columns
WHERE table_name='x' and column_name='y';
With Postgres 9.6 this can be done using the option if not exists
ALTER TABLE table_name ADD COLUMN IF NOT EXISTS column_name INTEGER;
Here's a short-and-sweet version using the "DO" statement:
DO $$
BEGIN
BEGIN
ALTER TABLE <table_name> ADD COLUMN <column_name> <column_type>;
EXCEPTION
WHEN duplicate_column THEN RAISE NOTICE 'column <column_name> already exists in <table_name>.';
END;
END;
$$
You can't pass these as parameters, you'll need to do variable substitution in the string on the client side, but this is a self contained query that only emits a message if the column already exists, adds if it doesn't and will continue to fail on other errors (like an invalid data type).
I don't recommend doing ANY of these methods if these are random strings coming from external sources. No matter what method you use (client-side or server-side dynamic strings executed as queries), it would be a recipe for disaster as it opens you to SQL injection attacks.
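If the names are under your control, one way to avoid pasting them into several places on the client side is to keep them in variables inside the DO block and build the statement with format(). This is only a sketch: the values my_table, missing_col and integer are hypothetical, and the type is inserted with %s without any escaping, so it must come from a trusted source.

```sql
DO $$
DECLARE
    -- Hypothetical names, chosen only for illustration.
    _tbl  text := 'my_table';
    _col  text := 'missing_col';
    _type text := 'integer';
BEGIN
    BEGIN
        -- %I quotes the identifiers; %s inserts the type name as-is
        EXECUTE format('ALTER TABLE %I ADD COLUMN %I %s', _tbl, _col, _type);
    EXCEPTION
        WHEN duplicate_column THEN
            RAISE NOTICE 'column % already exists in %.', _col, _tbl;
    END;
END;
$$;
```

The DO statement itself still takes no parameters, so the three values have to be edited in the DECLARE section.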
Postgres 9.6 added ALTER TABLE tbl ADD COLUMN IF NOT EXISTS column_name.
So this is mostly outdated now. You might use it in older versions, or a variation to check for more than just the column name.
CREATE OR REPLACE function f_add_col(_tbl regclass, _col text, _type regtype)
RETURNS bool
LANGUAGE plpgsql AS
$func$
BEGIN
IF EXISTS (SELECT FROM pg_attribute
WHERE attrelid = _tbl
AND attname = _col
AND NOT attisdropped) THEN
RETURN false;
ELSE
EXECUTE format('ALTER TABLE %s ADD COLUMN %I %s', _tbl, _col, _type);
RETURN true;
END IF;
END
$func$;
Call:
SELECT f_add_col('public.kat', 'pfad1', 'int');
Returns true on success, else false (column already exists).
Raises an exception for invalid table or type name.
Why another version?
This could be done with a DO statement, but DO statements cannot return anything. And if it's for repeated use, I would create a function.
I use the object identifier types regclass and regtype for _tbl and _type which a) prevents SQL injection and b) checks validity of both immediately (cheapest possible way). The column name _col has still to be sanitized for EXECUTE with quote_ident(). See:
Table name as a PostgreSQL function parameter
format() requires Postgres 9.1+. For older versions concatenate manually:
EXECUTE 'ALTER TABLE ' || _tbl || ' ADD COLUMN ' || quote_ident(_col) || ' ' || _type;
You can schema-qualify your table name, but you don't have to.
You can double-quote the identifiers in the function call to preserve camel-case and reserved words (but you shouldn't use any of this anyway).
I query pg_catalog instead of the information_schema. Detailed explanation:
How to check if a table exists in a given schema
Blocks containing an EXCEPTION clause are substantially slower.
This is simpler and faster. The manual:
Tip
A block containing an EXCEPTION clause is significantly more
expensive to enter and exit than a block without one.
Therefore, don't use EXCEPTION without need.
The following SELECT query returns true/false, using the EXISTS() function.
EXISTS(): The argument of EXISTS is an arbitrary SELECT statement, or
subquery. The subquery is evaluated to determine whether it returns
any rows. If it returns at least one row, the result of EXISTS is
"true"; if the subquery returns no rows, the result of EXISTS is
"false"
SELECT EXISTS(SELECT column_name
FROM information_schema.columns
WHERE table_schema = 'public'
AND table_name = 'x'
AND column_name = 'y');
and use the following DO block to alter your table
DO
$$
BEGIN
IF NOT EXISTS (SELECT column_name
FROM information_schema.columns
WHERE table_schema = 'public'
AND table_name = 'x'
AND column_name = 'y') THEN
ALTER TABLE x ADD COLUMN y int DEFAULT NULL;
ELSE
RAISE NOTICE 'Already exists';
END IF;
END
$$
For those who use PostgreSQL 9.6+ (I believe most of you do), there is a quite simple and clean solution
ALTER TABLE if exists <tablename> add if not exists <columnname> <columntype>
The function below checks whether the column exists; if it does, it returns an appropriate message, otherwise it adds the column to the table.
create or replace function addcol(schemaname varchar, tablename varchar, colname varchar, coltype varchar)
returns varchar
language 'plpgsql'
as
$$
declare
col_name varchar ;
begin
execute 'select column_name from information_schema.columns where table_schema = ' ||
quote_literal(schemaname)||' and table_name='|| quote_literal(tablename) || ' and column_name= '|| quote_literal(colname)
into col_name ;
raise info ' the val : % ', col_name;
if(col_name is null ) then
col_name := colname;
execute 'alter table ' || quote_ident(schemaname) || '.' || quote_ident(tablename) || ' add column ' || quote_ident(colname) || ' ' || coltype;
else
col_name := colname ||' Already exist';
end if;
return col_name;
end;
$$
This is basically the solution from sola, just cleaned up a bit. It's different enough that I didn't want to just "improve" his solution (plus, I sort of think that's rude).
The main difference is that it uses EXECUTE with format(), which I think is a bit cleaner, but it means that you must be on PostgreSQL 9.1 or newer.
This has been tested on 9.1 and works. Note: it will raise an error if the schema, table_name or data_type is invalid. That could be "fixed", but it might be the correct behavior in many cases.
CREATE OR REPLACE FUNCTION add_column(schema_name TEXT, table_name TEXT,
column_name TEXT, data_type TEXT)
RETURNS BOOLEAN
AS
$BODY$
DECLARE
_tmp text;
BEGIN
EXECUTE format('SELECT COLUMN_NAME FROM information_schema.columns WHERE
table_schema=%L
AND table_name=%L
AND column_name=%L', schema_name, table_name, column_name)
INTO _tmp;
IF _tmp IS NOT NULL THEN
RAISE NOTICE 'Column % already exists in %.%', column_name, schema_name, table_name;
RETURN FALSE;
END IF;
EXECUTE format('ALTER TABLE %I.%I ADD COLUMN %I %s;', schema_name, table_name, column_name, data_type);
RAISE NOTICE 'Column % added to %.%', column_name, schema_name, table_name;
RETURN TRUE;
END;
$BODY$
LANGUAGE 'plpgsql';
usage:
select add_column('public', 'foo', 'bar', 'varchar(30)');
This can be added to migration scripts: invoke the function and drop it when done.
create or replace function patch_column() returns void as
$$
begin
if exists (
select * from information_schema.columns
where table_name='my_table'
and column_name='missing_col'
)
then
raise notice 'missing_col already exists';
else
alter table my_table
add column missing_col varchar;
end if;
end;
$$ language plpgsql;
select patch_column();
drop function if exists patch_column();
In my case, because of how it was created, it is a bit difficult for our migration scripts to cut across different schemas.
To work around this we used an exception that just caught and ignored the error. This also had the nice side effect of being a lot easier to look at.
However, be wary that the other solutions have their own advantages that probably outweigh this solution:
DO $$
BEGIN
BEGIN
ALTER TABLE IF EXISTS bobby_tables RENAME COLUMN "dckx" TO "xkcd";
EXCEPTION
WHEN undefined_column THEN RAISE NOTICE 'Column was already renamed';
END;
END $$;
You can do it the following way.
ALTER TABLE tableName drop column if exists columnName;
ALTER TABLE tableName ADD COLUMN columnName character varying(8);
This drops the column if it already exists and then adds it to the table again. Note that any data stored in the existing column is lost.
Simply check if the query returned a column_name.
If not, execute something like this:
ALTER TABLE x ADD COLUMN y int;
Where you put something useful for 'x' and 'y' and of course a suitable datatype where I used int.