PostgreSQL function-OID: 'my OID' where 'my' is the currently executing user-defined-function - postgresql

This query returns the OID of the function whose name and signature is supplied:
select 'myfunc(signature)'::regprocedure::oid;
But is there something in PostgreSQL plpgsql like a myNameAndSignature() function so we could use dynamic sql to build a statement that gets the OID of the function and then creates a temporary table with the OID appended to the name of the temp table?
The statement to execute dynamically is:
create temp table TT17015
I'm new to PostgreSQL, and maybe there's a better way to handle naming of temporary tables so the functions that use temp tables, and call each other, don't get the error that a particular temp table it is trying to delete is in use elsewhere?

Using the OID of a function does not necessarily prevent a naming conflict. The same function could be run multiple times in the same session.
If you are in need of a unique name, use a SEQUENCE. Run once in your database:
CREATE SEQUENCE tt_seq;
Then, in your plpgsql function or DO statement:
DO
$$
DECLARE
_tbl text := 'tt' || nextval('tt_seq');
BEGIN
EXECUTE 'CREATE TEMP TABLE ' || _tbl || '(id int)';
END
$$
Drawback is that you have to use dynamic SQL for dynamic identifiers. Plain SQL commands do not accept parameters for identifiers.

Related

Declare and return value for DELETE and INSERT

I am trying to remove duplicated data from some of our databases based upon unique id's. All deleted data should be stored in a separate table for auditing purposes. Since it concerns quite some databases and different schemas and tables I wanted to start using variables to reduce chance of errors and the amount of work it will take me.
This is the best example query I could think off, but it doesn't work:
do $$
declare #source_schema varchar := 'my_source_schema';
declare #source_table varchar := 'my_source_table';
declare #target_table varchar := 'my_target_schema' || source_table || '_duplicates'; --target schema and appendix are always the same, source_table is a variable input.
declare #unique_keys varchar := ('1', '2', '3')
begin
select into #target_table
from #source_schema.#source_table
where id in (#unique_keys);
delete from #source_schema.#source_table where export_id in (#unique_keys);
end ;
$$;
The query syntax works with hard-coded values.
Most of the times my variables are perceived as columns or not recognized at all. :(
You need to create and then call a plpgsql procedure with input parameters :
CREATE OR REPLACE PROCEDURE duplicates_suppress
(my_target_schema text, my_source_schema text, my_source_table text, unique_keys text[])
LANGUAGE plpgsql AS
$$
BEGIN
EXECUTE FORMAT(
'WITH list AS (INSERT INTO %1$I.%3$I_duplicates SELECT * FROM %2$I.%3$I WHERE array[id] <# %4$L :: integer[] RETURNING id)
DELETE FROM %2$I.%3$I AS t USING list AS l WHERE t.id = l.id', my_target_schema, my_source_schema, my_source_table, unique_keys :: text) ;
END ;
$$ ;
The procedure duplicates_suppress inserts into my_target_schema.my_source_table || '_duplicates' the rows from my_source_schema.my_source_table whose id is in the array unique_keys and then deletes these rows from the table my_source_schema.my_source_table .
See the test result in dbfiddle.
As has been commented, you need some kind of dynamic SQL. In a FUNCTION, PROCEDURE or a DO statement to do it on the server.
You should be comfortable with PL/pgSQL. Dynamic SQL is no beginners' toy.
Example with a PROCEDURE, like Edouard already suggested. You'll need a FUNCTION instead to wrap it in an outer transaction (like you very well might). See:
When to use stored procedure / user-defined function?
CREATE OR REPLACE PROCEDURE pg_temp.f_archive_dupes(_source_schema text, _source_table text, _unique_keys int[], OUT _row_count int)
LANGUAGE plpgsql AS
$proc$
-- target schema and appendix are always the same, source_table is a variable input
DECLARE
_target_schema CONSTANT text := 's2'; -- hardcoded
_target_table text := _source_table || '_duplicates';
_sql text := format(
'WITH del AS (
DELETE FROM %I.%I
WHERE id = ANY($1)
RETURNING *
)
INSERT INTO %I.%I TABLE del', _source_schema, _source_table
, _target_schema, _target_table);
BEGIN
RAISE NOTICE '%', _sql; -- debug
EXECUTE _sql USING _unique_keys; -- execute
GET DIAGNOSTICS _row_count = ROW_COUNT;
END
$proc$;
Call:
CALL pg_temp.f_archive_dupes('s1', 't1', '{1, 3}', 0);
db<>fiddle here
I made the procedure temporary, since I assume you don't need to keep it permanently. Create it once per database. See:
How to create a temporary function in PostgreSQL?
Passed schema and table names are case-sensitive strings! (Unlike unquoted identifiers in plain SQL.) Either way, be wary of SQL-injection when concatenating SQL dynamically. See:
Are PostgreSQL column names case-sensitive?
Table name as a PostgreSQL function parameter
Made _unique_keys type int[] (array of integer) since your sample values look like integers. Use a the actual data type of your id columns!
The variable _sql holds the query string, so it can easily be debugged before actually executing. Using RAISE NOTICE '%', _sql; for that purpose.
I suggest to comment the EXECUTE line until you are sure.
I made the PROCEDURE return the number of processed rows. You didn't ask for that, but it's typically convenient. At hardly any cost. See:
Dynamic SQL (EXECUTE) as condition for IF statement
Best way to get result count before LIMIT was applied
Last, but not least, use DELETE ... RETURNING * in a data-modifying CTE. Since that has to find rows only once it comes at about half the cost of separate SELECT and DELETE. And it's perfectly safe. If anything goes wrong, the whole transaction is rolled back anyway.
Two separate commands can also run into concurrency issues or race conditions which are ruled out this way, as DELETE implicitly locks the rows to delete. Example:
Replicating data between Postgres DBs
Or you can build the statements in a client program. Like psql, and use \gexec. Example:
Filter column names from existing table for SQL DDL statement
Based on Erwin's answer, minor optimization...
create or replace procedure pg_temp.p_archive_dump
(_source_schema text, _source_table text,
_unique_key int[],_target_schema text)
language plpgsql as
$$
declare
_row_count bigint;
_target_table text := '';
BEGIN
select quote_ident(_source_table) ||'_'|| array_to_string(_unique_key,'_') into _target_table from quote_ident(_source_table);
raise notice 'the deleted table records will store in %.%',_target_schema, _target_table;
execute format('create table %I.%I as select * from %I.%I limit 0',_target_schema, _target_table,_source_schema,_source_table );
execute format('with mm as ( delete from %I.%I where id = any (%L) returning * ) insert into %I.%I table mm'
,_source_schema,_source_table,_unique_key, _target_schema, _target_table);
GET DIAGNOSTICS _row_count = ROW_COUNT;
RAISE notice 'rows influenced, %',_row_count;
end
$$;
--
if your _unique_key is not that much, this solution also create a table for you. Obviously you need to create the target schema yourself.
If your unique_key is too much, you can customize to properly rename the dumped table.
Let's call it.
call pg_temp.p_archive_dump('s1','t1', '{1,2}','s2');
s1 is the source schema, t1 is source table, {1,2} is the unique key you want to extract to the new table. s2 is the target schema

Create function with temporary tables that return a select query using these temp tables

I need to create a function, which returns results of a SELECT query. This SELECT query is a JOIN of few temporary tables created inside this function. Is there any way to create such function? Here is an example (it is very simplified, in reality there are multiple temp tables with long queries):
CREATE OR REPLACE FUNCTION myfunction () RETURNS TABLE (column_a TEXT, column_b TEXT) AS $$
BEGIN
CREATE TEMPORARY TABLE raw_data ON COMMIT DROP
AS
SELECT d.column_a, d2.column_b FROM dummy_data d JOIN dummy_data_2 d2 using (id);
RETURN QUERY (select distinct column_a, column_b from raw_data limit 100);
END;
$$
LANGUAGE 'plpgsql' SECURITY DEFINER
I get error:
[Error] Script lines: 1-19 -------------------------
ERROR: RETURN cannot have a parameter in function returning set;
use RETURN NEXT at or near "QUERY"Position: 237
I apologize in advance for any obvious mistakes, I'm new to this.
Psql version is PostgreSQL 8.2.15 (Greenplum Database 4.3.12.0 build 1)
The most recent version of Greenplum Database (5.0) is based on PostgreSQL 8.3, and it supports the RETURN QUERY syntax. Just tested your function on:
PostgreSQL 8.4devel (Greenplum Database 5.0.0-beta.10+dev.726.gd4a707c762 build dev)
The most probable error this could raise in Postgres:
ERROR: column "foo" specified more than once
Meaning, there is at least one more column name (other than id which is folded to one instance with the USING clause) included in both tables. This would not raise an exception in a plain SQL SELECT which tolerates duplicate output column names. But you cannot create a table with duplicate names.
The problem also applies for Greenplum (like you later declared), which is not Postgres. It was forked from PostgreSQL in 2005 and developed separately. The current Postgres manual hardly applies at all any more. Look to the Greenplum documentation.
And psql is just the standard PostgreSQL interactive terminal program. Obviously you are using the one shipped with PostgreSQL 8.2.15, but the RDBMS is still Greenplum, not Postgres.
Syntax fix (for Postgres, like you first tagged, still relevant):
CREATE OR REPLACE FUNCTION myfunction()
RETURNS TABLE (column_a text, column_b text) AS
$func$
BEGIN
CREATE TEMPORARY TABLE raw_data ON COMMIT DROP AS
SELECT d.column_a, d2.column_b -- explicit SELECT list avoids duplicate column names
FROM dummy_data d
JOIN dummy_data_2 d2 using (id);
RETURN QUERY
SELECT DISTINCT column_a, column_b
FROM raw_data
LIMIT 100;
END
$func$ LANGUAGE plpgsql SECURITY DEFINER;
The example wouldn't need a temp table - unless you access the temp table after the function call in the same transaction (ON COMMIT DROP). Else, a plain SQL function is better in every way. Syntax for Postgres and Greenplum:
CREATE OR REPLACE FUNCTION myfunction(OUT column_a text, OUT column_b text)
RETURNS SETOF record AS
$func$
SELECT DISTINCT d.column_a, d2.column_b
FROM dummy_data d
JOIN dummy_data_2 d2 using (id)
LIMIT 100;
$func$ LANGUAGE plpgsql SECURITY DEFINER;
Not least, it should also work for Greenplum.
The only remaining reason for this function is SECURITY DEFINER. Else you could just use the simple SQL statement (possibly as prepared statement) instead.
RETURN QUERY was added to PL/pgSQL with version 8.3 in 2008, some years after the fork of Greenplum. Might explain your error msg:
ERROR: RETURN cannot have a parameter in function returning set;
use RETURN NEXT at or near "QUERY" Position: 237
Aside: LIMIT without ORDER BY produces arbitrary results. I assume you are aware of that.
If for some reason you actually need temp tables and cannot upgrade to Greenplum 5.0 like A. Scherbaum suggested, you can still make it work in Greenplum 4.3.x (like in Postgres 8.2). Use a FOR loop in combination with RETURN NEXT.
Examples:
plpgsql error "RETURN NEXT cannot have a parameter in function with OUT parameters" in table-returning function
How to use `RETURN NEXT`in PL/pgSQL correctly?
Use of custom return types in a FOR loop in plpgsql

PostgreSQL (9.4) temporary table scope

After some aggravation, I found (IMO) odd behavior when a function calls another. If the outer function creates a temporary table, and the inner function creates a temporary table with the same name, the inner function "wins." Is this intended? FWIW, I am proficient at SQL Server, and temporary tables do not act this way. Temporary tables (#temp or #temp) are scoped to the function. So, an equivalent function (SQL Server stored procedure) would return "7890," not "1234."
drop function if exists inner_function();
drop function if exists outer_function();
create function inner_function()
returns integer
as
$$
begin
drop table if exists tempTable;
create temporary table tempTable (
inner_id int
);
insert into tempTable (inner_id) values (1234);
return 56;
end;
$$
language plpgsql;
create function outer_function()
returns table (
return_id integer
)
as
$$
declare intReturn integer;
begin
drop table if exists tempTable; -- note that inner_function() also declares tempTable
create temporary table tempTable (
outer_id integer
);
insert into tempTable (outer_id) values (7890);
intReturn = inner_function(); -- the inner_function() function recreates tempTable
return query
select * from tempTable; -- returns "1234", not "7890" like I expected
end;
$$
language plpgsql;
select * from outer_function(); -- returns "1234", not "7890" like I expected
There are no problem with this behaviour, in PostgreSQL temp table can have two scopes:
- session (default)
- transaction
To use the "transaction" scope you should use "ON COMMIT DROP" at the end of the CREATE TEMP statement, i.e:
CREATE TEMP TABLE foo(bar INT) ON COMMIT DROP;
Anyway your two functions will be executed in one transaction so when you call the inner_function from the outer_function you'll be in the same transaction and PostgreSQL will detect that "tempTable" already exists in the current session and will drop it in "inner_function" and create again...
Is this intended?
Yes, these are tables in the database, similar to permanent tables.
They exist in a special schema, and are automatically dropped at the end of a session or transaction. If you create a temporary table with the same name as a permanent table, then you must prefix the permanent table with its schema name to reference it while the temporary table exists.
If you want to emulate the SQL Server implementation then you might consider using particular prefixes for your temporary tables.

PostgreSQL execute custom queries from string

I am trying to create a set of functions for a PostgreSQL database I have, and I am facing this problem. What I am trying to do is to create a function, which takes as a parameter the name of his column a user wants to change, and the new value they wish to insert.
I thought that the simplest way would be to create the queries as strings internally and execute them (sql injections are not of concern right now).
Searching a bit, I tried to use the EXECUTE command, with no success. Any arrangement of the queries I tried did not work, with or without SET, and even a super simple query shows a syntax error, inside the function code or in the pgadmin sql editor:
EXECUTE "SELECT * FROM user_data.users;";
ERROR: prepared statement "SELECT * FROM user_data.users;" does not exist
********** Error **********
ERROR: prepared statement "SELECT * FROM user_data.users;" does not exist
SQL state: 26000
Any suggestions to solve this?
To alter a column name you could use a function like this.
It takes the table name, column name, and new column name.
CREATE OR REPLACE FUNCTION foo(_t regclass, _ocn text, _ncn text)
RETURNS void AS
$func$
BEGIN
EXECUTE 'ALTER TABLE '|| _t ||'
RENAME COLUMN '|| quote_ident(_ocn) ||' TO '|| quote_ident(_ncn) ||'';
END
$func$ LANGUAGE plpgsql;
To alter the type is a bit more work and may add to the server overheads.
Lets use this table as an example
CREATE TABLE foobar (id serial primary key, name text);
Ref Postgresql Type Conversion
In many cases a user does not need to understand the details of the
type conversion mechanism. However, implicit conversions done by
PostgreSQL can affect the results of a query. When necessary, these
results can be tailored by using explicit type conversion.
Firstly we would have to get the column data type(s) from the information The Information Schema Ref postgresql 9.4
select data_type from information_schema.columns where table_name = 'foobar' and column_name = 'name';
We could cast types as mentioned in type conversion::Table 8-1. Data Types Ref Postgresql 9.4
ALTER TABLE foobar ALTER COLUMN name TYPE varchar(20) USING name::text;
I hope this helps

Dynamic table name for INSERT INTO query

I am trying to figure out how to write an INSERT INTO query with table name and column name of the source as parameter.
For starters I was just trying to parametrize the source table name. I have written the following query. For now I am declaring and assigning the value of the variable tablename directly, but in actual example it would come from some other source/list. The target table has only one column.
CREATE OR REPLACE FUNCTION foo()
RETURNS void AS
$$
DECLARE
tablename text;
BEGIN
tablename := 'Table_1';
EXECUTE 'INSERT INTO "Schemaname"."targettable"
SELECT "Col_A"
FROM "schemaname".'
||quote_ident(tablename);
END
$$ LANGUAGE PLPGSQL;
Although the query runs without any error no changes are reflected at the target table. On running the query I get the following output.
Query OK, 0 rows affected (execution time: 296 ms; total time: 296 ms)
I want the changes to be reflected at the target table. I don't know how to resolve the problem.
Audited code
CREATE OR REPLACE FUNCTION foo()
RETURNS void AS
$func$
DECLARE
_tbl text := 'Table_1'; -- or 'table_1'?
BEGIN
EXECUTE 'INSERT INTO schemaname.targettable(column_name)
SELECT "Col_A"
FROM schemaname.' || quote_ident(_tbl); -- or "Schemaname"?
END
$func$ LANGUAGE plpgsql;
Always use an explicit target list for persisted INSERT statements.
You can assign variables at declare time.
It's a wide-spread folly to use double-quoted identifiers to preserve otherwise illegal spelling. You have to keep double-quoting the name for the rest of its existence. One or more of those errors seem to have crept into your code: "Schemaname" or "schemaname"? Table_1 or "Table_1"?
Are PostgreSQL column names case-sensitive?
When you provide an identifier like a table name as text parameter and escape it with quote_ident(), it is case sensitive!
Identifiers in SQL code are cast to lower case unless double-quoted. But quote-ident() (which you must use to defend against SQL injection) preserves the spelling you provide with double-quotes where necessary.
Function with parameter
CREATE OR REPLACE FUNCTION foo(_tbl text)
RETURNS void AS
$func$
BEGIN
EXECUTE 'INSERT INTO schemaname.targettable(column_name)
SELECT "Col_A"
FROM schemaname.' || quote_ident(_tbl);
END
$func$ LANGUAGE plpgsql;
Call:
SELECT foo('tablename'); -- tablename is case sensitive
There are other ways:
Table name as a PostgreSQL function parameter