How to clone a RECORD in PostgreSQL - postgresql

I want to loop through a query, but also retain the actual record for the next loop, so I can compare two adjacent rows.
CREATE OR REPLACE FUNCTION public.test ()
RETURNS void AS
$body$
DECLARE
previous RECORD;
actual RECORD;
query TEXT;
isdistinct BOOLEAN;
tablename VARCHAR;
columnname VARCHAR;
firstrow BOOLEAN DEFAULT TRUE;
BEGIN
tablename = 'naplo.esemeny';
columnname = 'esemeny_id';
query = 'SELECT * FROM ' || tablename || ' LIMIT 2';
FOR actual IN EXECUTE query LOOP
--do stuff
--save previous record
IF NOT firstrow THEN
EXECUTE 'SELECT ($1).' || columnname || ' IS DISTINCT FROM ($2).' || columnname
INTO isdistinct USING previous, actual;
RAISE NOTICE 'previous: %', previous.esemeny_id;
RAISE NOTICE 'actual: %', actual.esemeny_id;
RAISE NOTICE 'isdistinct: %', isdistinct;
ELSE
firstrow = false;
END IF;
previous = actual;
END LOOP;
RETURN;
END;
$body$
LANGUAGE 'plpgsql'
VOLATILE
CALLED ON NULL INPUT
SECURITY INVOKER
COST 100;
The table:
CREATE TABLE naplo.esemeny (
esemeny_id SERIAL,
felhasznalo_id VARCHAR DEFAULT "current_user"() NOT NULL,
kotesszam VARCHAR(10),
idegen_azonosito INTEGER,
esemenytipus_id VARCHAR(10),
letrehozva TIMESTAMP WITHOUT TIME ZONE DEFAULT now() NOT NULL,
szoveg VARCHAR,
munkalap_id VARCHAR(13),
ajanlat_id INTEGER,
CONSTRAINT esemeny_pkey PRIMARY KEY(esemeny_id),
CONSTRAINT esemeny_fk_esemenytipus FOREIGN KEY (esemenytipus_id)
REFERENCES naplo.esemenytipus(esemenytipus_id)
ON DELETE RESTRICT
ON UPDATE RESTRICT
NOT DEFERRABLE
)
WITH (oids = true);
The code above doesn't work, the following error message is thrown:
ERROR: could not identify column "esemeny_id" in record data type
LINE 1: SELECT ($1).esemeny_id IS DISTINCT FROM ($2).esemeny_id
^
QUERY: SELECT ($1).esemeny_id IS DISTINCT FROM ($2).esemeny_id
CONTEXT: PL/pgSQL function "test" line 18 at EXECUTE statement
LOG: duration: 0.000 ms statement: SET DateStyle TO 'ISO'
What am I missing?
Disclaimer: I know the code doesn't make too much sense, I only created so I can demonstrate the problem.

This does not directly answer your question, and may be of no use at all, since you did not really describe your end goal.
If the end goal is to be able to compare the value of a column in the current row with the value of the same column in the previous row, then you might be much better off using a windowing query:
SELECT actual, previous
FROM (
SELECT mycolumn AS actual,
lag(mycolumn) OVER () AS previous
FROM mytable
ORDER BY somecriteria
) as q
WHERE previous IS NOT NULL
AND actual IS DISTINCT FROM previous
This example prints the rows where the current row is different from the previous row.
Note that I added an ORDER BY clause - it does not make sense to talk about "the previous row" without specifying ordering, otherwise you would get random results.
This is plain SQL, not PlPgSQL, but if you can wrap it in a function if you want to dynamically generate the query.

I am pretty sure, there is a better solution for your actual problem. But to answer the question asked, here is a solution with polymorphic types:
The main problem is that you need well known composite types to work with. the structure of anonymous records is undefined until assigned.
CREATE OR REPLACE FUNCTION public.test (actual anyelement, _col text
, OUT previous anyelement) AS
$func$
DECLARE
isdistinct bool;
BEGIN
FOR actual IN
EXECUTE format('SELECT * FROM %s LIMIT 3', pg_typeof(actual))
LOOP
EXECUTE format('SELECT ($1).%1$I IS DISTINCT FROM ($2).%1$I', _col)
INTO isdistinct
USING previous, actual;
RAISE NOTICE 'previous: %; actual: %; isdistinct: %'
, previous, actual, isdistinct;
previous := actual;
END LOOP;
previous := NULL; -- reset dummy output (optional)
END
$func$ LANGUAGE plpgsql;
Call:
SELECT public.test(NULL::naplo.esemeny, 'esemeny_id')
I am abusing an OUT parameter, since it's not possible to declare additional variables with a polymorphic composite type (at least I have failed repeatedly).
If your column name is stable you can replace the second EXECUTE with a simple expression.
I am running out of time, explanation in these related answers:
Declare variable of composite type in PostgreSQL using %TYPE
Refactor a PL/pgSQL function to return the output of various SELECT queries
Asides:
Don't quote the language name, it's an identifier, not a string.
Do you really need WITH (oids = true) in your table? This is still allowed, but largely deprecated in modern Postgres.

Related

Declare a Table as a variable in a stored procedure?

I am currently working a stored procedure capable of detecting continuity on a specific set of entries..
The specific set of entries is extracted from a sql query
The function takes in two input parameter, first being the table that should be investigated, and the other being the list of ids which should be evaluated.
For every Id I need to investigate every row provided by the select statement.
DROP FUNCTION IF EXISTS GapAndOverlapDetection(table_name text, entity_ids bigint[]);
create or replace function GapAndOverlapDetection ( table_name text, enteity_ids bigint[] )
returns table ( entity_id bigint, valid tsrange, causes_overlap boolean, causes_gap boolean)
as $$
declare
x bigint;
var_r record;
begin
FOREACH x in array $2
loop
EXECUTE format('select entity_id, valid from' ||table_name|| '
where entity_id = '||x||'
and registration #> now()::timestamp
order by valid ASC') INTO result;
for var_r in result
loop
end loop;
end loop ;
end
$$ language plpgsql;
select * from GapAndOverlapDetection('temp_country_registration', '{1,2,3,4}')
I currently get an error in the for statement saying
ERROR: syntax error at or near "$1"
LINE 12: for var_r in select entity_id, valid from $1
You can iterate over the result of the dynamic query directly:
create or replace function gapandoverlapdetection ( table_name text, entity_ids bigint[])
returns table (entity_id bigint, valid tsrange, causes_overlap boolean, causes_gap boolean)
as $$
declare
var_r record;
begin
for var_r in EXECUTE format('select entity_id, valid
from %I
where entity_id = any($1)
and registration > now()::timestamp
order by valid ASC', table_name)
using entity_ids
loop
... do something with var_r
-- return a row for the result
-- this does not end the function
-- it just appends this row to the result
return query
select entity_id, true, false;
end loop;
end
$$ language plpgsql;
The %I injects an identifier into a string and the $1 inside the dynamic SQL is then populated through passing the argument with the using keyword
Firstly, decide whether you want to pass the table's name or oid. If you want to identify the table by name, then the parameter should be of text type and not regclass.
Secondly, if you want the table name to change between executions then you need to execute the SQL statement dynamically with the EXECUTE statement.

Fill a variable with "array_to_string" in a plpgsql trigger function

I'm working with PostgreSQL 9.5.
I'm creating a trigger in PL/pgSQL, that adds a record to a table (synthese_poly) when an INSERT is performed on a second table (operation_poly), with other tables data.
The trigger works well, except for some variables, that are not filled (especially the ones I try to fill with an array_to_string() function).
This is the code:
-- Function: bdtravaux.totablesynth_fn()
-- DROP FUNCTION bdtravaux.totablesynth_fn();
CREATE OR REPLACE FUNCTION bdtravaux.totablesynth_fn()
RETURNS trigger AS
$BODY$
DECLARE
varoperateur varchar;
varchantvol boolean;
BEGIN
IF (TG_OP = 'INSERT') THEN
varsortie_id := NEW.sortie;
varopeid := NEW.operation_id;
--The following « SELECT » queries take data in third-party tables and fill variables, which will be used in the final insertion query.
SELECT array_to_string(array_agg(DISTINCT oper.operateurs),'; ')
INTO varoperateur
FROM bdtravaux.join_operateurs oper INNER JOIN bdtravaux.operation_poly o ON (oper.id_joinop=o.id_oper)
WHERE o.operation_id = varopeid;
SELECT CASE WHEN o.ope_chvol = 0 THEN 'f' ELSE 't' END as opechvol INTO varchantvol
FROM bdtravaux.operation_poly o WHERE o.operation_id = varopeid;
-- «INSERT» query
INSERT INTO bdtravaux.synthese_poly (soperateur, schantvol) SELECT varoperateur, varchantvol;
RAISE NOTICE 'varoperateur value : (%)', varoperateur;
RAISE NOTICE 'varchantvol value : (%)', varchantvol;
END IF;
RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION bdtravaux.totablesynth_fn()
OWNER TO postgres;
And this is the trigger :
-- Trigger: totablesynth on bdtravaux.operation_poly
-- DROP TRIGGER totablesynth ON bdtravaux.operation_poly;
CREATE TRIGGER totablesynth
AFTER INSERT
ON bdtravaux.operation_poly
FOR EACH ROW
WHEN ((new.chantfini = true))
EXECUTE PROCEDURE bdtravaux.totablesynth_fn();
The varchantvol variable is correctly filled, but varoperateur stays desperately empty (NULL value) (and so on for the corresponding field in the synthese_poly table).
Note:
The SELECT array_to_string(…) ... query itself (launched with pgAdmin, without INTO varoperateur and replacing varopeid with a value) works well, and returns a string.
I tried to change array_to_string() function and variables' data types (using ::varchar or ::text …), nothing works.
Do you see what can happen?
using array_agg
You can replace array_to_string(array_agg(DISTINCT oper.operateurs),'; ') with
string_agg(DISTINCT oper.operateurs,'; ')
And you can use order by to sort the text in the agregate
string_agg(DISTINCT oper.operateurs,'; ' ORDER BY oper.operateurs)
My educated guess: you have a trigger with BEFORE INSERT ON bdtravaux.operation_poly. And operation_id is its serial PK column.
In this case, the query with WHERE o.operation_id = varopeid
(where varopeid has been filled with NEW.operation_id) can never find any rows because the row is not in the table, yet.
array_agg() has no role in this.
Would work with a trigger AFTER INSERT ON bdtravaux.operation_poly. But if id_oper is from the same inserted row, you can just simplify to:
SELECT array_to_string(array_agg(DISTINCT oper.operateurs),'; ')
INTO varoperateur
FROM bdtravaux.join_operateurs oper
WHERE oper.id_joinop = NEW.id_oper;
And keep the BEFORE trigger.
The whole function might be simpler, can probably done with a single query.

PL/pgSQL column name the same as variable

I'm new to plpgsql and I'm trying to create function that will check if a certain value exists in table and if not will add a row.
CREATE OR REPLACE FUNCTION hire(
id_pracownika integer,
imie character varying,
nazwisko character varying,
miasto character varying,
pensja real)
RETURNS TEXT AS
$BODY$
DECLARE
wynik TEXT;
sprawdzenie INT;
BEGIN
sprawdzenie = id_pracownika;
IF EXISTS (SELECT id_pracownika FROM pracownicy WHERE id_pracownika=sprawdzenie) THEN
wynik = "JUZ ISTNIEJE";
RETURN wynik;
ELSE
INSERT INTO pracownicy(id_pracownika,imie,nazwisko,miasto,pensja)
VALUES (id_pracownika,imie,nazwisko,miasto,pensja);
wynik = "OK";
RETURN wynik;
END IF;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
The issue is that I'm getting errors saying that id_pracownika is a column name and a variable.
How to specify that "id_pracownika" in such context refers to column name?
Assuming id_pracownika is The PRIMARY KEY of the table. Or at least defined UNIQUE. (If it's not NOT NULL, NULL is a corner case.)
SELECT or INSERT
Your function is another implementation of "SELECT or INSERT" - a variant of the UPSERT problem, which is more complex in the face of concurrent write load than it might seem. See:
Is SELECT or INSERT in a function prone to race conditions?
With UPSERT in Postgres 9.5 or later
In Postgres 9.5 or later use UPSERT (INSERT ... ON CONFLICT ...) Details in the Postgres Wiki. This new syntax does a clean job:
CREATE OR REPLACE FUNCTION hire(
_id_pracownika integer
, _imie varchar
, _nazwisko varchar
, _miasto varchar
, _pensja real)
RETURNS text
LANGUAGE plpgsql AS
$func$
BEGIN
INSERT INTO pracownicy
( id_pracownika, imie, nazwisko, miasto, pensja)
VALUES (_id_pracownika,_imie,_nazwisko,_miasto,_pensja)
ON CONFLICT DO NOTHING;
IF FOUND THEN
RETURN 'OK';
ELSE
RETURN 'JUZ ISTNIEJE'; -- already exists
END IF;
END
$func$;
About the special variable FOUND:
Why is IS NOT NULL false when checking a row type?
Table-qualify column names to disambiguate where necessary. (You can also prefix function parameters with the function name, but that gets awkward quickly.)
But column names in the target list of an INSERT may not be table-qualified. Those are never ambiguous anyway.
Best avoid ambiguities a priori. Some (including me) like to prefix all function parameters and variables with an underscore.
If you positively need a column name as function parameter name, one way to avoid naming collisions is to use an ALIAS inside the function. One of the rare cases where ALIAS is actually useful.
Or reference function parameters by ordinal position: $1 for id_pracownika in this case.
If all else fails, you can decide what takes precedence by setting #variable_conflict. See:
Naming conflict between function parameter and result of JOIN with USING clause
There is more:
There are intricacies to the RETURNING clause in an UPSERT. See:
How to use RETURNING with ON CONFLICT in PostgreSQL?
String literals (text constants) must be enclosed in single quotes: 'OK', not "OK". See:
Insert text with single quotes in PostgreSQL
Assigning variables is comparatively more expensive than in other programming languages. Keep assignments to a minimum for best performance in plpgsql. Do as much as possible in SQL statements directly.
VOLATILE COST 100 are default decorators for functions. No need to spell those out.
Without UPSERT in Postgres 9.4 or older
...
IF EXISTS (SELECT FROM pracownicy p
WHERE p.id_pracownika = hire.id_pracownika) THEN
RETURN 'JUZ ISTNIEJE';
ELSE
INSERT INTO pracownicy(id_pracownika,imie,nazwisko,miasto,pensja)
VALUES (hire.id_pracownika,hire.imie,hire.nazwisko,hire.miasto,hire.pensja);
RETURN 'OK';
END IF;
...
But there is a tiny race condition between the SELECT and the INSERT, so not bullet-proof under heavy concurrent write-load.
In an EXISTS expression, the SELECT list does not matter. SELECT id_pracownika, SELECT 1, or even SELECT 1/0 - all the same. Just use an empty SELECT list. Only the existence of any qualifying row matters. See:
What is easier to read in EXISTS subqueries?
It is a example tested by me where I use EXECUTE to run a select and put its result in a cursor, using dynamic column names.
1. Create the table:
create table people (
nickname varchar(9),
name varchar(12),
second_name varchar(12),
country varchar(30)
);
2. Create the function:
CREATE OR REPLACE FUNCTION fun_find_people (col_name text, col_value varchar)
RETURNS void AS
$BODY$
DECLARE
local_cursor_p refcursor;
row_from_people RECORD;
BEGIN
open local_cursor_p FOR
EXECUTE 'select * from people where '|| col_name || ' LIKE ''' || col_value || '%'' ';
raise notice 'col_name: %',col_name;
raise notice 'col_value: %',col_value;
LOOP
FETCH local_cursor_p INTO row_from_people; EXIT WHEN NOT FOUND;
raise notice 'row_from_people.nickname: %', row_from_people.nickname ;
raise notice 'row_from_people.name: %', row_from_people.name ;
raise notice 'row_from_people.country: %', row_from_people.country;
END LOOP;
END;
$BODY$ LANGUAGE 'plpgsql'
3. Run the function
select fun_find_people('name', 'Cristian');
select fun_find_people('country', 'Chile');
inspire with Erwin Brandstetter's answers.
CREATE OR REPLACE FUNCTION test_upsert(
_parent_id int,
_some_text text)
RETURNS text
LANGUAGE plpgsql AS
$func$
DECLARE a text;
BEGIN
INSERT INTO parent_tree (parent_id, some_text)
VALUES (_parent_id,_some_text)
ON CONFLICT DO NOTHING
RETURNING 'ok' into a;
return a;
IF NOT FOUND THEN
return 'JUZ ISTNIEJE';
END IF;
END
$func$;
Follow Erwin's answer. I make a variable hold the return type text.
If conflict do nothing then the function will return nothing. For example, already have parent_id = 10, Then the result would be as following:
test_upsert
------------
(1 row)
NOT Sure the usage of:
IF NOT FOUND THEN
return 'JUZ ISTNIEJE';
END IF;

Create a function to get column from multiple tables in PostgreSQL

I'm trying to create a function to get a field value from multiple tables in my database. I made script like this:
CREATE OR REPLACE FUNCTION get_all_changes() RETURNS SETOF RECORD AS
$$
DECLARE
tblname VARCHAR;
tblrow RECORD;
row RECORD;
BEGIN
FOR tblrow IN SELECT tablename FROM pg_catalog.pg_tables WHERE schemaname='public' LOOP /*FOREACH tblname IN ARRAY $1 LOOP*/
RAISE NOTICE 'r: %', tblrow.tablename;
FOR row IN SELECT MAX("lastUpdate") FROM tblrow.tablename LOOP
RETURN NEXT row;
END LOOP;
END LOOP;
END
$$
LANGUAGE 'plpgsql' ;
SELECT get_all_changes();
But it is not working, everytime it shows this error
tblrow.tablename" not defined in line "FOR row IN SELECT MAX("lastUpdate") FROM tblrow.tablename LOOP"
Your inner FOR loop must use the FOR...EXECUTE syntax as shown in the manual:
FOR target IN EXECUTE text_expression [ USING expression [, ... ] ] LOOP
statements
END LOOP [ label ];
In your case something along this line:
FOR row IN EXECUTE 'SELECT MAX("lastUpdate") FROM ' || quote_ident(tblrow.tablename) LOOP
RETURN NEXT row;
END LOOP
The reason for this is explained in the manual somewhere else:
Oftentimes you will want to generate dynamic commands inside your PL/pgSQL functions, that is, commands that will involve different tables or different data types each time they are executed. PL/pgSQL's normal attempts to cache plans for commands (as discussed in Section 39.10.2) will not work in such scenarios. To handle this sort of problem, the EXECUTE statement is provided[...]
Answer to your new question (mislabeled as answer):
This can be much simpler. You do not need to create a table just do define a record type.
If at all, you would better create a type with CREATE TYPE, but that's only efficient if you need the type in multiple places. For just a single function, you can use RETURNS TABLE instead :
CREATE OR REPLACE FUNCTION get_all_changes(text[])
RETURNS TABLE (tablename text
,"lastUpdate" timestamp with time zone
,nums integer) AS
$func$
DECLARE
tblname text;
BEGIN
FOREACH tblname IN ARRAY $1 LOOP
RETURN QUERY EXECUTE format(
$f$SELECT '%I', MAX("lastUpdate"), COUNT(*)::int FROM %1$I
$f$, tblname)
END LOOP;
END
$func$ LANGUAGE plpgsql;
A couple more points:
Use RETURN QUERY EXECUTE instead of the nested loop. Much simpler and faster.
Column aliases would only serve as documentation, those names are discarded in favor of the names declared in the RETURNS clause (directly or indirectly).
Use format() with %I to replace the concatenation with quote_ident() and %1$I to refer to the same parameter another time.
count() usually returns type bigint. Cast the integer, since you defined the column in the return type as such: count(*)::int.
Thanks,
I finally made my script like:
CREATE TABLE IF NOT EXISTS __rsdb_changes (tablename text,"lastUpdate" timestamp with time zone, nums bigint);
CREATE OR REPLACE FUNCTION get_all_changes(varchar[]) RETURNS SETOF __rsdb_changes AS /*TABLE (tablename varchar(40),"lastUpdate" timestamp with time zone, nums integer)*/
$$
DECLARE
tblname VARCHAR;
tblrow RECORD;
row RECORD;
BEGIN
FOREACH tblname IN ARRAY $1 LOOP
/*RAISE NOTICE 'r: %', tblrow.tablename;*/
FOR row IN EXECUTE 'SELECT CONCAT('''|| quote_ident(tblname) ||''') AS tablename, MAX("lastUpdate") AS "lastUpdate",COUNT(*) AS nums FROM ' || quote_ident(tblname) LOOP
/*RAISE NOTICE 'row.tablename: %',row.tablename;*/
/*RAISE NOTICE 'row.lastUpdate: %',row."lastUpdate";*/
/*RAISE NOTICE 'row.nums: %',row.nums;*/
RETURN NEXT row;
END LOOP;
END LOOP;
RETURN;
END
$$
LANGUAGE 'plpgsql' ;
Well, it works. But it seems I can only create a table to define the return structure instead of just RETURNS SETOF RECORD. Am I right?
Thanks again.

a postgres update trigger performs everything else except the actual update

Let's use a test table :
CREATE TABLE labs.date_test
(
pkey int NOT NULL,
val integer,
date timestamp without time zone,
CONSTRAINT date_test_pkey PRIMARY KEY (pkey)
);
I have a trigger function defined as below. It is a function to insert a date into a specified column in the table. Its arguments are the primary key, the name of the date field, and the date to be inserted:
CREATE OR REPLACE FUNCTION tf_set_date()
RETURNS trigger AS
$BODY$
DECLARE
table_name text;
pkey_col text := TG_ARGV[0];
date_col text := TG_ARGV[1];
date_val text := TG_ARGV[2];
BEGIN
table_name := format('%I.%I', TG_TABLE_SCHEMA, TG_TABLE_NAME);
IF TG_NARGS != 3 THEN
RAISE 'Wrong number of args for tf_set_date()'
USING HINT='Check triggers for table ' || table_name;
END IF;
EXECUTE format('UPDATE %s SET %I = %s' ||
' WHERE %I = ($1::text::%s).%I',
table_name, date_col, date_val,
pkey_col, table_name, pkey_col )
USING NEW;
RAISE NOTICE '%', NEW;
RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
The actual trigger definition is as follows:
CREATE TRIGGER t_set_ready_date
BEFORE UPDATE OF val
ON labs.date_test
FOR EACH ROW
EXECUTE PROCEDURE tf_set_date('pkey', 'date', 'localtimestamp(0)');
Now say I do: INSERT INTO TABLEdate_test(pkey) values(1);`
Then I perform an update as follows:
UPDATE labs.date_test SET val = 1 WHERE pkey = 1;
Now the date gets inserted as expected. But the val field is still NULL. It does not have 1 as one would expect (or rather as I expected).
What am I doing wrong? The RAISE NOTICE in the trigger shows that NEW is still what I expect it to be. Aren't UPDATEs allowed in BEFORE UPDATE triggers? One comment about postgres triggers seems to indicate that original the UPDATE gets overwritten if there is an UPDATE statement in a BEFORE UPDATE trigger. Can someone help me out?
EDIT
I am trying to update the same table that invoked the trigger, and that too the same row which is to be modified by the UPDATE statement that invoked the trigger. I am running Postgresql 9.2
Given all the dynamic table names it isn't entirely clear if this trigger issues an update on the same table that invoked the trigger.
If so: That won't work. You can't UPDATE some_table in a BEFORE trigger on some_table. Or, more strictly, you can, but if you update any row that is affected by the statement that's invoking the trigger results will be unpredictable so it isn't generally a good idea.
Instead, alter the values in NEW directly. You can't do this with dynamic column names, unfortunately; you'll just have to customise the trigger or use an AFTER trigger to do the update after the rows have already been changed.
I am not sure, but your triggers can do recursion calls - it does UPDATE same table from UPDATE trigger. This is usually bad practice, and usually is not good idea to write too generic triggers. But I don't know what you are doing, maybe you need it, but you have to be sure, so you are protected against recursion.
For debugging of triggers is good push to start and to end of function body debug messages. Probably you use GET DIAGNOSTICS statement after EXECUTE statement for information about impact of dynamic SQL
DECLARE
_updated_rows int;
_query text;
BEGIN
RAISE NOTICE 'Start trigger function xxx';
...
_query := format('UPDATE ....);
RAISE NOTICE 'dynamic sql %, %', _query, new;
EXECUTE _query USING new;
GET DIAGNOSICS _updated_rows = ROW_COUNT;
RAISE NOTICE 'Updated rows %', _updated_rows;
...