Recursive function postgres to get the latest ID - postgresql

I want to use a function in PostgreSQL to get the latest ID related to a history:
CREATE TABLE "tbl_ids" (
"ID" oid,
"Name" text,
"newID" oid
);
After creating this simple table, I have no idea where to start my function, and before you ask: I know about COALESCE()-function, but I'm going to have more then one parent-ID in the future.
CREATE FUNCTION get_lastes_id(ID oid, newID oid) RETURNS oid AS $$
BEGIN
IF new IS NOT NULL THEN
--USE old--
END
IF new IS NULL THEN
get_latest_id(new, "newID")
END
END;
I gotta say it because you'd find out anyway: I'm really new in functions with PostgreSQL and I'm not even sure if this is possible. But assuming COALESCE()-Function also exists it has to be a server-side function I guess.

First, it is not clear what you are asking. oid's are probably not the best type to use primarily because they are an internal type designed for the system libraries and therefore you cannot guarantee they will act the way you expect.
Secondly this seems to me to be a poor choice tools if you want to use recursion to just get the latest. If you want things to perform well, try to think in set operations rather than imparitive algorithms.
If you want a trigger to get the latest (maximum) oid for a name and assign it to "newID" then:
CREATE OR REPLACE FUNCTION set_newID() RETURNS TRIGGER LANGUAGE PLPGSQL AS
$$
DECLARE maxid oid;
BEGIN
IF new."newID" IS NOT NULL THEN
RETURN new; -- do nothing
END IF;
SELECT max("ID") INTO maxid FROM tbl_ids WHERE "Name" = new."Name";
new."newID" = maxid;
RETURN new;
END;
$$;
That works with oids and ints. However it has to select a row from the db on each row modified by the trigger so you will have performance problems with bulk inserts for example.
Oh, and far better to use all lower case so you don't have to quote every identifier.

Related

How to migrate oracle's pipelined function into PostgreSQL

As I am newbie to plpgSQL,
I stuck while migrating a Oracle query into PostgreSQL.
Oracle query:
create or replace FUNCTION employee_all_case(
p_ugr_id IN integer,
p_case_type_id IN integer
)
RETURN number_tab_t PIPELINED
-- LANGUAGE 'plpgsql'
-- COST 100
-- VOLATILE
-- AS $$
-- DECLARE
is
l_user_id NUMBER;
l_account_id NUMBER;
BEGIN
l_user_id := p_ugr_id;
l_account_id := p_case_type_id;
FOR cases IN
(SELECT ccase.case_id, ccase.employee_id
FROM ct_case ccase
INNER JOIN ct_case_type ctype
ON (ccase.case_type_id=ctype.case_type_id)
WHERE ccase.employee_id = l_user_id)
LOOP
IF cases.employee_id IS NOT NULL THEN
PIPE ROW (cases.case_id);
END IF;
END LOOP;
RETURN;
END;
--$$
When I execute this function then I get the following result
select * from table(select employee_all_case(14533,1190) from dual)
My question here is: I really do not understand the pipelined function and how can I obtain the same result in PostgreSQL as Oracle query ?
Please help.
Thank you guys, your solution was very helpful.
I found the desire result:
-- select * from employee_all_case(14533,1190);
-- drop function employee_all_case
create or replace FUNCTION employee_all_case(p_ugr_id IN integer ,p_case_type_id IN integer)
returns table (case_id double precision)
-- PIPELINED
LANGUAGE 'plpgsql'
COST 100
VOLATILE
AS $$
DECLARE
-- is
l_user_id integer;
l_account_id integer;
BEGIN
l_user_id := cp_lookup$get_user_id_from_ugr_id(p_ugr_id);
l_account_id := cp_lookup$acctid_from_ugr(p_ugr_id);
RETURN QUERY SELECT ccase.case_id
FROM ct_case ccase
INNER JOIN ct_case_type ctype ON ccase.case_type_id = ctype.case_type_id
WHERE ccase.employee_id = p_ugr_id
and ccase.employee_id IS NOT NULL;
--return NEXT;
END;
$$
You would rewrite that to a set returning function:
Change the return type to
RETURNS SETOF integer
and do away with the PIPELINED.
Change the PIPE ROW statement to
RETURN NEXT cases.case_id;
Of course, you will have to do the obvious syntactic changes, like using integer instead of NUMBER and putting the IN before the parameter name.
But actually, it is quite unnecessary to write a function for that. Doing it in a single SELECT statement would be both simpler and faster.
Pipelined functions are best translated to a simple SQL function returning a table.
Something like this:
create or replace function employee_all_case(p_ugr_id integer, p_case_type_IN integer)
returns table (case_id integer)
as
$$
SELECT ccase.case_id
FROM ct_case ccase
INNER JOIN ct_case_type ctype ON ccase.case_type_id = ctype.case_type_id
WHERE ccase.employee_id = p_ugr_id
and cases.employee_id IS NOT NULL;
$$
language sql;
Note that your sample code did not use the second parameter p_case_type_id.
Usage is also more straightforward:
select *
from employee_all_case(14533,1190);
Before diving into the solution, I will provide some information which will help you to understand better.
So basically PIPELINED came into picture for improving memory allocation at run time.
As you all know collections will occupy space when ever they got created. So the more you use, the more memory will get allocated.
Pipelining negates the need to build huge collections by piping rows out of the function.
saving memory and allowing subsequent processing to start before all the rows are generated.
Pipelined table functions include the PIPELINED clause and use the PIPE ROW call to push rows out of the function as soon as they are created, rather than building up a table collection.
By using Pipelined how memory usage will be optimized?
Well, it's very simple. instead of storing data into an array, just process the data by using pipe row(desired type). This actually returns the row and process the next row.
coming to solution in plpgsql
simple but not preferred while storing large data.
Remove PIPELINED from return declaration and return an array of desired type. something like RETURNS typrec2[].
Where ever you are using pipe row(), add that entry to array and finally return that array.
create a temp table like
CREATE TEMPORARY TABLE temp_table (required fields) ON COMMIT DROP;
and insert data into it. Replace pipe row with insert statement and finally return statement like
return query select * from temp_table
**The best link for understanding PIPELINED in oracle [https://oracle-base.com/articles/misc/pipelined-table-functions]
pretty ordinary for postgres reference [http://manojadinesh.blogspot.com/2011/11/pipelined-in-oracle-as-well-in.html]
Hope this helps some one conceptually.

Calling a function for each updated row in postgresql

I have a sql UPDATE statement in a plpgsql function. I now want to call the pg_notify function for each updated row and am uncertain if my solution is the best possibility.
I am not aware of any position in the UPDATE statement itself where I could apply the function. I don't think it is possible in the SET part and if I would apply the function in the WHERE part, it would be applied to each row as it is checked and not only the updated rows, correct?
I therefore thought I could use the RETURNING part for my purposes and designed the function like this:
CREATE OR REPLACE FUNCTION function_name() RETURNS VOID AS $BODY$
BEGIN
UPDATE table1
SET a = TRUE
FROM table2
WHERE table1.b = table2.c
AND <more conditions>
RETURNING pg_notify('notification_name', table1.pk);
END;
$BODY$ LANGUAGE 'plpgsql' VOLATILE;
Unfortunately this gave me an error saying that I am not using or storing the return value of the query anywhere. I therefore tried putting PERFORM in front of the query but this seemed to be syntactically incorrect.
After trying different combinations with PERFORM my ultimate solution is this:
CREATE OR REPLACE FUNCTION function_name() RETURNS VOID AS $BODY$
DECLARE
dev_null INTEGER;
BEGIN
WITH updated AS (
UPDATE table1
SET a = TRUE
FROM table2
WHERE table1.b = table2.c
AND <more conditions>
RETURNING pg_notify('notification_name', table1.pk)
)
SELECT 1 INTO dev_null;
END;
$BODY$ LANGUAGE 'plpgsql' VOLATILE;
This works as it is supposed to, but I feel like there should be a better solution which does not temporarily store a useless result and does not use a useless variable.
Thank you for your help.
** EDIT 1 **
As can be seen in #pnorton 's answer, a trigger would do the trick in most cases. For me, however, it is not applicable as the receiver of the notifications also sometimes updates the table and I do not want to generate notifications in such a case
"I have a sql UPDATE statement in a plpgsql function. I now want to
call the pg_notify function for each updated row "
Ok I might be tempted to use a trigger Eg
CREATE TABLE foobar (id serial primary key, name varchar);
CREATE OR REPLACE FUNCTION notify_trigger() RETURNS trigger AS $$
DECLARE
BEGIN
PERFORM pg_notify('watch_tb_update', TG_TABLE_NAME || ',id,' || NEW.id );
RETURN new;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER foobar_trigger AFTER INSERT ON foobar
FOR EACH ROW EXECUTE PROCEDURE notify_trigger();
LISTEN watch_tb_update;
INSERT into foobar(id, name) values(1,'test_name');
I've tested this and it works fine

PostgreSQL insert or update trigger function volatility category

Assume, i have 2 tables in my DB (postgresql-9.x)
CREATE TABLE FOLDER (
KEY BIGSERIAL PRIMARY KEY,
PATH TEXT,
NAME TEXT
);
CREATE TABLE FOLDERFILE (
FILEID BIGINT,
PATH TEXT,
PATHKEY BIGINT
);
I automatically update FOLDERFILE.PATHKEY from FOLDER.KEY whenever i insert into or update FOLDERFILE:
CREATE OR REPLACE FUNCTION folderfile_fill_pathkey() RETURNS trigger AS $$
DECLARE
pathkey bigint;
changed boolean;
BEGIN
IF tg_op = 'INSERT' THEN
changed := TRUE;
ELSE IF old.FILEID != new.FILEID THEN
changed := TRUE;
END IF;
END IF;
IF changed THEN
SELECT INTO pathkey key FROM FOLDER WHERE PATH = new.path;
IF FOUND THEN
new.pathkey = pathkey;
ELSE
new.pathkey = NULL;
END IF;
END IF;
RETURN new;
END
$$ LANGUAGE plpgsql VOLATILE;
CREATE TRIGGER folderfile_fill_pathkey_trigger AFTER INSERT OR UPDATE
ON FOLDERFILE FOR EACH ROW EXECUTE PROCEDURE fcliplink_fill_pathkey();
So the question is about function folderfile_fill_pathkey() volatility. Documentations says
Any function with side-effects must be labeled VOLATILE
But as far as i understand – this function does not change any data in the tables it rely on, so i can mark this function as IMMUTABLE. It that correct?
Would there be any problem with IMMUTABLE trigger function if I bulk-insert many rows into FOLDERFILE within the same transaction, like:
BEGIN;
INSERT INTO FOLDERFILE ( ... );
...
INSERT INTO FOLDERFILE ( ... );
COMMIT;
Firstly, as #pozs already pointed out, the function definition you have provided is most definitely STABLE rather than IMMUTABLE since it performs database look-ups. This means that the result is not simply derived from the input parameters (as IMMUTABLE would suggest), but also from the data stored in your FOLDER table (which is bound to change). As per the documentation:
STABLE indicates that the function cannot modify the database, and
that within a single table scan it will consistently return the same
result for the same argument values, but that its result could change
across SQL statements. This is the appropriate selection for functions
whose results depend on database lookups, parameter variables (such as
the current time zone), etc.
Secondly, adding stability modifiers (IMMUTABLE/STABLE/VOLATILE) to your trigger functions serves an illustrative purpose at best, since AFAIK PostgreSQL doesn't actually perform any planning that would warrant their use. The following post from the pgsql-hackers mailing list seems to support my claim:
Volatility is a complete no-op for a trigger function anyway, as are
other planner parameters such as cost/rows, because there is no
planning involved in trigger calls.
To sum up: you're probably better off avoiding the stability keywords in your trigger(!) procedures for now, since including them seems to add little to no benefit but entails several unexpected caveats/pitfalls (see the end of #pozs's first comment).

How to kick off triggers without a seemly redundant update statement?

I have a simple table with a primary key, timestamp & count.
I have triggers to auto-update timestamp & count before the update event as is standard.
To execute the triggers, I have to execute an event (e.g. update). Although it works to perform the standard update, I'm not entirely comfortable with it as it seems redundant.
update users set username = 'johndoe' where username = 'johndoe';
Explicitly updating the fields would feel better from an SQL perspective but I'd rather leave the auto-updating to the triggers so the codebase is nicely separated from the schema implementation (for later upgrades).
Is there a way to kick-off all associated triggers on a table row without using update? Or is this an ok solution? Will a future database update refuse the transaction since nothing is changing?
Thanks!
/* update_timestamp function to call from trigger */
create function update_timestamp() returns trigger as $$
begin
NEW.timestamp := current_timestamp;
return NEW;
end;
$$ language plpgsql;
/* update_count function to call from trigger */
create function update_count() returns trigger as $$
begin
NEW.count := OLD.count + 1;
return NEW;
end;
$$ language plpgsql;
/* users table */
create table users(
username character varying(50) not null,
timestamp timestamp not null default current_timestamp,
count bigint not null default 1);
/* timestamp & count triggers */
create trigger users_timestamp_upd before update on users for each row execute procedure update_timestamp();
create trigger users_count_upd before update on users for each row execute procedure update_count();
Last question first:
Will a future database update refuse the transaction since nothing is changing?
No. This is perfectly valid SQL syntax. Refusing it would be going backwards in SQL Standard support, which is highly irregular for any production-ready RDBMS. Furthermore, the standard requires that BEFORE UPDATE triggers run on all affected rows, even if the rows have not actually changed.
Is there a way to kick-off all associated triggers on a table row without using update? Or is this an ok solution?
This is a reasonable solution so far as it goes, but I would call this a code smell. Triggers, in general, are not relational. A purely relational database is easier to reason about. In a purely relational database, you wouldn't be doing something like this. So you should ask yourself whether the triggers were a good idea to begin with. Of course, the answer may well be "yes, because there's no other reasonable way of doing this." But you should actually consider it, rather than just assuming this is the case.
Thanks. Decided to go with a function rather than triggers. Calling it directly from the PHP.
create or replace function add_update_user(varchar) returns void as $$
begin
if exists (select 1 from users where username = $1) then
update users set timestamp = current_timestamp where username = $1;
update users set count = count + 1 where username = $1;
else
insert into users (username) values ($1);
end if;
end;
$$ language plpgsql;
create table users(
username character varying(50) not null,
timestamp timestamp not null default current_timestamp,
count bigint not null default 1);
select add_update_user('testusername');

Is this generic MERGE/UPSERT function for PostgreSQL safe?

I have created a "merge" function which is supposed to execute either an UPDATE or an INSERT query, depending on existing data. Instead of writing an upsert-wrapper for each table (as in most of the available examples), this function takes entire SQL strings. Both of the SQL strings are automatically generated by our application.
The plan is to call the function like this:
-- hypothetical "settings" table, with a primary key of (user_id, setting):
SELECT merge(
$$UPDATE settings SET value = 'x' WHERE user_id = 42 AND setting = 'foo'$$,
$$INSERT INTO settings (user_id, setting, value) VALUES (42, 'foo', 'x')$$
);
Here's the full code of the merge() function:
CREATE OR REPLACE FUNCTION merge (update_sql TEXT, insert_sql TEXT) RETURNS TEXT AS
$func$
DECLARE
max_iterations INTEGER := 10;
i INTEGER := 0;
num_updated INTEGER;
BEGIN
-- usually returns before re-entering the loop
LOOP
-- first try the update
EXECUTE update_sql;
GET DIAGNOSTICS num_updated = ROW_COUNT;
IF num_updated > 0 THEN
RETURN 'UPDATE';
END IF;
-- nothing was updated: try the insert, watching out for concurrent inserts
BEGIN
EXECUTE insert_sql;
RETURN 'INSERT';
EXCEPTION WHEN unique_violation THEN
-- nop; just loop and try again from the top
END;
-- emergency brake
i := i + 1;
IF i >= max_iterations THEN
RAISE EXCEPTION 'merge(): tried looping % times, giving up now.', i;
EXIT;
END IF;
END LOOP;
END;
$func$
LANGUAGE plpgsql;
It appears to work well enough in my tests, but I'm not certain if I haven't missed anything crucial, especially regarding concurrent UPDATE/INSERT/DELETE queries, which may be issued without using this function. Did I overlook anything important?
Among the resources I consulted for this function are:
UPDATE/INSERT example 40.2 in the PostgreSQL manual
Why is UPSERT so complicated?
SO: Insert, on duplicate update (postgresql)
(Edit: one of the goals was to avoid locking the target table.)
The answer to your question depends your the context of how your application(s) will access the database. There are many ways to solve this as nicely discussed in depesz's post you cited by yourself. In addition you might want to also consider using writeable CTEs see here. Also the [question]Insert, on duplicate update in PostgreSQL? has some interesting discussions for your decision making process.