Generic string trimming trigger for Postgresql - postgresql

Problem:
One of the owners of the company that I work for has direct database access. He uses Navicat on a windows notebook. Apparently, it has a feature that he likes where he can import data from Excel. The problem is that text fields often (or maybe always) end up with a \r\n at the end of them. Which can lead to display, reporting and filtering issues. I've been asked to clean this up and to stop him from doing it.
I know I can just add a trigger to each table that will do something like:
NEW.customer_name := regexp_replace(NEW.customer_name, '\r\n', '', 'g');
However, I would prefer to not write a separate trigger function for each table that he has access to (there are over 100). My idea was to just write a generic function and then pass in an array of column names I want corrected via the TG_ARGV[] argument.
Is there a way to update a triggers NEW record dynamically based on the TG_ARGV array?
Details:
I'm using PostgreSQL 9.6.6 on x86_64-pc-linux-gnu

There is no native means to dynamically access the columns of the new record in a plpgsql trigger function. The only way I know is to convert the record to jsonb, modify it and convert it back to record using jsonb_populate_record():
create or replace function a_trigger()
returns trigger language plpgsql as $$
declare
j jsonb = to_jsonb(new);
arg text;
begin
foreach arg in array tg_argv loop
if j->>arg is not null then
j = j || jsonb_build_object(arg, regexp_replace(j->>arg, e'\r\n', '', 'g'));
end if;
end loop;
new = jsonb_populate_record(new, j);
return new;
end;
$$;
The case is much simpler if you can use plpython:
create or replace function a_trigger()
returns trigger language plpython3u as $$
import re
new = TD["new"]
for col in TD["args"]:
new[col] = re.sub(r"\r\n", "", new[col])
return "MODIFY"
$$;

Related

postgresql CONCAT function error when use in a trigger

This might be a stupid question but pardon me, I'm trying to convert one of my MariaDB database into a PostgreSQL database. Here I'm getting an error while executing this function.
I cannot find what's wrong here,
create function tg_prodcut_insert()
returns trigger as '
BEGIN
SET NEW.id = CONCAT(1, LPAD(INSERT INTO product_seq VALUES (NULL) returning id, 6, 0));
END;
' LANGUAGE 'plpgsql';
Error is pointing to the 1 in CONCAT method, The type of id I'm trying to SET is char(7)
EDIT
I also tried this, this won't work either,
create function tg_orders_insert()
returns trigger as '
BEGIN
INSERT INTO order_seq VALUES (NULL);
SET NEW.id = CONCAT('1', LPAD(LAST_INSERT_ID(), 6, 0));
END;
' LANGUAGE 'plpgsql';
Thanks in advance.
It seems you are trying to simulate some kind of sequence with that code by inserting into a table and then getting the auto_increment value from that.
This can be done much more efficiently using a sequence in Postgres.
The error you get also isn't caused by the concat() function but because you are using the wrong syntax.
Value assignment is done using := in PL/pgSQL.
And there is also no last_insert_id() function in Postgres. To get the next value from a sequence use nextval(), to get the most recently generated value, you can use lastval() but that's not necessary here.
create sequence product_id_seq;
create function tg_product_insert()
returns trigger as
$$
BEGIN
NEW.id := concat('ORD', to_char(nextval('product_id_seq'), 'FM00000000'));
return new;
END;
$$
LANGUAGE plpgsql;
you will need to create a before trigger for that to work:
create trigger product_seq_trigger
before insert on product
for each row
execute procedure tg_product_insert();
Online example
But it would be a lot more efficient to switch to a proper identity column instead and get rid of the trigger.

Postgresql - Function to prevent UPDATE from occurring

Using PostgreSQL 11.6. I want to prevent an UPDATE to occur on a given column, if a different column data meets certain criteria. I figure the best way is via an Event Trigger, before update.
Goal: if column 'sysdescr' = 'no_response' then do NOT update column 'snmp_community'.
What I tried in my function below is to skip/pass on a given update when that criteria is met. But it is preventing any updates, even when the criteria doesn't match.
CREATE OR REPLACE FUNCTION public.validate_sysdescr()
RETURNS trigger
LANGUAGE plpgsql
AS $function$BEGIN
IF NEW.sysdescr = 'no_response' THEN
RETURN NULL;
ELSE
RETURN NEW;
END IF;
END;
$function$;
Note: I was thinking using some type of 'skip' action may be best, to make the function more re-usable. But if I need to call out the specific column to not update (snmp_community) that's fine.
Change the procedure to:
CREATE OR REPLACE FUNCTION public.validate_sysdescr()
RETURNS trigger
LANGUAGE plpgsql
AS $function$BEGIN
IF NEW.sysdescr = 'no_response' THEN
NEW.snmp_community = OLD.snmp_community ;
END IF;
RETURN NEW;
END;
$function$;
And associate it to a common on update trigger:
CREATE TRIGGER validate_sysdescr_trg BEFORE UPDATE ON <YOUR_TABLE>
FOR EACH ROW
EXECUTE PROCEDURE public.validate_sysdescr();

How to migrate oracle's pipelined function into PostgreSQL

As I am newbie to plpgSQL,
I stuck while migrating a Oracle query into PostgreSQL.
Oracle query:
create or replace FUNCTION employee_all_case(
p_ugr_id IN integer,
p_case_type_id IN integer
)
RETURN number_tab_t PIPELINED
-- LANGUAGE 'plpgsql'
-- COST 100
-- VOLATILE
-- AS $$
-- DECLARE
is
l_user_id NUMBER;
l_account_id NUMBER;
BEGIN
l_user_id := p_ugr_id;
l_account_id := p_case_type_id;
FOR cases IN
(SELECT ccase.case_id, ccase.employee_id
FROM ct_case ccase
INNER JOIN ct_case_type ctype
ON (ccase.case_type_id=ctype.case_type_id)
WHERE ccase.employee_id = l_user_id)
LOOP
IF cases.employee_id IS NOT NULL THEN
PIPE ROW (cases.case_id);
END IF;
END LOOP;
RETURN;
END;
--$$
When I execute this function then I get the following result
select * from table(select employee_all_case(14533,1190) from dual)
My question here is: I really do not understand the pipelined function and how can I obtain the same result in PostgreSQL as Oracle query ?
Please help.
Thank you guys, your solution was very helpful.
I found the desire result:
-- select * from employee_all_case(14533,1190);
-- drop function employee_all_case
create or replace FUNCTION employee_all_case(p_ugr_id IN integer ,p_case_type_id IN integer)
returns table (case_id double precision)
-- PIPELINED
LANGUAGE 'plpgsql'
COST 100
VOLATILE
AS $$
DECLARE
-- is
l_user_id integer;
l_account_id integer;
BEGIN
l_user_id := cp_lookup$get_user_id_from_ugr_id(p_ugr_id);
l_account_id := cp_lookup$acctid_from_ugr(p_ugr_id);
RETURN QUERY SELECT ccase.case_id
FROM ct_case ccase
INNER JOIN ct_case_type ctype ON ccase.case_type_id = ctype.case_type_id
WHERE ccase.employee_id = p_ugr_id
and ccase.employee_id IS NOT NULL;
--return NEXT;
END;
$$
You would rewrite that to a set returning function:
Change the return type to
RETURNS SETOF integer
and do away with the PIPELINED.
Change the PIPE ROW statement to
RETURN NEXT cases.case_id;
Of course, you will have to do the obvious syntactic changes, like using integer instead of NUMBER and putting the IN before the parameter name.
But actually, it is quite unnecessary to write a function for that. Doing it in a single SELECT statement would be both simpler and faster.
Pipelined functions are best translated to a simple SQL function returning a table.
Something like this:
create or replace function employee_all_case(p_ugr_id integer, p_case_type_IN integer)
returns table (case_id integer)
as
$$
SELECT ccase.case_id
FROM ct_case ccase
INNER JOIN ct_case_type ctype ON ccase.case_type_id = ctype.case_type_id
WHERE ccase.employee_id = p_ugr_id
and cases.employee_id IS NOT NULL;
$$
language sql;
Note that your sample code did not use the second parameter p_case_type_id.
Usage is also more straightforward:
select *
from employee_all_case(14533,1190);
Before diving into the solution, I will provide some information which will help you to understand better.
So basically PIPELINED came into picture for improving memory allocation at run time.
As you all know collections will occupy space when ever they got created. So the more you use, the more memory will get allocated.
Pipelining negates the need to build huge collections by piping rows out of the function.
saving memory and allowing subsequent processing to start before all the rows are generated.
Pipelined table functions include the PIPELINED clause and use the PIPE ROW call to push rows out of the function as soon as they are created, rather than building up a table collection.
By using Pipelined how memory usage will be optimized?
Well, it's very simple. instead of storing data into an array, just process the data by using pipe row(desired type). This actually returns the row and process the next row.
coming to solution in plpgsql
simple but not preferred while storing large data.
Remove PIPELINED from return declaration and return an array of desired type. something like RETURNS typrec2[].
Where ever you are using pipe row(), add that entry to array and finally return that array.
create a temp table like
CREATE TEMPORARY TABLE temp_table (required fields) ON COMMIT DROP;
and insert data into it. Replace pipe row with insert statement and finally return statement like
return query select * from temp_table
**The best link for understanding PIPELINED in oracle [https://oracle-base.com/articles/misc/pipelined-table-functions]
pretty ordinary for postgres reference [http://manojadinesh.blogspot.com/2011/11/pipelined-in-oracle-as-well-in.html]
Hope this helps some one conceptually.

Recursive function postgres to get the latest ID

I want to use a function in PostgreSQL to get the latest ID related to a history:
CREATE TABLE "tbl_ids" (
"ID" oid,
"Name" text,
"newID" oid
);
After creating this simple table, I have no idea where to start my function, and before you ask: I know about COALESCE()-function, but I'm going to have more then one parent-ID in the future.
CREATE FUNCTION get_lastes_id(ID oid, newID oid) RETURNS oid AS $$
BEGIN
IF new IS NOT NULL THEN
--USE old--
END
IF new IS NULL THEN
get_latest_id(new, "newID")
END
END;
I gotta say it because you'd find out anyway: I'm really new in functions with PostgreSQL and I'm not even sure if this is possible. But assuming COALESCE()-Function also exists it has to be a server-side function I guess.
First, it is not clear what you are asking. oid's are probably not the best type to use primarily because they are an internal type designed for the system libraries and therefore you cannot guarantee they will act the way you expect.
Secondly this seems to me to be a poor choice tools if you want to use recursion to just get the latest. If you want things to perform well, try to think in set operations rather than imparitive algorithms.
If you want a trigger to get the latest (maximum) oid for a name and assign it to "newID" then:
CREATE OR REPLACE FUNCTION set_newID() RETURNS TRIGGER LANGUAGE PLPGSQL AS
$$
DECLARE maxid oid;
BEGIN
IF new."newID" IS NOT NULL THEN
RETURN new; -- do nothing
END IF;
SELECT max("ID") INTO maxid FROM tbl_ids WHERE "Name" = new."Name";
new."newID" = maxid;
RETURN new;
END;
$$;
That works with oids and ints. However it has to select a row from the db on each row modified by the trigger so you will have performance problems with bulk inserts for example.
Oh, and far better to use all lower case so you don't have to quote every identifier.

Postgresql - remove unwanted characters from data stream using stored procedure or triggers?

I have a data stream coming from a machine and this is stored in a Postgresql DB. I need to strip out various unwanted characters and keep both the original result and the new result. eg I have data "34.5 !*" or "17.9 P-" in a field and want to store "34.5" or "17.9" . I was wondering about using a trigger to call a procedure to write the data minus the unwanted characters to a new field...
You can easily do that in a trigger with a regular expression, something like:
create or replace function clean_value()
returns trigger
language plpgsql
AS
$body$
begin
new.clean_column = regexp_replace(new.dirty_column, '[^0-9\.]', '', 'g');
return new;
end;
$body$
/
That will store a "clean" version of the input dirty_column into the column clean_column.