PostgreSQL recursive parent/child query

I'm having some trouble working out the PostgreSQL documentation for recursive queries, and wonder if anyone might be able to offer a suggestion for the following.
Here's the data:
Table "public.subjects"
Column | Type | Collation | Nullable | Default
-------------------+-----------------------------+-----------+----------+--------------------------------------
id | bigint | | not null | nextval('subjects_id_seq'::regclass)
name | character varying | | |
Table "public.subject_associations"
Column | Type | Collation | Nullable | Default
------------+-----------------------------+-----------+----------+--------------------------------------------------
id | bigint | | not null | nextval('subject_associations_id_seq'::regclass)
parent_id | integer | | |
child_id | integer | | |
Here, a "subject" may have many parents and many children. Of course, at the top level a subject has no parents and at the bottom no children. For example:
parent_id | child_id
------------+------------
2 | 3
1 | 4
1 | 3
4 | 8
4 | 5
5 | 6
6 | 7
Starting from a child_id I want to get all of its ancestors, and starting from a parent_id I want to get all of its descendants. Therefore:
parent_id 1 -> children 3, 4, 5, 6, 7, 8
parent_id 2 -> children 3
child_id 3 -> parents 1, 2
child_id 4 -> parents 1
child_id 7 -> parents 6, 5, 4, 1
Although there are plenty of similar examples around, I'm having trouble making sense of them, so any suggestions I can try out would be welcome.

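To get all descendants of subject 1, you can use the following query. Note that the result also contains the starting id 1 itself; filter it out in the final SELECT if you don't want it.
WITH RECURSIVE c AS (
SELECT 1 AS id
UNION ALL
SELECT sa.child_id
FROM subject_associations AS sa
JOIN c ON c.id = sa.parent_id
)
SELECT id FROM c;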
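The question also asks for the opposite direction (all ancestors of a child). The sketch below is one way to try both directions on the question's sample data. It assumes Python's bundled sqlite3 module as a stand-in for PostgreSQL (the WITH RECURSIVE syntax used here is the same in both), and the function names descendants and ancestors are made up for illustration. It uses UNION rather than UNION ALL so that a subject reachable through several parents is returned only once.

```python
import sqlite3

# Edge list copied from the question: (parent_id, child_id).
EDGES = [(2, 3), (1, 4), (1, 3), (4, 8), (4, 5), (5, 6), (6, 7)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subject_associations (parent_id INTEGER, child_id INTEGER)"
)
conn.executemany("INSERT INTO subject_associations VALUES (?, ?)", EDGES)

# Walk downwards: start with the direct children, then follow parent -> child.
DESCENDANTS = """
WITH RECURSIVE c(id) AS (
    SELECT child_id FROM subject_associations WHERE parent_id = ?
    UNION
    SELECT sa.child_id
    FROM subject_associations AS sa
    JOIN c ON c.id = sa.parent_id
)
SELECT id FROM c ORDER BY id
"""

# Walk upwards: start with the direct parents, then follow child -> parent.
ANCESTORS = """
WITH RECURSIVE p(id) AS (
    SELECT parent_id FROM subject_associations WHERE child_id = ?
    UNION
    SELECT sa.parent_id
    FROM subject_associations AS sa
    JOIN p ON p.id = sa.child_id
)
SELECT id FROM p ORDER BY id
"""


def descendants(subject_id):
    """Hypothetical helper: ids of every subject below subject_id."""
    return [row[0] for row in conn.execute(DESCENDANTS, (subject_id,))]


def ancestors(subject_id):
    """Hypothetical helper: ids of every subject above subject_id."""
    return [row[0] for row in conn.execute(ANCESTORS, (subject_id,))]
```

Because the anchor query starts from the direct children (or parents) rather than from the starting id, the starting subject is not part of the result.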

CREATE OR REPLACE FUNCTION func_finddescendants(start_id integer)
RETURNS SETOF subject_associations
AS $$
BEGIN
RETURN QUERY
WITH RECURSIVE t
AS
(
-- Anchor: associations whose parent is the starting subject
SELECT sa.*
FROM subject_associations sa
WHERE sa.parent_id = start_id
UNION ALL
-- Step: associations whose parent is a child found so far
SELECT next.*
FROM t prev
JOIN subject_associations next ON (next.parent_id = prev.child_id)
)
SELECT * FROM t;
END;
$$ LANGUAGE plpgsql;

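Try this self-contained example, which stores the hierarchy in a single table with a parent_id column and collects the descendants level by level: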
--- Table
-- DROP SEQUENCE public.data_id_seq;
CREATE SEQUENCE "data_id_seq"
INCREMENT 1
MINVALUE 1
MAXVALUE 9223372036854775807
START 1
CACHE 1;
ALTER TABLE public.data_id_seq
OWNER TO postgres;
CREATE TABLE public.data
(
id integer NOT NULL DEFAULT nextval('data_id_seq'::regclass),
name character varying(50) NOT NULL,
label character varying(50) NOT NULL,
parent_id integer NOT NULL,
CONSTRAINT data_pkey PRIMARY KEY (id),
CONSTRAINT data_name_parent_id_unique UNIQUE (name, parent_id)
)
WITH (
OIDS=FALSE
);
INSERT INTO public.data(id, name, label, parent_id) VALUES (1,'animal','Animal',0);
INSERT INTO public.data(id, name, label, parent_id) VALUES (5,'birds','Birds',1);
INSERT INTO public.data(id, name, label, parent_id) VALUES (6,'fish','Fish',1);
INSERT INTO public.data(id, name, label, parent_id) VALUES (7,'parrot','Parrot',5);
INSERT INTO public.data(id, name, label, parent_id) VALUES (8,'barb','Barb',6);
--- Function
CREATE OR REPLACE FUNCTION public.get_all_children_of_parent(use_parent integer) RETURNS integer[] AS
$BODY$
DECLARE
process_parents INT4[] := ARRAY[ use_parent ];
children INT4[] := '{}';
new_children INT4[];
BEGIN
WHILE ( array_upper( process_parents, 1 ) IS NOT NULL ) LOOP
new_children := ARRAY( SELECT id FROM data WHERE parent_id = ANY( process_parents ) AND id <> ALL( children ) );
children := children || new_children;
process_parents := new_children;
END LOOP;
RETURN children;
END;
$BODY$
LANGUAGE plpgsql VOLATILE COST 100;
ALTER FUNCTION public.get_all_children_of_parent(integer) OWNER TO postgres;
--- Test
SELECT * FROM data WHERE id = any(get_all_children_of_parent(1));
SELECT * FROM data WHERE id = any(get_all_children_of_parent(5));
SELECT * FROM data WHERE id = any(get_all_children_of_parent(6));
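To see what the loop in that function does, here is a rough Python rendering of the same level-by-level walk. It is only a model of the logic, not a replacement for the PL/pgSQL function: the table is represented as a list of (id, parent_id) pairs taken from the INSERT statements above, and the variable names mirror the ones in the function.

```python
# (id, parent_id) pairs copied from the INSERT statements above.
DATA = [(1, 0), (5, 1), (6, 1), (7, 5), (8, 6)]


def get_all_children_of_parent(use_parent, rows=DATA):
    """Collect descendant ids one level at a time, like the PL/pgSQL loop."""
    process_parents = [use_parent]
    children = []
    # Stop as soon as a level produces no new ids.
    while process_parents:
        new_children = [
            row_id
            for row_id, parent_id in rows
            if parent_id in process_parents and row_id not in children
        ]
        children += new_children
        # The ids found on this level become the parents for the next level.
        process_parents = new_children
    return children
```

The check against ids already collected is what keeps the loop from running forever if the data ever contains a cycle.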

Related

Create function in postgresql to update column values from a table with preferred values and aliases

I want to create a function that will update a column of type varchar to a preferred string that is referenced in the column of another table to help me clean this column more iteratively.
CREATE TABLE big_table (
mn_uid NUMERIC PRIMARY KEY,
user_name VARCHAR
);
INSERT INTO big_table VALUES
(1, 'DAVE'),
(2, 'Dave'),
(3, 'david'),
(4, 'Jak'),
(5, 'jack'),
(6, 'Jack'),
(7, 'Grant');
CREATE TABLE nameKey_table (
nk_uid NUMERIC PRIMARY KEY,
correct VARCHAR,
wrong VARCHAR
);
INSERT INTO nameKey_table VALUES
(1, 'David', 'Dave_DAVE_dave_DAVID_david'),
(2, 'Jack', 'JACK_jack_Jak_jak');
I want to perform the following procedure:
UPDATE big_table
SET user_name = (SELECT correct
FROM nameKey_table
WHERE wrong
LIKE '%DAVE%')
WHERE user_name = 'DAVE';
but looped over each user_name in big_table so that I have a function that can do something like this:
UPDATE big_table SET user_name = corrected_name_fn();
Here is my attempt to do something like this but I can't seem to get it to work:
CREATE FUNCTION corrected_name_fn() RETURNS VARCHAR AS $$
DECLARE entry RECORD;
DECLARE correct_name VARCHAR;
BEGIN
FOR entry IN SELECT DISTINCT user_name FROM big_table LOOP
EXECUTE 'SELECT correct
FROM nameKey_table
WHERE wrong
LIKE ''%$1%'''
INTO correct_name
USING entry;
RETURN correct_name;
END LOOP;
END;
$$ LANGUAGE plpgsql;
I want the final output in big_table to be:
| mn_uid | user_name |
| 1 | 'David' |
| 2 | 'David' |
| 3 | 'David' |
| 4 | 'Jack' |
| 5 | 'Jack' |
| 6 | 'Jack' |
| 7 | 'Grant' |
I realize rows 6 and 7 provide two unique cases that I want to build into the function with IF ELSE statements.
If user_name is in nameKey_table.correct, go to next
If user_name is not in nameKey_table.correct or does not match a string in nameKey_table.wrong, leave as is.
Thanks for any help on this!!
It sounds like you want a trigger on the table. Here is my suggestion:
CREATE OR REPLACE FUNCTION tf_fix_name() RETURNS TRIGGER AS
$$
DECLARE
corrected_name TEXT;
BEGIN
SELECT correct INTO corrected_name FROM nameKey_table WHERE NEW.user_name ~* expression;
IF FOUND THEN
NEW.user_name := corrected_name;
END IF;
RETURN NEW;
END;
$$
LANGUAGE plpgsql;
CREATE TEMP TABLE big_table (
mn_uid INT PRIMARY KEY,
user_name TEXT NOT NULL
);
CREATE TRIGGER trigger_fix_name
BEFORE INSERT
ON big_table
FOR EACH ROW
EXECUTE PROCEDURE tf_fix_name();
CREATE TEMP TABLE nameKey_table (
nk_uid INT PRIMARY KEY,
correct TEXT NOT NULL,
expression TEXT NOT NULL
);
INSERT INTO nameKey_table VALUES
(1, 'David', '(dave|david)'),
(2, 'Jack', '(jack|jak)');
INSERT INTO big_table VALUES
(1, 'DAVE'),
(2, 'Dave'),
(3, 'david'),
(4, 'Jak'),
(5, 'jack'),
(6, 'Jack'),
(7, 'Grant');
SELECT * FROM big_table;
+--------+-----------+
| mn_uid | user_name |
+--------+-----------+
| 1 | David |
| 2 | David |
| 3 | David |
| 4 | Jack |
| 5 | Jack |
| 6 | Jack |
| 7 | Grant |
+--------+-----------+
(7 rows)
Note: I think you can do what you want much more easily with a case-insensitive regular expression. I also changed your primary keys to INTs. I'm not sure why they are NUMERICs, but it doesn't really change the solution. My solution was developed and tested on PostgreSQL 9.6.
You don't need a function; you can just update one table from the contents of another table:
UPDATE big_table dst
SET user_name = src.correct
FROM nameKey_table src
WHERE src.wrong LIKE '%' || dst.user_name || '%'
AND dst.user_name <> src.correct -- skip rows that are already correct
;
If you need performance, don't rely on the LIKE operator: it cannot use an ordinary index when the pattern starts with %. Instead, use a lookup table with one misspelling per row:
CREATE TABLE bad_spell (
correct VARCHAR,
wrong VARCHAR PRIMARY KEY -- This will cause a unique index to be created.
);
INSERT INTO bad_spell VALUES
('David', 'Dave')
,('David', 'DAVE')
,('David', 'dave')
,('David', 'DAVID')
,('David', 'david')
,('Jack', 'JACK')
,('Jack', 'jack')
,('Jack', 'Jak')
,('Jack', 'jak')
;
-- This index could be temporary
CREATE INDEX ON big_table(user_name);
-- EXPLAIN
UPDATE big_table dst
SET user_name = src.correct
FROM bad_spell src
WHERE dst.user_name = src.wrong
AND dst.user_name <> src.correct -- skip rows that are already correct
;
SELECT * FROM big_table
;
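If the misspellings already live in the underscore-separated wrong column from the question, the one-row-per-misspelling table does not have to be typed by hand. The Python sketch below models that split and the resulting lookup; build_lookup and correct_names are hypothetical helper names. In PostgreSQL the same split could presumably be written with unnest(string_to_array(wrong, '_')).

```python
# Rows copied from the question's nameKey_table: (correct, wrong).
NAME_KEYS = [
    ("David", "Dave_DAVE_dave_DAVID_david"),
    ("Jack", "JACK_jack_Jak_jak"),
]


def build_lookup(name_keys):
    """Return {misspelling: correct} with one entry per underscore-separated alias."""
    lookup = {}
    for correct, wrong in name_keys:
        for alias in wrong.split("_"):
            lookup[alias] = correct
    return lookup


def correct_names(names, lookup):
    """Replace known misspellings; leave every other name untouched."""
    return [lookup.get(name, name) for name in names]


LOOKUP = build_lookup(NAME_KEYS)
USER_NAMES = ["DAVE", "Dave", "david", "Jak", "jack", "Jack", "Grant"]
CORRECTED = correct_names(USER_NAMES, LOOKUP)
```

An exact-match lookup also covers the two special cases from the question without any IF/ELSE: a name that is already correct ('Jack') and a name that is unknown ('Grant') simply find no entry and stay as they are.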

Use array of IDs to insert records into table if it does not already exist

I have created a postgresql function that takes a comma separated list of ids as input parameter. I then convert this comma separated list into an array.
CREATE FUNCTION myFunction(csvIDs text)
RETURNS void AS $$
DECLARE ids INT[];
BEGIN
ids = string_to_array(csvIDs,',');
-- INSERT INTO tableA
END; $$
LANGUAGE PLPGSQL;
What I want to do now is INSERT a record into TableA for each of the ids in the array, if that id does not already exist in the table. The new records should have the value field set to 0.
The table is created like this:
CREATE TABLE TableA (
id int PRIMARY KEY,
value int
);
Is this possible to do?
You can use the unnest() function to get each element of your array.
create table tableA (id int);
insert into tableA values(13);
select t.ids
from (select unnest(string_to_array('12,13,14,15', ',')::int[]) ids) t;
| ids |
| --: |
| 12 |
| 13 |
| 14 |
| 15 |
Now you can check whether each id already exists before inserting a new row.
CREATE FUNCTION myFunction(csvIDs text)
RETURNS int AS
$myFunction$
DECLARE
r_count int;
BEGIN
insert into tableA
select t.ids
from (select unnest(string_to_array(csvIDs,',')::int[]) ids) t
where not exists (select 1 from tableA where id = t.ids);
GET DIAGNOSTICS r_count = ROW_COUNT;
return r_count;
END;
$myFunction$
LANGUAGE PLPGSQL;
select myFunction('12,13,14,15') as inserted_rows;
| inserted_rows |
| ------------: |
| 3 |
select * from tableA;
| id |
| -: |
| 13 |
| 12 |
| 14 |
| 15 |
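The demo table above has only an id column, while the table in the question also has a value column that should be set to 0. The sketch below shows the same "insert only if missing" idea with that column included. It assumes Python's sqlite3 module as a stand-in for PostgreSQL, and insert_missing_ids is a hypothetical name. Since id is the primary key in the question's table, PostgreSQL 9.5 or later could also use INSERT ... ON CONFLICT (id) DO NOTHING instead of the NOT EXISTS check.

```python
import sqlite3


def insert_missing_ids(conn, csv_ids):
    """Insert (id, 0) for every id in csv_ids not yet in TableA; return rows added."""
    ids = [int(part) for part in csv_ids.split(",") if part.strip()]
    before = conn.execute("SELECT COUNT(*) FROM TableA").fetchone()[0]
    # Each id is bound twice: once as the new row, once for the existence check.
    conn.executemany(
        "INSERT INTO TableA (id, value) "
        "SELECT ?, 0 WHERE NOT EXISTS (SELECT 1 FROM TableA WHERE id = ?)",
        [(i, i) for i in ids],
    )
    after = conn.execute("SELECT COUNT(*) FROM TableA").fetchone()[0]
    return after - before


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (id INTEGER PRIMARY KEY, value INTEGER)")
conn.execute("INSERT INTO TableA VALUES (13, 5)")  # pre-existing row, value kept

inserted = insert_missing_ids(conn, "12,13,14,15")
rows = conn.execute("SELECT id, value FROM TableA ORDER BY id").fetchall()
```

The existing row for id 13 keeps its value; only the three missing ids are added with value 0.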

How to build a rules table in sql server

Quite a few business rules are currently hardcoded within a stored procedure. I want to explore setting up a rules table in which we key in all the business rules, and have the stored procedure execute based on that table.
The real system is a little complicated, so I have provided a simplified version here.
Create table tblTest
(
TranID int primary key not null,
FName varchar(20) not null,
Age int not null,
Salary money not null,
MaritalStatus char(1) not null
)
Insert into tblTest values (1, 'Alex', 26, '25000.00','Y')
Insert into tblTest values (2, 'Brenda', 25, '14500.00','Y')
Insert into tblTest values (3, 'Peter', 69, '50000.00','N')
Insert into tblTest values (4, 'Paul', 64, '74500.00','Y')
Now, to keep the example simple, let's assume the business rules are the following:
1. Age >=25,
2. Age < 65 and
3. Salary > 15K
Create table tblBusRule
(
RuleID int Primary key not null,
ColName varchar(20) not null,
Operator varchar(2) not null,
ColValue varchar(10) not null,
RuleOrder int not null
)
Insert into tblBusRule values (1, 'Age', '>=', '25', 1)
Insert into tblBusRule values (2, 'Age', '<', '65', 2)
Insert into tblBusRule values (3, 'Salary', '>', '15000.00', 3)
The direct query would be something like this, which returns only records 1 (Alex) and 4 (Paul):
Select * from tblTest
where
age >=25 and
age < 65 and
salary > '15000.00'
Now, how can I make this dynamic, based on the rules stored in tblBusRule?
You can use the stuff() with select ... for xml path ('') method of string concatenation, together with sp_executesql:
declare @sql nvarchar(max), @where nvarchar(max);
set @where = stuff((
select ' and '+colname +' '+operator +' ' + colvalue+char(10)
from tblBusRule
order by RuleOrder
for xml path (''), type).value('.','nvarchar(max)')
,1,5,''); -- strip the leading ' and ' (5 characters)
set @sql = 'select * ' +char(10)+'from tblTest'+char(10)+'where '+@where;
select @sql as CodeGenerated;
exec sp_executesql @sql;
rextester demo: http://rextester.com/CGRF91788
returns:
+-------------------------+
| CodeGenerated |
+-------------------------+
| select * |
| from tblTest |
| where Age >= 25 |
| and Age < 65 |
| and Salary > 15000.00 |
+-------------------------+
+--------+-------+-----+------------+---------------+
| TranID | FName | Age | Salary | MaritalStatus |
+--------+-------+-----+------------+---------------+
| 1 | Alex | 26 | 25000,0000 | Y |
| 4 | Paul | 64 | 74500,0000 | Y |
+--------+-------+-----+------------+---------------+
Reference:
- The curse and blessings of dynamic SQL - Erland Sommarskog
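Because the rule rows end up inside executable SQL, it may be worth validating them before concatenating. The sketch below illustrates that idea under stated assumptions: it uses Python with sqlite3 instead of SQL Server, the allow-lists and the build_where helper are made up for this example, and all rule values are assumed to be numeric. Column names and operators are checked against the allow-lists, and the values are passed as bound parameters rather than spliced into the string.

```python
import sqlite3

# Assumed allow-lists; anything outside them is rejected, not executed.
ALLOWED_COLUMNS = {"Age", "Salary"}
ALLOWED_OPERATORS = {"=", "<>", "<", "<=", ">", ">="}


def build_where(rules):
    """Turn (column, operator, value) rules into a WHERE clause plus bound values."""
    clauses, params = [], []
    for column, operator, value in rules:
        if column not in ALLOWED_COLUMNS or operator not in ALLOWED_OPERATORS:
            raise ValueError(f"rejected rule: {column} {operator} {value}")
        clauses.append(f"{column} {operator} ?")
        params.append(float(value))  # assumes every rule value is numeric
    return " AND ".join(clauses), params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblTest (TranID INTEGER PRIMARY KEY, FName TEXT, "
    "Age INTEGER, Salary REAL, MaritalStatus TEXT)"
)
conn.executemany(
    "INSERT INTO tblTest VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Alex", 26, 25000.00, "Y"),
        (2, "Brenda", 25, 14500.00, "Y"),
        (3, "Peter", 69, 50000.00, "N"),
        (4, "Paul", 64, 74500.00, "Y"),
    ],
)
conn.execute(
    "CREATE TABLE tblBusRule (RuleID INTEGER PRIMARY KEY, ColName TEXT, "
    "Operator TEXT, ColValue TEXT, RuleOrder INTEGER)"
)
conn.executemany(
    "INSERT INTO tblBusRule VALUES (?, ?, ?, ?, ?)",
    [(1, "Age", ">=", "25", 1), (2, "Age", "<", "65", 2), (3, "Salary", ">", "15000.00", 3)],
)

rules = conn.execute(
    "SELECT ColName, Operator, ColValue FROM tblBusRule ORDER BY RuleOrder"
).fetchall()
where_sql, params = build_where(rules)
matching_ids = [
    row[0]
    for row in conn.execute(
        f"SELECT TranID FROM tblTest WHERE {where_sql} ORDER BY TranID", params
    )
]
```

With the sample data this selects TranID 1 and 4, the same rows as the hardcoded query in the question.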

Postgres insert trigger fills id

I have a BEFORE trigger that should fill in each record's root ID, which points to the topmost ancestor of that record. For example:
id | parent_id | root_id
-------------------------
a | null | a
a.1 | a | a
a.1.1 | a.1 | a
b | null | b
If an entry's parent_id is null, root_id points to the record itself.
The question is: inside the BEFORE INSERT trigger, if parent_id is null, can I (or should I) fetch the next sequence value and fill both id and root_id, so that I don't have to fill root_id in an AFTER trigger?
According to your own definition:
if entry's parent_id is null, it would point to record itself
then you have to do:
if new.parent_id is null then
new.root_id = new.id ;
else
WITH RECURSIVE p (parent_id, level) AS
(
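-- Base case: start from the parent of the row being inserted. The new row
-- itself is not yet visible in t inside a BEFORE INSERT trigger.
SELECT
new.parent_id, 0 as level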
UNION ALL
SELECT
t.parent_id, level + 1
FROM
t JOIN p ON t.id = p.parent_id
WHERE
t.parent_id IS NOT NULL
)
SELECT
parent_id
INTO
new.root_id
FROM
p
ORDER BY
level DESC
LIMIT
1 ;
end if ;
RETURN new ;
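If every stored row already carries a correct root_id, walking the whole chain may not be necessary: the new row's root is simply its parent's root_id. The Python sketch below only models that rule with a dict standing in for the table; insert_row is a hypothetical name, and it assumes parents are always inserted before their children. Inside the trigger, the equivalent would presumably be a single SELECT root_id INTO new.root_id FROM t WHERE id = new.parent_id. As for the null-parent case, column defaults such as nextval() are already applied to NEW when a BEFORE INSERT row trigger fires, so new.id can normally be copied into new.root_id directly.

```python
def insert_row(table, row_id, parent_id):
    """Add a row to `table` (a dict keyed by id), filling root_id as the trigger would."""
    if parent_id is None:
        # A top-level row is its own root.
        root_id = row_id
    else:
        # Assumes the parent was inserted earlier and has a correct root_id.
        root_id = table[parent_id]["root_id"]
    table[row_id] = {"parent_id": parent_id, "root_id": root_id}
    return root_id


TABLE = {}
for row_id, parent_id in [("a", None), ("a.1", "a"), ("a.1.1", "a.1"), ("b", None)]:
    insert_row(TABLE, row_id, parent_id)
```

The trade-off is that this relies on root_id never going stale; if rows can be re-parented, the recursive walk is the safer choice.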

PostgreSQL: How to create a foreign key on a table with inheritance

I'm trying to create the following table structure:
create table base_auditoria (
ip_usuario_aud character varying(30) not null,
id_usuario_aud integer not null
);
create table pessoa (
id serial not null,
nome character varying not null,
constraint pk__pessoa__id primary key (id)
)
inherits (base_auditoria);
create table pessoa_fisica (
cpf character varying(11) not null,
nome_mae character varying not null
)
inherits (pessoa);
create table usuario (
id serial not null,
id_pessoa integer not null,
email character varying(500) not null,
constraint pk_usuario_id primary key (id)
)
inherits (base_auditoria);
After creating them, I insert one record into the table "pessoa_fisica" and one record into the table "usuario":
INSERT INTO pessoa_fisica (cpf, nome_mae, nome, ip_usuario_aud, id_usuario_aud)
VALUES ('cpf 001', 'nome mae 001', 'pessoa.nome 001', '255.255.255.255', 0);
INSERT INTO usuario (id_pessoa, email, ip_usuario_aud, id_usuario_aud)
VALUES (
(SELECT id FROM pessoa_fisica WHERE cpf = 'cpf 001' LIMIT 1),
'test@test', '0.0.0.0', 0
);
Then I update the two records so that they contain the correct identifiers, which are foreign keys:
UPDATE usuario SET id_usuario_aud = id;
UPDATE pessoa_fisica SET id_usuario_aud = (SELECT id FROM usuario LIMIT 1);
When I run the SELECT, I get the expected answers:
db=# SELECT * FROM usuario;
ip_usuario_aud | id_usuario_aud | id | id_pessoa | email
----------------+----------------+----+-----------+-----------
0.0.0.0 | 1 | 1 | 1 | test@test
(1 row)
db=# SELECT * FROM pessoa_fisica;
ip_usuario_aud | id_usuario_aud | id | nome | cpf | nome_mae
-----------------+----------------+----+-----------------+---------+--------------
255.255.255.255 | 1 | 1 | pessoa.nome 001 | cpf 001 | nome mae 001
(1 row)
db=# SELECT * FROM pessoa;
ip_usuario_aud | id_usuario_aud | id | nome
-----------------+----------------+----+-----------------
255.255.255.255 | 1 | 1 | pessoa.nome 001
(1 row)
db=# SELECT * FROM ONLY pessoa;
ip_usuario_aud | id_usuario_aud | id | nome
----------------+----------------+----+------
(0 rows)
When I try to add a FOREIGN KEY constraint with the following command, I get the error:
alter table usuario
add constraint fk_1
foreign key (id_pessoa)
references pessoa (id)
match simple on update no action on delete no action;
ERROR: insert or update on table "usuario" violates foreign key constraint "fk_1"
DETAIL: Key (id_pessoa)=(1) is not present in table "pessoa".
How can I solve this problem?
Solution 1: Remove the inheritance and work with foreign key relationships only.
Solution 2: Create a trigger that copies the data from "pessoa_fisica" into "pessoa":
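CREATE OR REPLACE FUNCTION fn_tg_1() RETURNS trigger AS
$BODY$
BEGIN
IF tg_op = 'INSERT' THEN
-- The new row is not yet visible in pessoa_fisica inside a BEFORE trigger, so copy from NEW.
INSERT INTO pessoa (ip_usuario_aud, id_usuario_aud, id, nome)
VALUES (new.ip_usuario_aud, new.id_usuario_aud, new.id, new.nome);
ELSIF tg_op = 'UPDATE' THEN
-- ONLY keeps the statement from touching the inherited pessoa_fisica rows again.
UPDATE ONLY pessoa SET
ip_usuario_aud = new.ip_usuario_aud,
id_usuario_aud = new.id_usuario_aud,
nome = new.nome
WHERE id = new.id;
ELSIF tg_op = 'DELETE' THEN
-- NEW is not set for DELETE; use OLD and return it so the delete proceeds.
DELETE FROM ONLY pessoa WHERE id = old.id;
RETURN old;
END IF;
RETURN new;
END
$BODY$
LANGUAGE plpgsql VOLATILE COST 100;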
CREATE TRIGGER tg_1
BEFORE INSERT OR UPDATE OR DELETE
ON pessoa_fisica FOR EACH ROW
EXECUTE PROCEDURE fn_tg_1();
Thus the foreign key "fk_1" can be created.
Is there any other alternative?